Generative AI in the Real World: Raiza Martin on Building AI Applications for Audio

Audio is being added to AI everywhere: both in multimodal models that can understand and generate audio and in applications that use audio for input. Now that we can work with spoken language, what does that mean for the applications that we can develop? How do we think about audio interfaces—how will people use them, and what will they want to do? Raiza Martin, who worked on Google’s groundbreaking NotebookLM, joins Ben Lorica to discuss how she thinks about audio and what you can build with it.

About the Generative AI in the Real World podcast: In 2023, ChatGPT put AI on everyone’s agenda. In 2025, the challenge will be turning those agendas into reality. In Generative AI in the Real World, Ben Lorica interviews leaders who are building with AI. Learn from their experience to help put AI to work in your enterprise.

Check out other episodes of this podcast on the O’Reilly learning platform.

Timestamps

  • 0:00: Introduction to Raiza Martin, who cofounded Huxe and formerly led Google’s NotebookLM team. What made you think this was the time to trade the comforts of big tech for a garage startup?
  • 1:01: It was a personal decision for all of us. It was a pleasure to take NotebookLM from an idea to something that resonated so widely. We realized that AI was really blowing up. We didn’t know what it would be like at a startup, but we wanted to try. Seven months down the road, we’re having a great time.
  • 1:54: For the 1% who aren’t familiar with NotebookLM, give a short description.
  • 2:06: It’s basically contextualized intelligence, where you give NotebookLM the sources you care about and NotebookLM stays grounded to those sources. One of our most common use cases was that students would create notebooks and upload their class materials, and it became an expert that you could talk with.
  • 2:43: Here’s a use case for homeowners: put all your user manuals in there. 
  • 3:14: We have had a lot of people tell us that they use NotebookLM for Airbnbs. They put all the manuals and instructions in there, and users can talk to it.
  • 3:41: Why do people need a personal daily podcast?
  • 3:57: There are a lot of different ways that I think about building new products. On one hand, there are acute pain points. But Huxe comes from a different angle: What if we could try to build very delightful things? The inputs are a little different. We tried to imagine what the average person’s daily life is like. You wake up, you check your phone, you travel to work; we thought about opportunities to make something more delightful. I think a lot about TikTok. When do I use it? When I’m standing in line. We landed on transit time or commute time. We wanted to do something novel and interesting with that space in time. So one of the first things was creating really personalized audio content. That was the provocation: What do people want to listen to? Even in this short time, we’ve learned a lot about the amount of opportunity.
  • 6:04: Huxe is mobile first, audio first, right? Why audio?
  • 6:45: Coming from our learnings from NotebookLM, you learn fundamentally different things when you change the modality of something. When I go on walks with ChatGPT, I just talk about my day. I noticed that was a very different interaction from when I type things out to ChatGPT. The flip side is less about interaction and more about consumption. Something about the audio format made the types of sources different as well. The sources we uploaded to NotebookLM were different as a result of wanting audio output. By focusing on audio, I think we’ll learn different use cases than the chat use cases. Voice is still largely untapped. 
  • 8:24: Even in text, people started exploring other form factors: long articles, bullet points. What kinds of things are available for voice?
  • 8:49: I think of two formats: one passive and one interactive. With passive formats, there are a lot of different things you can create for the user. The things you end up playing with are (1) what is the content about and (2) how flexible is the content? Is it short, long, malleable to user feedback? With interactive content, maybe I’m listening to audio, but I want to interact with it. Maybe I want to join in. Maybe I want my friends to join in. Both of those contexts are new. I think this is what’s going to emerge in the next few years. I think we’ll learn that the types of things we will use audio for are fundamentally different from the things we use chat for.
  • 10:19: What are some of the key mistakes from the smart speaker era to avoid?
  • 10:25: I’ve owned so many of them. And I love them. My primary use for the smart speakers is still a timer. It’s expensive and doesn’t live up to the promise. I just don’t think the technology was ready for what people really wanted to do. It’s hard to think about how that could have worked without AI. Second, one of the most difficult things about audio is that there is no UI. A smart speaker is a physical device. There’s nothing that tells you what to do, so the learning curve is steep, and you end up with a user who doesn’t know what they can use the thing for.
  • 12:20: Now it can do so much more. Even without a UI, the user can just try things. But there’s a risk in that it still requires input from the user. How do we think about a system that is so supportive that you don’t have to come up with how to make it work? That’s the challenge from the smart speaker era.
  • 12:56: It’s interesting that you point out the UI. With a chatbot you have to type something. With a smart speaker, people started getting creeped out by surveillance. So, will Huxe surveil me?
  • 13:18: I think there’s something simple about it, which is the wake word. Because smart speakers are triggered by wake words, they are always on. If the user says something, it’s probably picking it up, and it’s probably logged somewhere. With Huxe, we want to be really careful about where we believe consumer readiness is. You want to push a little bit but not too far. If you push too far, people get creeped out. 
  • 14:32: For Huxe, you have to turn it on to use it. It’s clunky in some ways, but we can push on that boundary and see if we can push for something that’s more ambiently on. We’re starting to see the emergence of more tools that are always on. There are tools like Granola and Cluely: They’re always on, looking at your screen, transcribing your audio. I’m curious—are we ready for technology like that? In real life, you can probably get the most utility from something that is always on. But whether consumers are ready is still TBD.
  • 15:25: So you’re ingesting calendars, email, and other things from the users. What about privacy? What are the steps you’ve taken?
  • 15:48: We’re very privacy focused. I think that comes from building NotebookLM. We wanted to make sure we were very respectful of user data. We didn’t train on any user data; user data stayed private. We’re taking the same approach with Huxe. We use the data you share with Huxe to improve your personal experience. There’s something interesting in creating personal recommendation models that don’t go beyond your usage of the app. It’s a little harder for us to build something good, but it respects privacy, and that’s what it takes to earn people’s trust.
  • 17:08: Huxe may notice that I have a flight tomorrow and tell me that the flight is delayed. To do so, it has had to contact an external service, which now knows about my flight.
  • 17:26: That’s a good point. I think about building Huxe like this: If I were in your pocket, what would I do? If I saw a calendar that said “Ben has a flight,” I can check that flight without leaking your personal information. I can just look up the flight number. There are a lot of ways you can do something that provides utility but doesn’t leak data to another service. We’re trying to understand things that are much more action oriented. We try to tell you about weather, about traffic; these are things we can do without stepping on user privacy.
  • 18:38: The way you described the system, there’s no social component. But you end up learning things about me. So there is the potential for building a more sophisticated filter bubble. How do you make sure that I’m ingesting things beyond my filter bubble?
  • 19:08: It comes down to what I believe a person should or shouldn’t be consuming. That’s always tricky. We’ve seen what these feeds can do to us. I don’t know the correct formula yet. There’s something interesting about “How do I get enough user input so I can give them a better experience?” There’s signal there. I try to think about a user’s feed from the perspective of relevance and less from an editorial perspective. I think the relevance of information is probably enough. We’ll probably test this once we start surfacing more personalized information. 
  • 20:42: The other thing that’s really important is surfacing the correct controls: I like this; here’s why. I don’t like this; why not? Where you inject tension in the system, where you think the system should push back—that takes a little time to figure out how to do it right.
  • 21:01: What about the boundary between giving me content and providing companionship?
  • 21:09: How do we know the difference between an assistant and a companion? Fundamentally the capabilities are the same. I don’t know if the question matters. The user will use it how the user intends to use it. That question matters most in the packaging and the marketing. I talk to people who talk about ChatGPT as their best friend. I talk to others who talk about it as an employee. On a capabilities level, they’re probably the same thing. On a marketing level, they’re different.
  • 22:22: For Huxe, the way I think about this is which set of use cases you prioritize. Beyond a simple conversation, the capabilities will probably start diverging. 
  • 22:47: You’re now part of a very small startup. I assume you’re not building your own models; you’re using external models. Walk us through privacy, given that you’re using external models. As that model learns more about me, how much does that model retain over time? To be a really good companion, you can’t be clearing that cache every time I log out.
  • 23:21: That question pertains to where we store data and how it’s passed off. We opt for models that don’t train on the data we send them. The next layer is how we think about continuity. People expect ChatGPT to have knowledge of all the conversations you have. 
  • 24:03: To support that you have to build a very durable context layer. But you don’t have to imagine that all of that gets passed to the model. A lot of technical limitations prevent you from doing that anyway. That context is stored at the application layer. We store it, and we try to figure out the right things to pass to the model, passing as little as possible.
  • 25:17: You’re from Google. I know that you measure, measure, measure. What are some of the signals you measure? 
  • 25:40: I think about metrics a little differently in the early stages. Metrics in the beginning are nonobvious. You’ll get a lot of trial behavior in the beginning. It’s a little harder to understand the initial user experience from the raw metrics. There are some basic metrics that I care about—the rate at which people are able to onboard. But as far as crossing the chasm (I think of product building as a series of chasms that never end), you look for the people who really love it, who rave about it, and you have to listen to them, and also to the people who used the product and hated it. When you listen to them, you discover that they expected it to do something and it didn’t. It let them down. You have to listen to these two groups, and then you can triangulate what the product looks like to the outside world. The thing I’m trying to figure out is less “Is it a hit?” than “Is the market ready for it? Is the market ready for something this weird?” In the AI world, the reality is that you’re testing consumer readiness and need, and how they are evolving together. We did this with NotebookLM. When we showed it to students, there was zero time between when they saw it and when they understood it. That’s the first chasm. Can you find people who understand what they think it is and feel strongly about it?
  • 28:45: Now that you’re outside of Google, what would you want the foundation model builders to focus on? What aspects of these models would you like to see improved?
  • 29:20: We share so much feedback with the model providers—I can provide feedback to all the labs, not just Google, and that’s been fun. The universe of things right now is pretty well known. We haven’t touched the space where we’re pushing for new things yet. We always try to drive down latency. It’s a conversation—you can interrupt. There’s some basic behavior there that the models can get better at. There’s tool calling: making it better and parallelizing it with voice model synthesis. And there’s the diversity of voices, languages, and accents; that sounds basic, but it’s actually pretty hard. Those three things are pretty well known, but they will take us through the rest of the year.
  • 30:48: And narrowing the gap between the cloud model and the on-device model.
  • 30:52: That’s interesting too. Today we’re making a lot of progress on the smaller on-device models, but when you think of supporting an LLM and a voice model on top of it, it actually gets a little bit hairy, where most people would just go back to commercial models.
  • 31:26: What’s one prediction in the consumer AI space that you would make that most people would find surprising?
  • 31:37: A lot of people use AI for companionship, and not in the ways that we imagine. For almost everyone I talk to, the utility is very personal. There are a lot of work use cases. But the emerging side of AI is personal. There’s a lot more area for discovery. For example, I use ChatGPT as my running coach. It ingests all of my running data and creates running plans for me. Where would I slot that? It’s not productivity, but it’s not my best friend; it’s just my running coach. More and more people are doing these complicated personal things that are closer to companionship than enterprise use cases.
  • 33:02: You were supposed to say Gemini!
  • 33:04: I love all of the models. I have a use case for all of them. But we all use all the models. I don’t know anyone who only uses one. 
  • 33:22: What you’re saying about the nonwork use cases is so true. I come across so many people who treat chatbots as their friends. 
  • 33:36: I do it all the time now. Once you start doing it, it’s a lot stickier than the work use cases. I took my dog to get groomed, and they wanted me to upload his rabies vaccine. So I started thinking about how well it’s protected. I opened up ChatGPT, and spent eight minutes talking about rabies. People are becoming more curious, and now there’s an immediate outlet for that curiosity. It’s so much fun. There’s so much opportunity for us to continue to explore that. 
  • 34:48: Doesn’t this indicate that these models will get sticky over time? If I talk to Gemini a lot, why would I switch to ChatGPT?
  • 35:04: I agree. We see that now. I like Claude. I like Gemini. But I really like the ChatGPT app. Because the app is a good experience, there’s no reason for me to switch. I’ve talked to ChatGPT so much that there’s no way for me to port my data. There’s data lock-in.



Teachers Turn Toward Virtual Schools for Better Work-Life Balance


As Molly Hamill explains the origin of the Declaration of Independence to her students, she dons a white wig fashioned into a ponytail, appearing as John Adams, before sporting a bald cap in homage to Benjamin Franklin, then wearing a red wig to imitate Thomas Jefferson. But instead of looking out to an enraptured sea of 28 fifth graders leaning forward in their desks, she is speaking directly into a camera.

Hamill is one of a growing number of educators who forwent brick-and-mortar schools post-pandemic. She now teaches fully virtually through the public, online school California Virtual Academies, having swapped desks for desktops.

Molly Hamill teaches a lesson about the Declaration of Independence.

After the abrupt shift to virtual schooling during the COVID-19 health crisis — and the stress for many educators because of it — voluntarily choosing the format may seem unthinkable.

“You hear people say, ‘I would never want to go back to virtual,’ and I get it, it was super stressful because we were building the plane as we were flying it, deciding if we were going to have live video or recordings, and adapt all the teaching materials to virtual,” Hamill says. “But my school is a pretty well-oiled machine … there’s a structure already in place. And kids are adaptable, they already like being on a computer.”

And for Hamill, and thousands of other teachers, instructing through a virtual school is a way to attempt to strike a rare work-life balance in the education world.

More Flexibility for Teaching Students

The number of virtual schools has grown, as has the number of U.S. children enrolled in them. In the 2022-2023 school year, about 2.5 percent of K-12 students were enrolled in full-time virtual education (1.8 percent of them through public or private online schools, and 0.7 percent as homeschoolers), according to data published in 2024 by the National Center for Education Statistics. And parents reported that 7 percent of students who learned at home that year took at least one virtual course.

There’s been an accompanying rise in the number of teachers instructing remotely via virtual schools.

The number of teachers employed by K12, which is under the parent company Stride Inc. and one of the largest and longest-running providers of virtual schools, has jumped from 6,500 to 8,000 over the last three or four years, says Niyoka McCoy, chief learning officer at the company.

McCoy credits the growth in part to teachers wanting to homeschool their own children, and therefore needing to do their own work from home, but she also thinks it is a sign of a shifting preference for technology-based offerings.

“They think this is the future, that more online programs will open up,” McCoy says.

Connections Academy, which is the parent company of Pearson Online Academy and a similarly long-standing online learning provider, employs 3,500 teachers. Nik Osborne, senior vice president of partnerships and customer success at Pearson, says it’s been easy to both recruit and keep teachers: roughly 91 percent of teachers in the 2024-2025 school year returned this academic year.

“Teaching in a virtual space is very different than brick-and-mortar; even the type of role teachers play appeals to some teachers,” Osborne says. “They become more of a guide to help the kids understand content.”

Courtney Entsminger, a middle school math teacher at the public, online school Virginia Connections Academy, teaches asynchronously and likes the ability to record her own lesson plans in addition to teaching them live, which she says helps a wider variety of learners. Hamill, who teaches synchronously, similarly likes that the virtual format can be leveraged to build more creative lesson plans, like her Declaration of Independence video, or a fake livestream of George Washington during the Battle of Trenton, both of which are on her YouTube channel.

Whether a school is asynchronous or not largely depends on the provider’s standard. Pearson, which runs the Connections Academy schools where Entsminger teaches, is asynchronous. For other standalone public school districts, such as Georgia Cyber Academy, the decision comes down to what students need: if they are performing at or above grade level, they get more flexibility, but if they come to the school below grade level — reading at a second grade level, for example, but placed in a fourth grade classroom — they need more structure.

“I do feel like a TikTok star where I record myself teaching through different aspects of that curriculum because students work in different ways,” says Entsminger, who has 348 online students across three grades. “In person you’re able to realize ‘this student works this way,’ and I’ll do a song and dance in front of you. Online, I can do it in different mediums.”

Karen Bacon, a transition liaison at Ohio Virtual Academy who works with middle and high school students in special education, was initially drawn to virtual teaching because of its flexibility for supporting students through a path that works best for them.

“I always like a good challenge and thought this was interesting to dive into how this works and different ways to help students,” says Bacon, who was a high school French teacher before making the switch to virtual in 2017. “There’s obviously a lot to learn and understand, but once you dive in and see all the options, there really are a lot of different possibilities out there.”

Bacon says there are “definitely less distractions” than in a brick-and-mortar environment, allowing her to get more creative. For example, she had noticed stories crop up across the nation showcasing special education students in physical environments working to serve coffee to teachers and students as a way to learn workplace skills. She, adapting to the virtual environment, created the “Cardinal Cafe,” where students can accomplish the same goals, albeit with a virtual cup of joe.

The “Cardinal Cafe” allows students to “serve” coffee to fellow “customers,” similar to brick-and-mortar setups.

“I don’t really consider myself super tech-y, but I have that curiosity and love going outside the box and looking at ways to really help my students,” she says.

A Way to Curb Teacher Burnout?

The flexibility that comes with teaching in a virtual environment is not just appealing for what it offers students. Teachers say it can also help cushion the consistently lower wages and lack of benefits most educators grapple with, conditions that drive many to leave the field.

“So many of us have said, ‘I felt so burned out, I wasn’t sure I could keep teaching,’” Hamill says, adding she felt similarly at the start of her career as a first grade teacher. “But doing it this way helps it feel sustainable. We’re still underpaid and not appreciated enough as a whole profession, but at least virtually some of the big glaring issues aren’t there in terms of how we’re treated.”

Entsminger was initially drawn to teaching in part because she hoped it would allow her to have more time with her future children than other careers might offer. But as she became a mother while teaching for a decade in a brick-and-mortar environment — both at the elementary school and the high school level — she found she was unable to pick up or drop her daughter off at school, despite working in the same district her daughter attended.

In contrast, while teaching online, “in this environment I’m able to take her to school, make her breakfast,” she says. “I’m able to do life and my job. On the daily, I’m able to be ‘Mom’ and ‘Ms. Entsminger’ with less fighting for my time.”

Because of the more-flexible schedule for students enrolled in virtual learning programs, teachers do not have to be “on” for eight straight hours. And they do not necessarily have to participate in the sorts of shared systems that keep physical schools running. In a brick-and-mortar school, even if Bacon, Hamill or Entsminger were not slated to teach a class, they might be assigned to spend their time walking their students to their next class or the bus stop, or tasked with supervising the cafeteria during a lunch period. But in the virtual environment, they have the ability to close their laptop, and to quietly plan lessons or grade papers.

However, that is not to say these teachers operate as islands. Hamill says one of the largest perks of teaching virtual school is working with other fifth grade teachers across the nation, who often share PowerPoints or other lesson plans, whereas, she says, “I think sometimes in person, people can be a little precious about that.”

The workload varies for teachers in virtual programs. Entsminger’s 300-plus students are enrolled in three grades. Some live as close as her same city, others as far-flung as Europe, where they play soccer. Hamill currently has 28 students, expecting to get to 30 as the school continuously admits more. According to the National Education Policy Center, the average student-teacher ratio in the nation’s public schools was 14.8 students per teacher in 2023, while virtual schools reported 24.4 students per teacher.

Hamill also believes that virtual environments keep both teachers and students safer. She says she was sick for nine months of her first year of teaching, getting strep throat twice. She also points to the seemingly endless onslaught of school shootings and the worsening of behavior issues among children.

“The trade-off for not having to do classroom management of behavioral issues is huge,” she says. “If the kid is mean in the chat, I turn off the chat. If kids aren’t listening, I can mute everyone and say, ‘I’ll let you talk one at a time.’ Versus, in my last classroom, the kids threw chairs at me.”

There are still adjustments to managing kids remotely, the teachers acknowledge. Hamill coaches her kids through internet safety and online decorum, like learning that typing in all-caps, for example, can come across rudely.

And while the virtual teachers were initially concerned about bonding with their students, they have found those worries largely unfounded. During online office hours, Hamill plays Pictionary with her students and has met most of their pets over a screen. Meanwhile, Entsminger offers online tutoring and daily opportunities to meet, where she has “learned more than I ever thought about K-pop this year.”

There are also opportunities for in-person gatherings with students. Hamill does once-a-month meetups, often in a park. Bacon attended an in-person picnic earlier this month to meet the students who live near her. And both K12 and Connections Academy hold multiple in-person events for students, including field trips and extracurriculars, like sewing or bowling clubs.

“Of course I wish I could see them more in person, and do arts and crafts time — that’s a big thing I miss,” Hamill says. “But we have drawing programs or ways they can post their artwork; we find ways to adapt to it.”

And that adaptation is largely worth it to virtual teachers.

“Teaching is teaching; even if I’m behind a computer screen, kids are still going to be kids,” Entsminger says. “The hurdles are still there. We’re still working hard, but it’s really nice to work with my students, and then walk to my kitchen to get coffee, then come back to connect to my students again.”




Schedule topology-aware workloads using Amazon SageMaker HyperPod task governance


Today, we are excited to announce a new capability of Amazon SageMaker HyperPod task governance to help you optimize training efficiency and network latency of your AI workloads. SageMaker HyperPod task governance streamlines resource allocation and facilitates efficient compute resource utilization across teams and projects on Amazon Elastic Kubernetes Service (Amazon EKS) clusters. Administrators can govern accelerated compute allocation and enforce task priority policies, improving resource utilization. This helps organizations focus on accelerating generative AI innovation and reducing time to market, rather than coordinating resource allocation and replanning tasks. Refer to Best practices for Amazon SageMaker HyperPod task governance for more information.

Generative AI workloads typically demand extensive network communication across Amazon Elastic Compute Cloud (Amazon EC2) instances, where network bandwidth impacts both workload runtime and processing latency. The network latency of these communications depends on the physical placement of instances within a data center’s hierarchical infrastructure. Data centers can be organized into nested organizational units such as network nodes and node sets, with multiple instances per network node and multiple network nodes per node set. For example, instances within the same organizational unit communicate with lower latency than instances in different units: fewer network hops between instances means lower communication latency.

To optimize the placement of your generative AI workloads in your SageMaker HyperPod clusters by considering the physical and logical arrangement of resources, you can use EC2 network topology information during your job submissions. An EC2 instance’s topology is described by a set of nodes, with one node in each layer of the network. Refer to How Amazon EC2 instance topology works for details on how EC2 topology is arranged. Network topology labels offer the following key benefits:

  • Reduced latency by minimizing network hops and routing traffic to nearby instances
  • Improved training efficiency by optimizing workload placement across network resources

With topology-aware scheduling for SageMaker HyperPod task governance, you can use topology network labels to schedule your jobs with optimized network communication, thereby improving task efficiency and resource utilization for your AI workloads.

In this post, we introduce topology-aware scheduling with SageMaker HyperPod task governance by submitting jobs that represent hierarchical network information. We provide details about how to use SageMaker HyperPod task governance to optimize your job efficiency.

Solution overview

Data scientists interact with SageMaker HyperPod clusters. They are responsible for the training, fine-tuning, and deployment of models on accelerated compute instances. It’s important to make sure data scientists have the necessary capacity and permissions when interacting with clusters of GPUs.

To implement topology-aware scheduling, you first confirm the topology information for all nodes in your cluster, then run a script that tells you which instances are on the same network nodes, and finally schedule a topology-aware training task on your cluster. This workflow facilitates higher visibility and control over the placement of your training instances.

In this post, we walk through viewing node topology information and submitting topology-aware tasks to your cluster. For reference, NetworkNodes describes the network node set of an instance. In each network node set, three layers comprise the hierarchical view of the topology for each instance. Instances that are closest to each other will share the same layer 3 network node. If there are no common network nodes in the bottom layer (layer 3), then see if there is commonality at layer 2.
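
As a small illustration of that comparison logic, here is a sketch in Python, with hypothetical label values, that returns the deepest layer at which two instances share a network node. The label keys follow the topology.k8s.aws/network-node-layer-N scheme shown later in this post:

def deepest_common_layer(labels_a, labels_b):
    # Walk from the bottom layer (3) upward; the deepest shared node means the fewest hops.
    for layer in (3, 2, 1):
        key = f"topology.k8s.aws/network-node-layer-{layer}"
        if labels_a.get(key) and labels_a.get(key) == labels_b.get(key):
            return layer
    return None

# Hypothetical node labels, shaped like the ones kubectl displays below.
a = {"topology.k8s.aws/network-node-layer-1": "nn-1111",
     "topology.k8s.aws/network-node-layer-2": "nn-2222",
     "topology.k8s.aws/network-node-layer-3": "nn-3333"}
b = {"topology.k8s.aws/network-node-layer-1": "nn-1111",
     "topology.k8s.aws/network-node-layer-2": "nn-2222",
     "topology.k8s.aws/network-node-layer-3": "nn-4444"}
print(deepest_common_layer(a, b))  # 2: same node set, different layer 3 nodes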

Prerequisites

To get started with topology-aware scheduling, you must have the following prerequisites:

  • An EKS cluster
  • A SageMaker HyperPod cluster with instances enabled for topology information
  • The SageMaker HyperPod task governance add-on installed (version 1.2.2 or later)
  • Kubectl installed
  • (Optional) The SageMaker HyperPod CLI installed

Get node topology information

Run the following commands to show node labels in your cluster. These labels provide network topology information for each instance.

kubectl get nodes -L topology.k8s.aws/network-node-layer-1
kubectl get nodes -L topology.k8s.aws/network-node-layer-2
kubectl get nodes -L topology.k8s.aws/network-node-layer-3
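
For illustration, the output of the layer 3 query might look like the following (the instance names and node IDs here are hypothetical):

NAME                           STATUS   ROLES    AGE   VERSION   NETWORK-NODE-LAYER-3
hyperpod-i-0abc123def456gh78   Ready    <none>   3d    v1.30.0   nn-33333example
hyperpod-i-0jkl901mno234pq56   Ready    <none>   3d    v1.30.0   nn-33333example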

Instances with the same network node layer 3 are as close as possible, following EC2 topology hierarchy. You should see a list of node labels that look like the following:

topology.k8s.aws/network-node-layer-3: nn-33333example

Run the following script to show the nodes in your cluster that are on the same layers 1, 2, and 3 network nodes:

git clone https://github.com/aws-samples/awsome-distributed-training.git
cd awsome-distributed-training/1.architectures/7.sagemaker-hyperpod-eks/task-governance 
chmod +x visualize_topology.sh
bash visualize_topology.sh

The output of this script is a flow chart that you can render in a flow diagram editor such as Mermaid.js.org to visualize the node topology of your cluster. The following figure is an example of the cluster topology for a seven-instance cluster.
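
In Mermaid terms, a chart of that shape might look like the following sketch; the node and instance IDs here are hypothetical, not output from the script:

graph TD
    nn_1111[Layer 1: nn-1111] --> nn_2222[Layer 2: nn-2222]
    nn_2222 --> nn_3333[Layer 3: nn-3333]
    nn_2222 --> nn_4444[Layer 3: nn-4444]
    nn_3333 --> i1[i-0aaaa1111]
    nn_3333 --> i2[i-0bbbb2222]
    nn_3333 --> i3[i-0cccc3333]
    nn_3333 --> i4[i-0dddd4444]
    nn_4444 --> i5[i-0eeee5555]
    nn_4444 --> i6[i-0ffff6666]
    nn_4444 --> i7[i-0gghh7777]

Instances that hang off the same layer 3 node (here i1 through i4) are the ones to co-locate a tightly coupled job on.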

Submit tasks

SageMaker HyperPod task governance offers two ways to submit tasks using topology awareness. In this section, we discuss these two options and a third alternative option to task governance.

Modify your Kubernetes manifest file

First, you can modify your existing Kubernetes manifest file to include one of two annotation options:

  • kueue.x-k8s.io/podset-required-topology – Use this option if you must have all pods scheduled on nodes on the same network node layer in order to begin the job
  • kueue.x-k8s.io/podset-preferred-topology – Use this option if you ideally want all pods scheduled on nodes in the same network node layer, but you have flexibility

The following code is an example of a sample job that uses the kueue.x-k8s.io/podset-required-topology setting to schedule pods that share the same layer 3 network node:

apiVersion: batch/v1
kind: Job
metadata:
  name: test-tas-job
  namespace: hyperpod-ns-team-a
  labels:
    kueue.x-k8s.io/queue-name: hyperpod-ns-team-a-localqueue
    kueue.x-k8s.io/priority-class: inference-priority
spec:
  parallelism: 10
  completions: 10
  suspend: true
  template:
    metadata:
      labels:
        kueue.x-k8s.io/queue-name: hyperpod-ns-team-a-localqueue
      annotations:
        kueue.x-k8s.io/podset-required-topology: "topology.k8s.aws/network-node-layer-3"
    spec:
      containers:
        - name: dummy-job
          image: public.ecr.aws/docker/library/alpine:latest
          command: ["sleep", "3600s"]
          resources:
            requests:
              cpu: "1"
      restartPolicy: Never
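
With the annotation in place, submit the Job as usual. Because the manifest sets suspend: true, Kueue admits it and starts the pods once a placement that satisfies the required topology is available. A typical sequence (the manifest file name here is assumed) might be:

kubectl apply -f test-tas-job.yaml
kubectl get workloads -n hyperpod-ns-team-a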

To verify which nodes your pods are running on, use the following command to view node IDs per pod:

kubectl get pods -n hyperpod-ns-team-a -o wide

Use the SageMaker HyperPod CLI

The second way to submit a job is through the SageMaker HyperPod CLI. Be sure to install the latest version (version pending) to use topology-aware scheduling. To use topology-aware scheduling with the SageMaker HyperPod CLI, you can include either the --preferred-topology parameter or the --required-topology parameter in your create job command.

The following code is an example command to start a topology-aware MNIST training job using the SageMaker HyperPod CLI. Replace XXXXXXXXXXXX with your AWS account ID:

hyp create hyp-pytorch-job \
--job-name test-pytorch-job-cli \
--image XXXXXXXXXXXX.dkr.ecr.us-west-2.amazonaws.com/ptjob:mnist \
--pull-policy "Always" \
--tasks-per-node 1 \
--max-retry 1 \
--preferred-topology topology.k8s.aws/network-node-layer-3

Clean up

If you deployed new resources while following this post, refer to the Clean Up section in the SageMaker HyperPod EKS workshop to make sure you don’t accrue unwanted charges.

Conclusion

During large language model (LLM) training, the model is distributed across multiple instances, requiring frequent pod-to-pod data exchange between them. In this post, we discussed how SageMaker HyperPod task governance helps schedule workloads to improve job efficiency by optimizing throughput and latency. We also walked through how to schedule jobs using SageMaker HyperPod topology network information to optimize network communication latency for your AI tasks.

We encourage you to try out this solution and share your feedback in the comments section.


About the authors

Nisha Nadkarni is a Senior GenAI Specialist Solutions Architect at AWS, where she guides companies through best practices when deploying large scale distributed training and inference on AWS. Prior to her current role, she spent several years at AWS focused on helping emerging GenAI startups develop models from ideation to production.

Siamak Nariman is a Senior Product Manager at AWS. He is focused on AI/ML technology, ML model management, and ML governance to improve overall organizational efficiency and productivity. He has extensive experience automating processes and deploying various technologies.

Zican Li is a Senior Software Engineer at Amazon Web Services (AWS), where he leads software development for Task Governance on SageMaker HyperPod. In his role, he focuses on empowering customers with advanced AI capabilities while fostering an environment that maximizes engineering team efficiency and productivity.

Anoop Saha is a Sr GTM Specialist at Amazon Web Services (AWS) focusing on generative AI model training and inference. He partners with top frontier model builders, strategic customers, and AWS service teams to enable distributed training and inference at scale on AWS and lead joint GTM motions. Before AWS, Anoop held several leadership roles at startups and large corporations, primarily focusing on silicon and system architecture of AI infrastructure.




How msg enhanced HR workforce transformation with Amazon Bedrock and msg.ProfileMap


This post is co-written with Stefan Walter from msg.

With more than 10,000 experts in 34 countries, msg is both an independent software vendor and a system integrator operating in highly regulated industries, with over 40 years of domain-specific expertise. msg.ProfileMap is a software as a service (SaaS) solution for skill and competency management. It’s an AWS Partner qualified software available on AWS Marketplace, currently serving more than 7,500 users. HR and strategy departments use msg.ProfileMap for project staffing and workforce transformation initiatives. By offering a centralized view of skills and competencies, msg.ProfileMap helps organizations map their workforce’s capabilities, identify skill gaps, and implement targeted development strategies. This supports more effective project execution, better alignment of talent to roles, and long-term workforce planning.

In this post, we share how msg automated data harmonization for msg.ProfileMap, using Amazon Bedrock to power its large language model (LLM)-driven data enrichment workflows, resulting in higher accuracy in HR concept matching, reduced manual workload, and improved alignment with compliance requirements under the EU AI Act and GDPR.

The importance of AI-based data harmonization

HR departments face increasing pressure to operate as data-driven organizations, but are often constrained by the inconsistent, fragmented nature of their data. Critical HR documents are unstructured, and legacy systems use mismatched formats and data models. This not only impairs data quality but also leads to inefficiencies and decision-making blind spots.

Accurate and harmonized HR data is foundational for key activities such as matching candidates to roles, identifying internal mobility opportunities, conducting skills gap analysis, and planning workforce development. msg identified that without automated, scalable methods to process and unify this data, organizations would continue to struggle with manual overhead and inconsistent results.

Solution overview

HR data is typically scattered across diverse sources and formats, ranging from relational databases to Excel files, Word documents, and PDFs. Additionally, entities such as personnel numbers or competencies have different unique identifiers as well as different text descriptions, although with the same semantics. msg addressed this challenge with a modular architecture, tailored for IT workforce scenarios. As illustrated in the following diagram, at the core of msg.ProfileMap is a robust text extraction layer, which transforms heterogeneous inputs into structured data. This is then passed to an AI-powered harmonization engine that provides consistency across data sources by avoiding duplication and aligning disparate concepts.

The harmonization process uses a hybrid retrieval approach that combines vector-based semantic similarity and string-based matching techniques. These methods align incoming data with existing entities in the system. Amazon Bedrock is used to semantically enrich data, improving cross-source compatibility and matching precision. Extracted and enriched data is indexed and stored using Amazon OpenSearch Service and Amazon DynamoDB, facilitating fast and accurate retrieval, as shown in the following diagram.
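
To make the approach concrete, the following is a minimal sketch of hybrid matching, not msg’s implementation: it blends cosine similarity over embedding vectors with a lexical string score. In msg.ProfileMap, semantic enrichment is powered by Amazon Bedrock; the embedding source, names, weights, and toy vectors below are all illustrative assumptions.

import math
from difflib import SequenceMatcher

def cosine(a, b):
    # Semantic similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def string_sim(a, b):
    # Lexical similarity in [0, 1], robust to small spelling differences.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def hybrid_score(name_a, vec_a, name_b, vec_b, w_vec=0.7, w_str=0.3):
    # Weighted blend of semantic and lexical similarity; weights are illustrative.
    return w_vec * cosine(vec_a, vec_b) + w_str * string_sim(name_a, name_b)

# Toy embeddings standing in for model output; a real pipeline would obtain
# vectors from an embedding model rather than hardcoding them.
existing = {
    "Java Development": [0.90, 0.10, 0.20],
    "Project Management": [0.10, 0.80, 0.30],
}
incoming_name, incoming_vec = "Java Programming", [0.88, 0.15, 0.22]

best = max(existing, key=lambda k: hybrid_score(incoming_name, incoming_vec, k, existing[k]))
print(best, round(hybrid_score(incoming_name, incoming_vec, best, existing[best]), 3))

A real system would calibrate the weights and the merge threshold against validated match decisions.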

HR document processing architecture using AWS Bedrock for extraction, OpenSearch for indexing, and DynamoDB for ontology

The framework is designed to be unsupervised and domain independent. Although it’s optimized for IT workforce use cases, it has demonstrated strong generalization capabilities in other domains as well.

msg.ProfileMap is a cloud-based application that uses several AWS services, notably Amazon Neptune, Amazon DynamoDB, and Amazon Bedrock. The following diagram illustrates the full solution architecture.

ProfileMap architecture featuring authentication, load balancing, and distributed services across availability zones

Results and technical validation

msg evaluated the effectiveness of the data harmonization framework through internal testing on IT workforce concepts and through external benchmarking in the Bio-ML Track of the Ontology Alignment Evaluation Initiative (OAEI), an international and EU-funded research initiative that has evaluated ontology matching technologies since 2004.

During internal testing, the system processed 2,248 concepts across multiple suggestion types. High-probability merge recommendations reached 95.5% accuracy, covering nearly 60% of all inputs. This helped msg reduce manual validation workload by over 70%, significantly improving time-to-value for HR teams.

During OAEI 2024, msg.ProfileMap ranked at the top of the 2024 Bio-ML benchmark, outperforming other systems across multiple biomedical datasets. On NCIT-DOID, it achieved a 0.918 F1 score, with Hits@1 exceeding 92%, validating the engine’s generalizability beyond the HR domain. Additional details are available in the official test results.
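
For reference, these are the standard definitions of the reported metrics, stated in LaTeX:

F_1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}
\qquad
\mathrm{Hits@1} = \frac{\#\{\text{queries whose top-ranked candidate is the correct match}\}}{\#\{\text{queries}\}}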

Why Amazon Bedrock

msg relies on LLMs to semantically enrich data in near real time. These workloads require low-latency inference, flexible scaling, and operational simplicity. Amazon Bedrock met these needs by providing a fully managed, serverless interface to leading foundation models—without the need to manage infrastructure or deploy custom machine learning stacks.

Unlike hosting models on Amazon Elastic Compute Cloud (Amazon EC2) or Amazon SageMaker, Amazon Bedrock abstracts away provisioning, versioning, scaling, and model selection. Its consumption-based pricing aligns directly with msg’s SaaS delivery model—resources are used (and billed) only when needed. This simplified integration reduced overhead and helped msg scale elastically as customer demand grew.

Amazon Bedrock also helped msg meet compliance goals under the EU AI Act and GDPR by enabling tightly scoped, auditable interactions with model APIs—critical for HR use cases that handle sensitive workforce data.

Conclusion

msg’s successful integration of Amazon Bedrock into msg.ProfileMap demonstrates that large-scale AI adoption doesn’t require complex infrastructure or specialized model training. By combining modular design, ontology-based harmonization, and the fully managed LLM capabilities of Amazon Bedrock, msg delivered an AI-powered workforce intelligence platform that is accurate, scalable, and compliant.

This solution improved concept match precision and achieved top marks in international AI benchmarks, demonstrating what’s possible when generative AI is paired with the right cloud-based service. With Amazon Bedrock, msg has built a platform that’s ready for today’s HR challenges—and tomorrow’s.

msg.ProfileMap is available as a SaaS offering on AWS Marketplace. If you are interested in knowing more, you can reach out to msg.hcm.backoffice@msg.group.

The content and opinions in this blog post are those of the third-party author and AWS is not responsible for the content or accuracy of this post.


About the authors

Stefan Walter is Senior Vice President of AI SaaS Solutions at msg. With over 25 years of experience in IT software development, architecture, and consulting, Stefan Walter leads with a vision for scalable SaaS innovation and operational excellence. As a BU lead at msg, Stefan has spearheaded transformative initiatives that bridge business strategy with technology execution, especially in complex, multi-entity environments.

Gianluca Vegetti is a Senior Enterprise Architect in the AWS Partner Organization, aligned to Strategic Partnership Collaboration and Governance (SPCG) engagements. In his role, he supports the definition and execution of Strategic Collaboration Agreements with selected AWS partners.

Yuriy Bezsonov is a Senior Partner Solution Architect at AWS. With over 25 years in tech, Yuriy has progressed from a software developer to an engineering manager and Solutions Architect. Now, as a Senior Solutions Architect at AWS, he assists partners and customers in developing cloud solutions, focusing on container technologies, Kubernetes, Java, application modernization, SaaS, developer experience, and GenAI. Yuriy holds AWS and Kubernetes certifications, and he is a recipient of the AWS Golden Jacket and the CNCF Kubestronaut Blue Jacket.


