

6 Prompt Engineering Courses: Free Training for Beginners

eWEEK content and product recommendations are editorially independent. We may make money when you click on links to our partners.

Artificial intelligence is a highly competitive field, and breaking into it requires a strategic approach. Beginners can seek out and earn certifications to learn the relevant skills needed to excel in the industry and demonstrate their understanding of core AI areas. Mastering how to craft and use prompts effectively is one of the first steps you can take to start exploring various AI tools and models, and the easiest way to get experience in prompt engineering is by enrolling in short, free online courses intended for beginners.

I evaluated a number of courses to see how they compared. Here are my picks for the six top free prompt engineering certifications for beginners:


Free Prompt Engineering Course Comparison

The chart below summarizes the certifying body, duration, use cases, and support solutions for the top six free prompt engineering courses for beginners. To choose which suits your interests and professional goals, continue reading for more detailed information.

Top 6 Free Prompt Engineering Courses for Beginners

Prompt engineering is the art of crafting effective prompts to guide AI models, and mastering this field is a valuable skill. While many paid courses are available, a free prompt engineering course offers an excellent starting point for beginners. It provides a solid foundation in the fundamentals of prompt engineering, allowing you to experiment with the capabilities of AI models without any financial commitment.

Understanding Prompt Engineering

Best for Learning Prompting with ChatGPT

Who It’s For: Beginners looking for a quick introductory course on prompt engineering using ChatGPT covering basic concepts and hands-on practice.

This beginner-friendly course, offered by DataCamp, covers the essentials of prompt engineering and the skills needed to master using ChatGPT. The course starts by unpacking the fundamental concepts of prompt engineering, teaching you to construct clear, specific, and open-ended prompts. It will also help you explore basic techniques and more advanced strategies like zero-shot, one-shot, and few-shot prompting using ChatGPT. Additionally, this course will equip you with skills to assess the quality of ChatGPT’s responses, ensuring that you know how to check the accuracy and relevance of its answers.

Why I Picked It

I chose this course for its accessibility and structured curriculum designed for beginners who want to explore prompt engineering with ChatGPT. It covers both fundamental prompt engineering concepts and specialized training techniques that apply to ChatGPT. The short course also offers hands-on experience writing your very first prompt in a fun and interactive way. However, only the first chapter is completely free; full access requires DataCamp’s paid plan, which starts at $13 per month, billed annually.

Skills Acquired

  • Basic knowledge of prompt engineering
  • Prompt strategies and techniques
  • Advanced prompt engineering

Key Course Details

The following is a high-level overview of what you need to know about course requirements, fees, duration, format, and content:

Course Requirements

Course Fee, Duration, and Format

  • Free (Chapter One)
  • Starts at $13 per month, billed annually for full access
  • One hour to complete
  • Self-paced online learning via DataCamp

Course Content and Assessments

  • Introduction to prompt engineering concepts and techniques
  • Practical demonstration with ChatGPT
  • Quick prompting exercises

The remaining two chapters are only accessible to paid users. These chapters include exercises and tests on advanced prompt engineering and prompting techniques. The course “Understanding Prompt Engineering” is part of the ChatGPT Fundamentals skill track. It helps learners master prompt crafting to maximize the AI chatbot’s capabilities.

ChatGPT Prompt Engineering

Best for Understanding Prompt Engineering Concepts

Who It’s For: Anyone who wants to learn basic prompt engineering concepts and practical examples.

The ChatGPT Prompt Engineering course, hosted on Udemy, is an excellent introductory lesson to prompt engineering concepts. It offers beginners a foundational knowledge of prompt engineering, artificial intelligence, large language models (LLMs), generative text models, and natural language processing (NLP). Unlike other basic ChatGPT lessons, this course also discusses how to use prompt engineering with Python and apply it to various use cases. Additionally, this course covers different practical examples of how prompt engineering is used in real-world cases.

ChatGPT Prompt Engineering course title screenshot.

Why I Picked It

The ChatGPT Prompt Engineering course offers digestible information on basic prompt engineering concepts and real-world examples. It’s also easily accessible, as you can simply sign up and view an hour of on-demand videos without any financial commitment. While it doesn’t offer hands-on exercises like other courses, learners can study and progress at their own pace through short videos. You can pause, rewind, or fast-forward as needed, making it easier for you to adjust to your learning style and schedule.

Skills Acquired

  • Prompt engineering definition
  • Different types of prompts
  • Prompt engineering terms (AI, NLP, GPT, and LLM)
  • Prompt engineering practical examples

Key Course Details

The following is a high-level overview of what you need to know about course requirements, fees, duration, format, and content:

Course Requirements

Course Fee, Duration, and Format

  • Free
  • One hour to complete
  • Self-paced online learning via Udemy 

Course Content and Assessments

  • Introduction to ChatGPT prompt engineering
  • Prompt engineering terms and concepts
  • Practical examples

Essentials of Prompt Engineering

Best for Mastering Prompt Engineering Techniques

Who It’s For: Beginners who want to learn more about different prompting techniques.

The Essentials of Prompt Engineering course offered by Amazon Web Services (AWS) via Coursera delves into the fundamentals of crafting effective prompts. In this course, beginners will learn how to craft, refine, and use prompts for different real-world use cases. You will explore various techniques like zero-shot, few-shot, and chain-of-thought prompting and learn how to fine-tune prompts for optimal results. Aside from basic prompting techniques, you will also understand how to identify possible risks associated with prompt engineering through readings and a self-reflective quiz.
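If these terms are new to you, the difference is easiest to see side by side. The following illustrative prompts are my own shorthand examples, not material from the AWS course:

zero_shot = "Classify this review as positive or negative: 'The battery dies within an hour.'"

few_shot = """Review: 'Great camera, fast shipping.' -> positive
Review: 'Arrived broken and support never replied.' -> negative
Review: 'The battery dies within an hour.' ->"""

chain_of_thought = ("Classify this review as positive or negative. "
                    "Reason step by step about the reviewer's main complaint before answering: "
                    "'The battery dies within an hour.'")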

AWS' Essentials of Prompt Engineering course title screenshot.

Why I Picked It

The AWS Essentials of Prompt Engineering is a great starter course for anyone who wants to learn about different prompting techniques. The course is easy to follow and provides clear examples of various prompting techniques, allowing beginners to choose which one to master later. While this course doesn’t offer a shareable certification, you still have the advantage of learning from experts in the industry. This short course is offered by Amazon Web Services (AWS), a leading provider of AI and cloud computing solutions, giving you access to real-world industry knowledge.

Skills Acquired

  • Learn about various prompt engineering techniques
  • Understand how to modify prompts effectively
  • Learn different prompting strategies
  • Identify prompt misuses and risks

Key Course Details

The following is a high-level overview of what you need to know about course requirements, fees, duration, format, and content:

Course Requirements

Course Fee, Duration, and Format

  • Free
  • One hour to complete
  • Self-paced online learning via Coursera

Course Content and Assessments

  • Prompt basics
  • Prompt misuses and risks

Learners must pass the knowledge check to finish this short module on prompt engineering essentials. 

ChatGPT for Everyone

Best for Using ChatGPT to Maximize Productivity

Who It’s For: Beginners who want to learn how to use ChatGPT to improve their productivity with different tasks.

Learn Prompting’s ChatGPT for Everyone is a user-friendly course created in collaboration with OpenAI. The course breaks down what ChatGPT is, how to use it, and how to write your first basic prompt. It also tackles how to set up and use ChatGPT to improve your productivity for various purposes. The easy-to-follow, on-demand videos are taught by Sander Schulhoff, founder and chief executive officer of Learn Prompting, and Shyamal Anadkat, a member of the Applied AI team at OpenAI.

ChatGPT for Everyone course title screenshot.

Why I Picked It

The ChatGPT for Everyone course stands out for its focus on practical applications and maximizing ChatGPT’s capability to increase productivity in different use cases. Unlike some technical AI certifications, this course prioritizes simple explanations and practical examples of real-world ChatGPT usage. The instructors discuss how to use ChatGPT for learning, mock interview preparation, tech support, personal tutoring, software development, and more. You will also learn about ChatGPT’s limitations, biases, and data privacy concerns so you can fully evaluate the generative AI tool before deciding to use it.

Skills Acquired

  • Understand the practical applications of prompt engineering
  • Familiarity with ChatGPT, GPT-3.5, GPT-4, and DALL·E 3
  • Learn how to use ChatGPT to maximize productivity

Key Course Details

The following is a high-level overview of what you need to know about course requirements, fees, duration, format, and content:

Course Requirements

Course Fee, Duration, and Format

  • Free
  • One hour to complete
  • Self-paced online learning via Learn Prompting 

Course Content and Assessments

  • Setting up ChatGPT
  • ChatGPT’s use cases
  • Basics of prompt engineering
  • Advanced ChatGPT interface and features
  • Limitations and bias
  • Data privacy

Prompt Engineering with Llama 2 & 3

Best for Learning Prompting with Llama 2 and 3

Who It’s For: Anyone who wants to learn prompt engineering and Meta Llama 2 and Llama 3 models.

Prompt Engineering with Llama 2 & 3 is a project-based course offered by DeepLearning.AI and Coursera. It will help you learn how to craft prompts using Meta’s Llama 2 and 3 models to build applications and complete day-to-day tasks. Through the course’s hands-on approach, you’ll be able to experiment with prompt engineering techniques, learn how to use Code Llama to write and improve code, and understand how to use LLMs responsibly. This short course is facilitated by Amit Sangani, the Senior Director of Partner Engineering at DeepLearning.AI.

DeepLearning.AI's Prompt Engineering with Llama 2 and 3 course screenshot.

Why I Picked It

I chose Prompt Engineering with Llama 2 & 3 because it’s an excellent starter for anyone interested in learning prompt engineering who wants to test Meta’s Llama models. You can learn various techniques for working effectively with Meta’s Llama models through a hands-on project you can add to your portfolio. The course is also accessible on the cloud, so you can use essential tools and resources without downloading or installing anything. Additionally, this course will introduce you to a thriving community of open-source developers building applications powered by Llama 2 and 3, which can be a first step toward networking with other professionals in the AI industry.

Skills Acquired

  • Best practices for prompting Llama 2 and 3 models
  • How to build safe and responsible AI applications using the Llama Guard model
  • How to effectively interact with Meta Llama 2 Chat, Code Llama, and Llama Guard models

Key Course Details

The following is a high-level overview of what you need to know about course requirements, fees, duration, format, and content:

Course Requirements

Course Fee, Duration, and Format

  • Free
  • One hour to complete
  • Self-paced online learning via Coursera

Course Content and Assessments

  • Best practices for prompting with Llama 2 and 3 models
  • Using advanced prompting techniques with Llama 2
  • Applying Code Llama to write and improve code
  • Promoting safe and responsible use of LLMs using Llama Guard

Introduction to Prompt Engineering

Best for Understanding Prompt Engineering for Application Development

Who It’s For: Developers who want to learn how to use ChatGPT to create applications.

DataCamp’s Introduction to Prompt Engineering provides a solid foundation for learners who want to master prompt crafting for developing applications. The free chapter discusses prompt engineering principles, different types of prompts, and how to create structured outputs and conditional prompts. The paid chapters discuss advanced prompt engineering strategies, prompt engineering’s real-world business applications, and developing chatbots using prompts. While learners need a paid subscription to access the full course, the free chapter is a good starting point to learn more about using prompting for app development through comprehensive videos you can watch at your own pace.

ChatGPT Prompt Engineering for Developers course title screenshot.

Why I Picked It

Developers who seek to apply ChatGPT to their projects will find this course an ideal starting point to explore prompt engineering. While this is not a complete beginner’s course, it offers a comprehensive introduction for developers new to ChatGPT and how they can use it for different use cases. It also has a simple prerequisite: a beginner course on working with the OpenAI API. However, you have to upgrade to a paid plan, starting at $13 per month billed annually, for full access to the course. Fortunately, you can access more free chapters on DataCamp to explore other courses on AI, LLMs, and software development at no cost.

Skills Acquired

  • Crafting effective prompts for building applications
  • Advanced techniques for prompt engineering
  • Using prompt engineering for business use cases
  • Chatbot development

Key Course Details

The following is a high-level overview of what you need to know about course requirements, fees, duration, format, and content:

Course Requirements

Course Fee, Duration, and Format

  • Free (Chapter One)
  • Additional chapters start at $13 per month, billed annually
  • Four hours to complete
  • Self-paced online learning via DataCamp

Course Content and Assessments

  • Introduction to prompt engineering
  • Structured outputs and conditional prompts

Frequently Asked Questions (FAQs)

Is a Prompt Engineering Certificate Worth It?

A prompt engineering certificate can be valuable, especially if you want to pursue a career in artificial intelligence. You can validate your skills to potential employers or clients and also gain the essential skills and knowledge you need to excel in the AI industry. However, the value of a certificate depends on the certifying body or institution that issued it and the specific skills it validates.

It’s essential to plan which prompt engineering courses or generative AI programs you should invest your time and resources in. Free courses for beginners are a great place to start since you don’t have to commit financially right away and you can also study at your own pace.

Can I Become a Prompt Engineer Without a Degree?

While having a degree in computer science or a related field can be beneficial, it’s possible to become a prompt engineer without one. Many AI professionals with diverse backgrounds have successfully transitioned into prompt engineering roles as long as they have the necessary skills and experience. If you’re interested in building a career in prompt engineering, gaining relevant AI certifications is well worth the time and money involved.

It’s also essential to explore other AI topics to expand your knowledge and skills, such as generative AI, machine learning, deep learning, and more. Another way to widen your perspective in prompt engineering and relevant fields is to attend top AI conferences to meet other AI professionals who can be a part of your network.

Does Prompt Engineering Require Coding?

Prompt engineering isn’t exclusively dependent on coding skills, but you should at least have foundational knowledge. Even if you’re not coding directly, prompt engineers have to work with other AI professionals and help guide an automation team.

While the primary focus of a prompt engineer is understanding natural language processing (NLP) and crafting effective prompts, knowing how to code in Python helps you learn NLP and deep learning models more easily. Additionally, mastering Python and other programming languages, if possible, gives you an advantage in a competitive AI field.

Bottom Line: Best Free Prompt Engineering Courses

Prompt engineering and generative AI are rapidly growing fields, and gaining a certification will give you an advantage in this competitive industry. There are many free certifications available to kickstart your prompt engineering career. The best option should align with your professional objectives and learning style. My recommendations include the top prompt engineering courses that beginners can take without financial commitment. Before deciding on the prompt engineering program you want, consider your long-term goals, the specific LLM you’re interested in, and the resources you’re willing to invest.

Learn more about visionaries shaping the field of artificial intelligence and generative AI by reading our list of the top AI companies and leading generative AI companies.




Qwen3 family of reasoning models now available in Amazon Bedrock Marketplace and Amazon SageMaker JumpStart

Today, we are excited to announce that Qwen3, the latest generation of large language models (LLMs) in the Qwen family, is available through Amazon Bedrock Marketplace and Amazon SageMaker JumpStart. With this launch, you can deploy the Qwen3 models—available in 0.6B, 4B, 8B, and 32B parameter sizes—to build, experiment, and responsibly scale your generative AI applications on AWS.

In this post, we demonstrate how to get started with Qwen3 on Amazon Bedrock Marketplace and SageMaker JumpStart. You can follow similar steps to deploy the distilled versions of the models as well.

Solution overview

Qwen3 is the latest generation of LLMs in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

  • Unique support for seamless switching between thinking mode and non-thinking mode within a single model, providing optimal performance across various scenarios.
  • Significantly enhanced in its reasoning capabilities, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning.
  • Good human preference alignment, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience.
  • Expertise in agent capabilities, enabling precise integration with external tools in both thinking and non-thinking modes and achieving leading performance among open source models in complex agent-based tasks.
  • Support for over 100 languages and dialects with strong capabilities for multilingual instruction following and translation.

Prerequisites

To deploy Qwen3 models, make sure you have access to the recommended instance types based on the model size. You can find these instance recommendations on Amazon Bedrock Marketplace or the SageMaker JumpStart console. To verify you have the necessary resources, complete the following steps:

  1. Open the Service Quotas console.
  2. Under AWS Services, select Amazon SageMaker.
  3. Check that you have sufficient quota for the required instance type for endpoint deployment.
  4. Make sure at least one of these instance types is available in your target AWS Region.

If needed, request a quota increase and contact your AWS account team for support.
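If you prefer to verify quotas programmatically, the following sketch is one way to do it (my own example, not part of the original steps); it uses the Service Quotas API via boto3 to list the SageMaker quotas that govern endpoint usage:

import boto3

# List applied SageMaker quotas and print the ones that govern endpoint usage.
quotas = boto3.client("service-quotas", region_name="us-west-2")

paginator = quotas.get_paginator("list_service_quotas")
for page in paginator.paginate(ServiceCode="sagemaker"):
    for quota in page["Quotas"]:
        # Quota names look like "ml.g5.12xlarge for endpoint usage".
        if "endpoint usage" in quota["QuotaName"]:
            print(f"{quota['QuotaName']}: {quota['Value']}")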

Deploy Qwen3 in Amazon Bedrock Marketplace

Amazon Bedrock Marketplace gives you access to over 100 popular, emerging, and specialized foundation models (FMs) through Amazon Bedrock. To access Qwen3 in Amazon Bedrock, complete the following steps:

  1. On the Amazon Bedrock console, in the navigation pane under Foundation models, choose Model catalog.
  2. Filter for Hugging Face as a provider and choose a Qwen3 model. For this example, we use the Qwen3-32B model.

The model detail page provides essential information about the model’s capabilities, pricing structure, and implementation guidelines. You can find detailed usage instructions, including sample API calls and code snippets for integration.

The page also includes deployment options and licensing information to help you get started with Qwen3-32B in your applications.

  3. To begin using Qwen3-32B, choose Deploy.

The details page displays comprehensive information about the Qwen3 32B model, including its version, delivery method, release date, model ID, and deployment status. The interface includes deployment options and playground access.

You will be prompted to configure the deployment details for Qwen3-32B. The model ID will be pre-populated.

  4. For Endpoint name, enter an endpoint name (between 1–50 alphanumeric characters).
  5. For Number of instances, enter a number of instances (between 1–100).
  6. For Instance type, choose your instance type. For optimal performance with Qwen3-32B, a GPU-based instance type like ml.g5.12xlarge is recommended.
  7. To deploy the model, choose Deploy.

The deployment configuration page displays essential settings for hosting a Bedrock model endpoint in SageMaker. It includes fields for Model ID, Endpoint name, Number of instances, and Instance type selection.

When the deployment is complete, you can test Qwen3-32B’s capabilities directly in the Amazon Bedrock playground.

  8. Choose Open in playground to access an interactive interface where you can experiment with different prompts and adjust model parameters like temperature and maximum length.

This is an excellent way to explore the model’s reasoning and text generation abilities before integrating it into your applications. The playground provides immediate feedback, helping you understand how the model responds to various inputs and letting you fine-tune your prompts for optimal results. You can quickly test the model in the playground through the UI. However, to invoke the deployed model programmatically with any Amazon Bedrock APIs, you must have the endpoint Amazon Resource Name (ARN).

Enable reasoning and non-reasoning responses with Converse API

The following code shows how to turn reasoning on and off with Qwen3 models using the Converse API, depending on your use case. By default, reasoning is left on for Qwen3 models, but you can streamline interactions by using the /no_think command within your prompt. When you add this to the end of your query, reasoning is turned off and the models will provide just the direct answer. This is particularly useful when you need quick information without explanations, are familiar with the topic, or want to maintain a faster conversational flow. At the time of writing, the Converse API doesn’t support tool use for Qwen3 models. Refer to the Invoke_Model API example later in this post to learn how to use reasoning and tools in the same completion.

import boto3
from botocore.exceptions import ClientError

# Create a Bedrock Runtime client in the AWS Region you want to use.
client = boto3.client("bedrock-runtime", region_name="us-west-2")

# Configuration
model_id = ""  # Replace with Bedrock Marketplace endpoint arn

# Start a conversation with the user message.
user_message = "hello, what is 1+1 /no_think" #remove /no_think to leave default reasoning on
conversation = [
    {
        "role": "user",
        "content": [{"text": user_message}],
    }
]

try:
    # Send the message to the model, using a basic inference configuration.
    response = client.converse(
        modelId=model_id,
        messages=conversation,
        inferenceConfig={"maxTokens": 512, "temperature": 0.5, "topP": 0.9},
    )

    # Extract and print the response text.
    #response_text = response["output"]["message"]["content"][0]["text"]
    #reasoning_content = response ["output"]["message"]["reasoning_content"][0]["text"]
    #print(response_text, reasoning_content)
    print(response)
    
except (ClientError, Exception) as e:
    print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
    exit(1)

The following is a response using the Converse API, without default thinking:

{'ResponseMetadata': {'RequestId': 'f7f3953a-5747-4866-9075-fd4bd1cf49c4', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Tue, 17 Jun 2025 18:34:47 GMT', 'content-type': 'application/json', 'content-length': '282', 'connection': 'keep-alive', 'x-amzn-requestid': 'f7f3953a-5747-4866-9075-fd4bd1cf49c4'}, 'RetryAttempts': 0}, 'output': {'message': {'role': 'assistant', 'content': [{'text': '\n\nHello! The result of 1 + 1 is **2**. 😊'}, {'reasoningContent': {'reasoningText': {'text': '\n\n'}}}]}}, 'stopReason': 'end_turn', 'usage': {'inputTokens': 20, 'outputTokens': 22, 'totalTokens': 42}, 'metrics': {'latencyMs': 1125}}

The following is an example with default thinking on; the tokens are automatically parsed into the reasoningContent field for the Converse API:

{'ResponseMetadata': {'RequestId': 'b6d2ebbe-89da-4edc-9a3a-7cb3e7ecf066', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Tue, 17 Jun 2025 18:32:28 GMT', 'content-type': 'application/json', 'content-length': '1019', 'connection': 'keep-alive', 'x-amzn-requestid': 'b6d2ebbe-89da-4edc-9a3a-7cb3e7ecf066'}, 'RetryAttempts': 0}, 'output': {'message': {'role': 'assistant', 'content': [{'text': '\n\nHello! The sum of 1 + 1 is **2**. Let me know if you have any other questions or need further clarification! 😊'}, {'reasoningContent': {'reasoningText': {'text': '\nOkay, the user asked "hello, what is 1+1". Let me start by acknowledging their greeting. They might just be testing the water or actually need help with a basic math problem. Since it\'s 1+1, it\'s a very simple question, but I should make sure to answer clearly. Maybe they\'re a child learning math for the first time, or someone who\'s not confident in their math skills. I should provide the answer in a friendly and encouraging way. Let me confirm that 1+1 equals 2, and maybe add a brief explanation to reinforce their understanding. I can also offer further assistance in case they have more questions. Keeping it conversational and approachable is key here.\n'}}}]}}, 'stopReason': 'end_turn', 'usage': {'inputTokens': 16, 'outputTokens': 182, 'totalTokens': 198}, 'metrics': {'latencyMs': 7805}}
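Based on the response shapes above, a small helper such as the following (my own sketch, not code from the original post, reusing the response object from the earlier snippet) separates the answer text from the reasoning trace returned by the Converse API:

def split_converse_output(response):
    """Return (answer_text, reasoning_text) from a Converse API response."""
    answer_parts, reasoning_parts = [], []
    for block in response["output"]["message"]["content"]:
        if "text" in block:
            answer_parts.append(block["text"])
        elif "reasoningContent" in block:
            reasoning_parts.append(block["reasoningContent"]["reasoningText"]["text"])
    return "".join(answer_parts).strip(), "".join(reasoning_parts).strip()

answer, reasoning = split_converse_output(response)
print("Answer:", answer)
print("Reasoning:", reasoning or "(reasoning suppressed with /no_think)")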

Perform reasoning and function calls in the same completion using the Invoke_Model API

With Qwen3, you can stream an explicit trace and the exact JSON tool call in the same completion. Up until now, reasoning models have forced the choice to either show the chain of thought or call tools deterministically. The following code shows an example:

import json

# Reuses the Bedrock Runtime client and Marketplace endpoint ARN (model_id) from the previous example.
messages = json.dumps({
    "messages": [
        {
            "role": "user",
            "content": "Hi! How are you doing today?"
        }, 
        {
            "role": "assistant",
            "content": "I'm doing well! How can I help you?"
        }, 
        {
            "role": "user",
            "content": "Can you tell me what the temperate will be in Dallas, in fahrenheit?"
        }
    ],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type":
                            "string",
                        "description":
                            "The city to find the weather for, e.g. 'San Francisco'"
                    },
                    "state": {
                        "type":
                            "string",
                        "description":
                            "the two-letter abbreviation for the state that the city is in, e.g. 'CA' which would mean 'California'"
                    },
                    "unit": {
                        "type": "string",
                        "description":
                            "The unit to fetch the temperature in",
                        "enum": ["celsius", "fahrenheit"]
                    }
                },
                "required": ["city", "state", "unit"]
            }
        }
    }],
    "tool_choice": "auto"
})

response = client.invoke_model(
    modelId=model_id,
    body=messages
)
print(response)
model_output = json.loads(response['body'].read())
print(json.dumps(model_output, indent=2))

Response:

{'ResponseMetadata': {'RequestId': '5da8365d-f4bf-411d-a783-d85eb3966542', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Tue, 17 Jun 2025 18:57:38 GMT', 'content-type': 'application/json', 'content-length': '1148', 'connection': 'keep-alive', 'x-amzn-requestid': '5da8365d-f4bf-411d-a783-d85eb3966542', 'x-amzn-bedrock-invocation-latency': '6396', 'x-amzn-bedrock-output-token-count': '148', 'x-amzn-bedrock-input-token-count': '198'}, 'RetryAttempts': 0}, 'contentType': 'application/json', 'body': }
{
  "id": "chatcmpl-bc60b482436542978d233b13dc347634",
  "object": "chat.completion",
  "created": 1750186651,
  "model": "lmi",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "reasoning_content": "\nOkay, the user is asking about the weather in San Francisco. Let me check the tools available. There's a get_weather function that requires location and unit. The user didn't specify the unit, so I should ask them if they want Celsius or Fahrenheit. Alternatively, maybe I can assume a default, but since the function requires it, I need to include it. I'll have to prompt the user for the unit they prefer.\n",
        "content": "\n\nThe user hasn't specified whether they want the temperature in Celsius or Fahrenheit. I need to ask them to clarify which unit they prefer.\n\n",
        "tool_calls": [
          {
            "id": "chatcmpl-tool-fb2f93f691ed4d8ba94cadc52b57414e",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"San Francisco, CA\", \"unit\": \"celsius\"}"
            }
          }
        ]
      },
      "logprobs": null,
      "finish_reason": "tool_calls",
      "stop_reason": null
    }
  ],
  "usage": {
    "prompt_tokens": 198,
    "total_tokens": 346,
    "completion_tokens": 148,
    "prompt_tokens_details": null
  },
  "prompt_logprobs": null
}
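Because finish_reason is tool_calls, your application is expected to execute the requested function and return the result to the model in a follow-up request. The following sketch shows that dispatch step; the local get_current_weather stub and the tool-name mapping are placeholders of my own for illustration:

import json

def get_current_weather(city=None, state=None, unit=None, location=None):
    # Placeholder implementation; replace with a real weather lookup.
    return {"location": location or f"{city}, {state}", "unit": unit, "temperature": 78}

# Map the tool names the model may emit to local implementations.
available_tools = {"get_current_weather": get_current_weather, "get_weather": get_current_weather}

choice = model_output["choices"][0]
if choice["finish_reason"] == "tool_calls":
    for tool_call in choice["message"]["tool_calls"]:
        name = tool_call["function"]["name"]
        args = json.loads(tool_call["function"]["arguments"])
        result = available_tools[name](**args)
        # Append the tool result as a new message and call invoke_model again
        # so the model can produce its final answer.
        print(f"{name}({args}) -> {result}")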

Deploy Qwen3-32B with SageMaker JumpStart

SageMaker JumpStart is a machine learning (ML) hub with FMs, built-in algorithms, and prebuilt ML solutions that you can deploy with just a few clicks. With SageMaker JumpStart, you can customize pre-trained models to your use case, with your data, and deploy them into production using either the UI or SDK. Deploying the Qwen3-32B model through SageMaker JumpStart offers two convenient approaches: using the intuitive SageMaker JumpStart UI or implementing programmatically through the SageMaker Python SDK. Let’s explore both methods to help you choose the approach that best suits your needs.

Deploy Qwen3-32B through SageMaker JumpStart UI

Complete the following steps to deploy Qwen3-32B using SageMaker JumpStart:

  1. On the SageMaker console, choose Studio in the navigation pane.
  2. First-time users will be prompted to create a domain.
  3. On the SageMaker Studio console, choose JumpStart in the navigation pane.

The model browser displays available models, with details like the provider name and model capabilities.

The SageMaker Studio Public Hub interface displays a grid of AI model providers, including Meta, DeepSeek, HuggingFace, and AWS, each showing their model counts and Bedrock integration status. The page includes a navigation sidebar and search functionality.

  4. Search for Qwen3 to view the Qwen3-32B model card.

Each model card shows key information, including:

  • Model name
  • Provider name
  • Task category (for example, Text Generation)
  • Bedrock Ready badge (if applicable), indicating that this model can be registered with Amazon Bedrock, so you can use Amazon Bedrock APIs to invoke the model

The SageMaker interface shows search results for "qwen3" displaying four text generation models from Qwen, each marked as Bedrock ready. Models range from 0.6B to 32B in size with consistent formatting and capabilities.

  5. Choose the model card to view the model details page.

The model details page includes the following information:

  • The model name and provider information
  • A Deploy button to deploy the model
  • About and Notebooks tabs with detailed information

The About tab includes important details, such as:

  • Model description
  • License information
  • Technical specifications
  • Usage guidelines

Screenshot of the SageMaker Studio interface displaying details about the Qwen3 32B language model, including its main features, capabilities, and deployment options. The interface shows tabs for About and Notebooks, with action buttons for Train, Deploy, Optimize, and Evaluate.

Before you deploy the model, it’s recommended to review the model details and license terms to confirm compatibility with your use case.

  6. Choose Deploy to proceed with deployment.
  7. For Endpoint name, use the automatically generated name or create a custom one.
  8. For Instance type, choose an instance type (default: ml.g6.12xlarge).
  9. For Initial instance count, enter the number of instances (default: 1).

Selecting appropriate instance types and counts is crucial for cost and performance optimization. Monitor your deployment to adjust these settings as needed. Under Inference type, Real-time inference is selected by default. This is optimized for sustained traffic and low latency.

  10. Review all configurations for accuracy. For this model, we strongly recommend adhering to SageMaker JumpStart default settings and making sure that network isolation remains in place.
  11. Choose Deploy to deploy the model.

A deployment configuration screen in SageMaker Studio showing endpoint settings, instance type selection, and real-time inference options. The interface includes fields for endpoint name, instance type (ml.g5.12xlarge), and initial instance count.

The deployment process can take several minutes to complete.

When deployment is complete, your endpoint status will change to InService. At this point, the model is ready to accept inference requests through the endpoint. You can monitor the deployment progress on the SageMaker console Endpoints page, which will display relevant metrics and status information. When the deployment is complete, you can invoke the model using a SageMaker runtime client and integrate it with your applications.
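For example, once the endpoint status is InService, a request like the following can be sent with the SageMaker runtime client; the endpoint name and payload schema here are assumptions of mine based on the SDK example later in this post:

import json
import boto3

runtime = boto3.client("sagemaker-runtime", region_name="us-west-2")

payload = {
    "inputs": "Explain what a mixture-of-experts model is in one paragraph.",
    "parameters": {"max_new_tokens": 256, "temperature": 0.6, "top_p": 0.9},
}

response = runtime.invoke_endpoint(
    EndpointName="qwen3-32b-endpoint",  # Replace with your endpoint name.
    ContentType="application/json",
    Body=json.dumps(payload),
)
print(json.loads(response["Body"].read()))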

Deploy Qwen3-32B using the SageMaker Python SDK

To get started with Qwen3-32B using the SageMaker Python SDK, you must install the SageMaker Python SDK and make sure you have the necessary AWS permissions and environment set up. The following is a step-by-step code example that demonstrates how to deploy and use Qwen3-32B for inference programmatically:

!pip install --force-reinstall --no-cache-dir sagemaker==2.235.2

from sagemaker.serve.builder.model_builder import ModelBuilder 
from sagemaker.serve.builder.schema_builder import SchemaBuilder 
from sagemaker.jumpstart.model import ModelAccessConfig 
from sagemaker.session import Session 
import logging 

sagemaker_session = Session()
artifacts_bucket_name = sagemaker_session.default_bucket() 
execution_role_arn = sagemaker_session.get_caller_identity_arn()

# Changed to Qwen32B model
js_model_id = "huggingface-reasoning-qwen3-32b"
gpu_instance_type = "ml.g5.12xlarge"

response = "Hello, I'm a language model, and I'm here to help you with your English."

sample_input = {
    "inputs": "Hello, I'm a language model,",
    "parameters": {
        "max_new_tokens": 128, 
        "top_p": 0.9, 
        "temperature": 0.6
    }
}

sample_output = [{"generated_text": response}]

schema_builder = SchemaBuilder(sample_input, sample_output)

model_builder = ModelBuilder( 
    model=js_model_id, 
    schema_builder=schema_builder, 
    sagemaker_session=sagemaker_session, 
    role_arn=execution_role_arn, 
    log_level=logging.ERROR 
) 

model = model_builder.build() 

predictor = model.deploy(
    model_access_configs={js_model_id: ModelAccessConfig(accept_eula=True)}, 
    accept_eula=True
) 

predictor.predict(sample_input)

You can run additional requests against the predictor:

new_input = {
    "inputs": "What is Amazon doing in Generative AI?",
    "parameters": {"max_new_tokens": 64, "top_p": 0.8, "temperature": 0.7},
}

prediction = predictor.predict(new_input)
print(prediction)

The following example adds error handling and retry logic as best practices to enhance the deployment code:

# Enhanced deployment code with error handling
import backoff
import botocore
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@backoff.on_exception(backoff.expo, 
                     (botocore.exceptions.ClientError,),
                     max_tries=3)
def deploy_model_with_retries(model_builder, model_id):
    try:
        model = model_builder.build()
        predictor = model.deploy(
            model_access_configs={model_id:ModelAccessConfig(accept_eula=True)},
            accept_eula=True
        )
        return predictor
    except Exception as e:
        logger.error(f"Deployment failed: {str(e)}")
        raise

def safe_predict(predictor, input_data):
    try:
        return predictor.predict(input_data)
    except Exception as e:
        logger.error(f"Prediction failed: {str(e)}")
        return None
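
You could then wrap the earlier deployment and inference calls with these helpers, for example:

predictor = deploy_model_with_retries(model_builder, js_model_id)

result = safe_predict(predictor, sample_input)
if result is None:
    logger.warning("Prediction failed; consider retrying or checking the endpoint status.")
else:
    print(result)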

Clean up

To avoid unwanted charges, complete the steps in this section to clean up your resources.

Delete the Amazon Bedrock Marketplace deployment

If you deployed the model using Amazon Bedrock Marketplace, complete the following steps:

  1. On the Amazon Bedrock console, under Foundation models in the navigation pane, choose Marketplace deployments.
  2. In the Managed deployments section, locate the endpoint you want to delete.
  3. Select the endpoint, and on the Actions menu, choose Delete.
  4. Verify the endpoint details to make sure you’re deleting the correct deployment:
    1. Endpoint name
    2. Model name
    3. Endpoint status
  5. Choose Delete to delete the endpoint.
  6. In the deletion confirmation dialog, review the warning message, enter confirm, and choose Delete to permanently remove the endpoint.

Delete the SageMaker JumpStart predictor

The SageMaker JumpStart model you deployed will incur costs if you leave it running. Use the following code to delete the endpoint if you want to stop incurring charges. For more details, see Delete Endpoints and Resources.

predictor.delete_model()
predictor.delete_endpoint()

Conclusion

In this post, we explored how you can access and deploy the Qwen3 models using Amazon Bedrock Marketplace and SageMaker JumpStart. With support for both the full parameter models and its distilled versions, you can choose the optimal model size for your specific use case. Visit SageMaker JumpStart in Amazon SageMaker Studio or Amazon Bedrock Marketplace to get started. For more information, refer to Use Amazon Bedrock tooling with Amazon SageMaker JumpStart models, SageMaker JumpStart pretrained models, Amazon SageMaker JumpStart Foundation Models, Amazon Bedrock Marketplace, and Getting started with Amazon SageMaker JumpStart.

The Qwen3 family of LLMs offers exceptional versatility and performance, making it a valuable addition to the AWS foundation model offerings. Whether you’re building applications for content generation, analysis, or complex reasoning tasks, Qwen3’s advanced architecture and extensive context window make it a powerful choice for your generative AI needs.


About the authors

Niithiyn Vijeaswaran is a Generative AI Specialist Solutions Architect with the Third-Party Model Science team at AWS. His area of focus is AWS AI accelerators (AWS Neuron). He holds a Bachelor’s degree in Computer Science and Bioinformatics.

Avan Bala is a Solutions Architect at AWS. His area of focus is AI for DevOps and machine learning. He holds a bachelor’s degree in Computer Science with a minor in Mathematics and Statistics from the University of Maryland. Avan is currently working with the Enterprise Engaged East Team and likes to specialize in projects about emerging AI technologies.

Mohhid Kidwai is a Solutions Architect at AWS. His area of focus is generative AI and machine learning solutions for small-medium businesses. He holds a bachelor’s degree in Computer Science with a minor in Biological Science from North Carolina State University. Mohhid is currently working with the SMB Engaged East Team at AWS.

Yousuf Athar is a Solutions Architect at AWS specializing in generative AI and AI/ML. With a Bachelor’s degree in Information Technology and a concentration in Cloud Computing, he helps customers integrate advanced generative AI capabilities into their systems, driving innovation and competitive edge. Outside of work, Yousuf loves to travel, watch sports, and play football.

John Liu has 15 years of experience as a product executive and 9 years of experience as a portfolio manager. At AWS, John is a Principal Product Manager for Amazon Bedrock. Previously, he was the Head of Product for AWS Web3 / Blockchain. Prior to AWS, John held various product leadership roles at public blockchain protocols, fintech companies and also spent 9 years as a portfolio manager at various hedge funds.

Rohit Talluri is a Generative AI GTM Specialist at Amazon Web Services (AWS). He is partnering with top generative AI model builders, strategic customers, key AI/ML partners, and AWS Service Teams to enable the next generation of artificial intelligence, machine learning, and accelerated computing on AWS. He was previously an Enterprise Solutions Architect and the Global Solutions Lead for AWS Mergers & Acquisitions Advisory.

Varun Morishetty is a Software Engineer with Amazon SageMaker JumpStart and Bedrock Marketplace. Varun received his Bachelor’s degree in Computer Science from Northeastern University. In his free time, he enjoys cooking, baking and exploring New York City.




Agents as escalators: Real-time AI video monitoring with Amazon Bedrock Agents and video streams


Organizations deploying video monitoring systems face a critical challenge: processing continuous video streams while maintaining accurate situational awareness. Traditional monitoring approaches that use rule-based detection or basic computer vision frequently miss important events or generate excessive false positives, leading to operational inefficiencies and alert fatigue.

In this post, we show how to build a fully deployable solution that processes video streams using OpenCV, uses Amazon Bedrock for contextual scene understanding, and automates responses through Amazon Bedrock Agents. This solution extends the capabilities demonstrated in Automate chatbot for document and data retrieval using Amazon Bedrock Agents and Knowledge Bases, which discussed using Amazon Bedrock Agents for document and data retrieval. In this post, we apply Amazon Bedrock Agents to real-time video analysis and event monitoring.

Benefits of using Amazon Bedrock Agents for video monitoring

The following figure shows example video stream inputs from different monitoring scenarios. With contextual scene understanding, users can search for specific events.

A front door camera will capture many events throughout the day, but some are more interesting than others. Knowing whether a package is being delivered or removed (as in the following package example) limits alerts to urgent events.

Amazon Bedrock is a fully managed service that provides access to high-performing foundation models (FMs) from leading AI companies through a single API. Using Amazon Bedrock, you can build secure, responsible generative AI applications. Amazon Bedrock Agents extends these capabilities by enabling applications to execute multi-step tasks across systems and data sources, making it ideal for complex monitoring scenarios. The solution processes video streams through these key steps:

  1. Extract frames when motion is detected from live video streams or local files.
  2. Analyze context using multimodal FMs.
  3. Make decisions using agent-based logic with configurable responses.
  4. Maintain searchable semantic memory of events.

You can build this intelligent video monitoring system using Amazon Bedrock Agents and Amazon Bedrock Knowledge Bases in an automated solution. The complete code is available in the GitHub repo.

Limitations of current video monitoring systems

Organizations deploying video monitoring systems face a fundamental dilemma. Despite advances in camera technology and storage capabilities, the intelligence layer interpreting video feeds often remains rudimentary. This creates a challenging situation where security teams must make significant trade-offs in their monitoring approach. Current video monitoring solutions typically force organizations to choose between the following:

  • Simple rules that scale but generate excessive false positives
  • Complex rules that require ongoing maintenance and customization
  • Manual monitoring that relies on human attention and doesn’t scale
  • Point solutions that only handle specific scenarios but lack flexibility

These trade-offs create fundamental barriers to effective video monitoring that impact security, safety, and operational efficiency across industries. Based on our work with customers, we’ve identified three critical challenges that emerge from these limitations:

  • Alert fatigue – Traditional motion detection and object recognition systems generate alerts for any detected change or recognized object. Security teams quickly become overwhelmed by the volume of notifications for normal activities. This leads to reduced attention when genuinely critical events occur, diminishing security effectiveness and increasing operational costs from constant human verification of false alarms.
  • Limited contextual understanding – Rule-based systems fundamentally struggle with nuanced scene interpretation. Even sophisticated traditional systems operate with limited understanding of the environments they monitor due to a lack of contextual awareness, because they can’t easily do the following:
    • Distinguish normal from suspicious behavior
    • Understand temporal patterns like recurring weekly events
    • Consider environmental context such as time of day or location
    • Correlate multiple events that might indicate a pattern
  • Lack of semantic memory – Conventional systems lack the ability to build and use knowledge over time. They can’t do the following:
    • Establish baselines of routine versus unusual events
    • Offer natural language search capabilities across historical data
    • Support reasoning about emerging patterns

Without these capabilities, you can’t gain cumulative benefits from your monitoring infrastructure or perform sophisticated retrospective analysis. To address these challenges effectively, you need a fundamentally different approach. By combining the contextual understanding capabilities of FMs with a structured framework for event classification and response, you can build more intelligent monitoring systems. Amazon Bedrock Agents provides the ideal platform for this next-generation approach.

Solution overview

You can address these monitoring challenges by building a video monitoring solution with Amazon Bedrock Agents. The system intelligently screens events, filters routine activity, and escalates situations requiring human attention, helping reduce alert fatigue while improving detection accuracy. The solution uses Amazon Bedrock Agents to analyze detected motion from video, and alerts users when an event of interest happens according to the provided instructions. This allows the system to intelligently filter out trivial events that can trigger motion detection, such as wind or birds, and direct the user’s attention only to events of interest. The following diagram illustrates the solution architecture.

The solution uses three primary components to address the core challenges: agents as escalators, a video processing pipeline, and Amazon Bedrock Agents. We discuss these components in more detail in the following sections.

The solution uses the AWS Cloud Development Kit (AWS CDK) to deploy the solution components. The AWS CDK is an open source software development framework for defining cloud infrastructure as code and provisioning it through AWS CloudFormation.

Agents as escalators

The first component uses Amazon Bedrock Agents to examine detected motion events with the following capabilities:

  • Provides natural language understanding of scenes and activities for contextual interpretation
  • Maintains temporal awareness across frame sequences to understand event progression
  • References historical patterns to distinguish unusual from routine events
  • Applies contextual reasoning about behaviors, considering factors like time of day, location, and action sequences

We implement a graduated response framework that categorizes events by severity and required action (a simplified dispatch sketch follows the list):

  • Level 0: Log only – The system logs normal or expected activities. For example, when a delivery person arrives during business hours or a recognized vehicle enters the driveway, these events are documented for pattern analysis and future reference but require no immediate action. They remain searchable in the event history.
  • Level 1: Human notification – This level handles unusual but non-critical events that warrant human attention. An unrecognized vehicle parked nearby, an unexpected visitor, or unusual movement patterns trigger a notification to security personnel. These events require human verification and assessment.
  • Level 2: Immediate response – Reserved for critical security events. Unauthorized access attempts, detection of smoke or fire, or suspicious behavior trigger automatic response actions through API calls. The system notifies personnel through SMS or email alerts with event information and context.
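To make the escalation logic concrete, here is a simplified dispatch sketch; it is my own illustration rather than code from the GitHub repository, which wires these actions to Amazon SNS and agent action groups:

```
from enum import IntEnum

class Severity(IntEnum):
    LOG_ONLY = 0            # Level 0: expected activity, record for pattern analysis
    NOTIFY_HUMAN = 1        # Level 1: unusual but non-critical, alert personnel
    IMMEDIATE_RESPONSE = 2  # Level 2: critical event, trigger an automated response

def handle_event(severity: Severity, event: dict) -> None:
    # Placeholder actions; the deployed solution logs events and notifies via Amazon SNS.
    print(f"[level {int(severity)}] logging event: {event['summary']}")
    if severity >= Severity.NOTIFY_HUMAN:
        print(f"notifying personnel about: {event['summary']}")
    if severity == Severity.IMMEDIATE_RESPONSE:
        print(f"triggering automated response for: {event['summary']}")

handle_event(Severity.NOTIFY_HUMAN, {"summary": "unrecognized vehicle parked nearby"})
```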

The solution provides an interactive processing and monitoring interface through a Streamlit application. With the Streamlit UI, users can provide instructions and interact with the agent.

The application consists of the following key features:

  • Live stream or video file input – The application accepts M3U8 stream URLs from webcams or security feeds, or local video files in common formats (MP4, AVI). Both are processed using the same motion detection pipeline that saves triggered events to Amazon Simple Storage Service (Amazon S3) for agent analysis.
  • Custom instructions – Users can provide specific monitoring guidance, such as “Alert me about unknown individuals near the loading dock after hours” or “Focus on vehicle activity in the parking area.” These instructions adjust how the agent interprets detected motion events.
  • Notification configuration – Users can specify contact information for different alert levels. The system uses Amazon Simple Notification Service (Amazon SNS) to send emails or text messages based on event severity, so different personnel can be notified for potential issues vs. critical situations.
  • Natural language queries about past events – The interface includes a chat component for historical event retrieval. Users can ask “What vehicles have been in the driveway this week?” or “Show me any suspicious activity from last night,” receiving responses based on the system’s event memory.

Video processing pipeline

The solution uses several AWS services to capture and prepare video data for analysis through a modular processing pipeline. It supports multiple types of video sources, including live M3U8 streams and local video files.

When using streams, OpenCV’s VideoCapture component handles the connection and frame extraction. For testing, we’ve included sample event videos demonstrating different scenarios. The core of the video processing is a modular pipeline implemented in Python. Key components include:

  • SimpleMotionDetection – Identifies movement in the video feed
  • FrameSampling – Captures sequences of frames over time when motion is detected
  • GridAggregator – Organizes multiple frames into a visual grid for context
  • S3Storage – Stores captured frame sequences in Amazon S3

This multi-process framework optimizes performance by running components concurrently and maintaining a queue of frames to process. The video processing pipeline organizes captured frame data in a structured way before passing it to the Amazon Bedrock agent for analysis:

  • Frame sequence storage – When motion is detected, the system captures a sequence of frames over 10 seconds. These frames are stored in Amazon S3 using a timestamp-based path structure (YYYYMMDD-HHMMSS) that allows for efficient retrieval by date and time. When motion exceeds 10 seconds, multiple events are created.
  • Image grid format – Rather than processing individual frames separately, the system arranges multiple sequential frames into a grid format (typically 3×4 or 4×5). This presentation provides temporal context and is sent to the Amazon Bedrock agent for analysis. The grid format enables understanding of how motion progresses over time, which is critical for accurate scene interpretation.

The following figure is an example of an image grid sent to the agent. Package theft is difficult to identify with classic image models. The large language model’s (LLM’s) ability to reason over a sequence of images allows it to make observations about intent.

The video processing pipeline’s output—timestamped frame grids stored in Amazon S3—serves as the input for the Amazon Bedrock agent components, which we discuss in the next section.
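As a rough illustration of the grid idea (not the repository’s actual GridAggregator implementation), sequential frames can be tiled into a single image with NumPy like this:

```
import numpy as np

def build_frame_grid(frames: list, columns: int = 4) -> np.ndarray:
    """Tile same-sized frames row by row into one grid image, padding the last row with black."""
    rows = -(-len(frames) // columns)  # ceiling division
    padded = frames + [np.zeros_like(frames[0])] * (rows * columns - len(frames))
    grid_rows = [np.hstack(padded[r * columns:(r + 1) * columns]) for r in range(rows)]
    return np.vstack(grid_rows)

# Example: 12 dummy 180x320 frames arranged in a 3x4 grid.
dummy_frames = [np.zeros((180, 320, 3), dtype=np.uint8) for _ in range(12)]
print(build_frame_grid(dummy_frames, columns=4).shape)  # (540, 1280, 3)
```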

Amazon Bedrock agent components

The solution integrates multiple Amazon Bedrock services to create an intelligent analysis system:

  • Core agent architecture – The agent orchestrates these key workflows:
    • Receives frame grids from Amazon S3 on motion detection
    • Coordinates multi-step analysis processes
    • Makes classification decisions
    • Triggers appropriate response actions
    • Maintains event context and state
  • Knowledge management – The solution uses Amazon Bedrock Knowledge Bases with Amazon OpenSearch Serverless to:
    • Store and index historical events
    • Build baseline activity patterns
    • Enable natural language querying
    • Track temporal patterns
    • Support contextual analysis
  • Action groups – The agent has access to several actions defined through API schemas:
    • Analyze grid – Process incoming frame grids from Amazon S3
    • Alert – Send notifications through Amazon SNS based on severity
    • Log – Record event details for future reference
    • Search events by date – Retrieve past events based on a date range
    • Look up vehicle (Text-to-SQL) – Query the vehicle database for information

For structured data queries, the system uses the FM’s ability to convert natural language to SQL. This enables the following:

  • Querying Amazon Athena tables containing event records
  • Retrieving information about registered vehicles
  • Generating reports from structured event data

These components work together to create a comprehensive system that can analyze video content, maintain event history, and support both real-time alerting and retrospective analysis through natural language interaction.

Video processing framework

The video processing framework implements a multi-process architecture for handling video streams through composable processing chains.

Modular pipeline architecture

The framework uses a composition-based approach built around the FrameProcessor abstract base class.

Processing components implement a consistent interface with a process(frame) method that takes a Frame and returns a potentially modified Frame, or None when the frame should not continue through the chain:

```
from abc import ABC, abstractmethod
from typing import Optional

class FrameProcessor(ABC):
    # Base class for pipeline components; Frame is defined below.
    @abstractmethod
    def process(self, frame: Frame) -> Optional[Frame]: ...
```

The Frame class encapsulates the image data along with timestamps, indexes, and extensible metadata:

```
from dataclasses import dataclass, field

from numpy import ndarray

@dataclass
class Frame:
    buffer: ndarray  # OpenCV image array
    timestamp: float
    index: float
    fps: float
    metadata: dict = field(default_factory=dict)
```

Customizable processing chains

The architecture supports configuring multiple processing chains that can be connected in sequence. The solution uses two primary chains. The detection and analysis chain processes incoming video frames to identify events of interest:

```
chain = FrameProcessorChain([
    SimpleMotionDetection(motion_threshold=10_000, frame_skip_size=1),
    FrameSampling(timedelta(milliseconds=250), threshold_time=timedelta(seconds=2)),
    GridAggregator(shape=(13, 3))
])
```

The storage and notification chain handles the storage of identified events and invocation of the agent:

```
storage_chain = FrameProcessorChain([
    S3Storage(bucket_name=TARGET_S3_BUCKET, prefix=S3_PREFIX, s3_client_provider=s3_client_provider),
    LambdaProcessor(get_response=get_response, monitoring_instructions=config.monitoring_instructions)
])
```

You can modify these chains independently to add or replace components based on specific monitoring requirements.

Component implementation

The solution includes several processing components that demonstrate the framework’s capabilities. You can modify each processing step or add new ones. For example, motion detection uses a simple pixel difference, but you can refine it as needed or follow the same pattern to implement other detection algorithms, such as object detection or scene segmentation.
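The repository’s SimpleMotionDetection is the reference implementation; the following sketch only illustrates how a pixel-difference detector plugs into the FrameProcessor interface (the class name, thresholds, and metadata key are assumptions, not taken from the code):

```
from typing import Optional

import cv2
import numpy as np

# Frame and FrameProcessor are the classes shown earlier in this post.

class PixelDiffMotionDetection(FrameProcessor):
    """Flags motion when enough pixels change between consecutive frames."""

    def __init__(self, motion_threshold: int = 10_000):
        self.motion_threshold = motion_threshold
        self._previous: Optional[np.ndarray] = None

    def process(self, frame: Frame) -> Optional[Frame]:
        gray = cv2.cvtColor(frame.buffer, cv2.COLOR_BGR2GRAY)
        if self._previous is None:
            self._previous = gray
            return None  # nothing to compare against yet

        # Count pixels whose intensity changed noticeably since the last frame.
        diff = cv2.absdiff(gray, self._previous)
        changed_pixels = int(np.count_nonzero(diff > 25))
        self._previous = gray

        if changed_pixels < self.motion_threshold:
            return None  # drop the frame: no motion detected

        frame.metadata["changed_pixels"] = changed_pixels
        return frame
```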

Additional components include the FrameSampling processor to control capture timing, the GridAggregator to create visual frame grids, and storage processors that save event data and trigger agent analysis. Each of these can be customized or replaced as needed (a sketch of one such customization follows this list). For example:

  • Modify existing components – Adjust thresholds or parameters to tune for specific environments
  • Create alternative storage backends – Direct output to different storage services or databases
  • Implement preprocessing and postprocessing steps – Add image enhancement, data filtering, or additional context generation
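As one example of swapping a storage backend, the following sketch writes frame grids to local disk instead of Amazon S3; the class name, directory, and file-naming scheme are illustrative and not part of the repository:

```
from pathlib import Path
from typing import Optional

import cv2

# Frame and FrameProcessor are the classes shown earlier in this post.

class LocalDiskStorage(FrameProcessor):
    """Writes each incoming frame grid to a local directory instead of Amazon S3."""

    def __init__(self, output_dir: str = "./captured_events"):
        self.output_dir = Path(output_dir)
        self.output_dir.mkdir(parents=True, exist_ok=True)

    def process(self, frame: Frame) -> Optional[Frame]:
        # Name the file after the frame's capture timestamp and index.
        filename = self.output_dir / f"{frame.timestamp:.3f}_{int(frame.index)}.jpg"
        cv2.imwrite(str(filename), frame.buffer)
        frame.metadata["stored_path"] = str(filename)
        return frame  # pass the frame along so downstream processors still run
```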

Finally, the LambdaProcessor serves as the bridge to the Amazon Bedrock agent by invoking an AWS Lambda function that sends the event information in a request to the deployed agent. From there, the Amazon Bedrock agent takes over, analyzes the event, and takes action accordingly.

Agent implementation

After you deploy the solution, an Amazon Bedrock agent alias becomes available. This agent functions as an intelligent analysis layer, processing captured video events and executing appropriate actions based on its analysis. You can test the agent and view its reasoning trace directly on the Amazon Bedrock console, as shown in the following screenshot.

This agent will lack some of the metadata supplied by the Streamlit application (such as current time) and might not give the same answers as the full application.

Invocation flow

The agent is invoked through a Lambda function that handles the request-response cycle and manages session state. The function finds the highest published agent version, uses it to invoke the agent, and parses the response.
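The Lambda function in the repository is the authoritative implementation; the following boto3 sketch only illustrates the general invoke-and-parse pattern it describes, with the agent ID and alias ID as placeholders and the version lookup omitted:

```
import uuid

import boto3

# Placeholder identifiers: the deployed stack exposes the real agent ID and alias.
AGENT_ID = "AGENT_ID_PLACEHOLDER"
AGENT_ALIAS_ID = "ALIAS_ID_PLACEHOLDER"

bedrock_agent_runtime = boto3.client("bedrock-agent-runtime")

def invoke_agent(prompt: str) -> str:
    """Send a prompt to the Amazon Bedrock agent and collect the streamed reply."""
    response = bedrock_agent_runtime.invoke_agent(
        agentId=AGENT_ID,
        agentAliasId=AGENT_ALIAS_ID,
        sessionId=str(uuid.uuid4()),  # one session per event keeps context isolated
        inputText=prompt,
    )
    # The completion arrives as an event stream of chunks.
    return "".join(
        event["chunk"]["bytes"].decode("utf-8")
        for event in response["completion"]
        if "chunk" in event
    )
```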

Action groups

The agent’s capabilities are defined through action groups implemented using the BedrockAgentResolver framework. This approach automatically generates the OpenAPI schema required by the agent.

When the agent is invoked, it receives an event object that includes the API path and other parameters, which the resolver uses to route the request to the appropriate handler. You can add new actions by defining additional endpoint handlers that follow the same pattern and generating a new OpenAPI schema:

```
if __name__ == "__main__":
    print(app.get_openapi_json_schema())
```
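For orientation, a minimal action group handler built with the Powertools for AWS Lambda BedrockAgentResolver might look like the following sketch; the /current_time path and its logic are illustrative rather than copied from the repository:

```
from time import time

from aws_lambda_powertools.event_handler import BedrockAgentResolver
from aws_lambda_powertools.utilities.typing import LambdaContext

app = BedrockAgentResolver()

@app.get("/current_time", description="Returns the current Unix time in seconds")
def current_time() -> int:
    # Illustrative action: gives the agent access to the current time.
    return int(time())

def lambda_handler(event: dict, context: LambdaContext) -> dict:
    # Routes the agent's request to the matching handler based on the API path.
    return app.resolve(event, context)

if __name__ == "__main__":
    # Regenerate the OpenAPI schema consumed by the agent's action group.
    print(app.get_openapi_json_schema())
```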

Text-to-SQL integration

Through its action group, the agent is able to translate natural language queries into SQL for structured data analysis. The system reads data from assets/data_query_data_source, which can include various formats like CSV, JSON, ORC, or Parquet.

This capability enables users to query structured data using natural language. As demonstrated in the following example, the system translates natural language queries about vehicles into SQL, returning structured information from the database.

The database connection is configured through a SQLAlchemy engine. Users can connect to existing databases by updating the create_sql_engine() function to use their connection parameters.
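As a rough illustration, a create_sql_engine() variant that points SQLAlchemy at Athena through the PyAthena dialect could look like the following; the Region, staging bucket, and database name are placeholders for the values created by your deployment:

```
from urllib.parse import quote_plus

from sqlalchemy import create_engine
from sqlalchemy.engine import Engine

def create_sql_engine() -> Engine:
    # Requires the PyAthena package: pip install "PyAthena[SQLAlchemy]"
    # Placeholders: swap in the Region, results bucket, and database from your deployment.
    region = "us-west-2"
    s3_staging_dir = quote_plus("s3://my-athena-results-bucket/query-results/")
    database = "video_monitoring"
    return create_engine(
        f"awsathena+rest://@athena.{region}.amazonaws.com:443/"
        f"{database}?s3_staging_dir={s3_staging_dir}"
    )
```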

Event memory and semantic search

The agent maintains a detailed memory of past events, storing event logs with rich descriptions in Amazon S3. These events become searchable through both vector-based semantic search and date-based filtering. As shown in the following example, temporal queries make it possible to retrieve information about events within specific time periods, such as vehicles observed in the past 72 hours.

The system’s semantic memory capabilities enable queries based on abstract concepts and natural language descriptions. As shown in the following example, the agent can understand abstract concepts like “funny” and retrieve relevant events, such as a person dropping a birthday cake.

Events can be linked together by the agent to identify patterns or related incidents. For example, the system can correlate separate sightings of individuals with similar characteristics. In the following screenshots, the agent connects related incidents by identifying common attributes like clothing items across different events.

This event memory store allows the system to build knowledge over time, providing increasingly valuable insights as it accumulates data. The combination of structured database querying and semantic search across event descriptions creates an agent with a searchable memory of all past events.
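The agent performs these lookups itself, but you can reproduce the semantic search directly against the knowledge base with the Amazon Bedrock Retrieve API; in this sketch the knowledge base ID and example query are placeholders:

```
import boto3

bedrock_agent_runtime = boto3.client("bedrock-agent-runtime")

def search_events(query: str, knowledge_base_id: str = "KB_ID_PLACEHOLDER") -> list:
    """Run a semantic search over stored event descriptions."""
    response = bedrock_agent_runtime.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": 5}
        },
    )
    # Each result carries the matching event text and its source location.
    return [r["content"]["text"] for r in response["retrievalResults"]]

# Example: find events that involved a delivery vehicle.
# print(search_events("delivery trucks in the driveway"))
```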

Prerequisites

Before you deploy the solution, complete the following prerequisites:

  1. Configure AWS credentials using aws configure. Use either the us-west-2 or us-east-1 AWS Region.
  2. Enable access to Anthropic’s Claude 3.x models, or another supported Amazon Bedrock Agents model you want to use.
  3. Make sure you have the following dependencies:

Deploy the solution

The AWS CDK deployment creates the following resources:

  • Storage – S3 buckets for assets and query results
  • Amazon Bedrock resources – Agent and knowledge base
  • Compute – Lambda functions for actions, invocation, and updates
  • Database – Athena database for structured queries, and an AWS Glue crawler for data discovery

Deploy the solution with the following commands:

```
#1. Clone the repository and navigate to folder
git clone https://github.com/aws-samples/sample-video-monitoring-agent.git && cd sample-video-monitoring-agent
#2. Set up environment and install dependencies
python3 -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt
#3. Deploy AWS resources
cdk bootstrap && cdk deploy
#4. Run the streamlit app
cd code/streamlit_app && streamlit run app.py
```

On Windows, replace the second line with the following code:

```
python3 -m venv .venv && .venv\Scripts\activate.bat && pip install -r requirements.txt
```

Clean up

To destroy the resources you created and stop incurring charges, tear down the AWS CDK stack. Assuming the standard CDK workflow used for deployment above, run the following from the repository root:
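```
# Remove the resources created by cdk deploy (you will be prompted to confirm)
cdk destroy
```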

Future enhancements

The current implementation demonstrates the potential of agent-based video monitoring in a home security setting, but the approach extends to many other applications.

Sample use cases

The following examples show how the solution applies to a range of scenarios.

Small business

{ "alert_level": 0, "timestamp": "2024-11-20T15:24:15Z", "reason": "Vehicle arrival in driveway", "description": "Standard vehicle arrival and parking sequence. Vehicles present: Black Nissan Frontier pickup (parked), silver Honda CR-V (arriving), and partial view of blue vehicle in foreground. Area features: Gravel driveway surface, two waste bins (County Waste and recycling), evergreen trees in background. Sequence shows Honda CR-V executing normal parking maneuver: approaches from east, performs standard three-point turn, achieves final position next to pickup truck. Daytime conditions, clear visibility. Vehicle condition: Clean, well-maintained CR-V appears to be 2012-2016 model year, no visible damage or unusual modifications. Movement pattern indicates familiar driver performing routine parking. No suspicious behavior or safety concerns observed. Timestamp indicates standard afternoon arrival time. Waste bins properly positioned and undisturbed during parking maneuver." }

Industrial

{ "alert_level": 2, "timestamp": "2024-11-20T15:24:15Z", "reason": "Warehouse product spill/safety hazard", "description": "Significant product spill incident in warehouse storage aisle. Location: Main warehouse aisle between high-bay racking systems containing boxed inventory. Sequence shows what appears to be liquid or container spill, likely water/beverage products based on blue colored containers visible. Infrastructure: Professional warehouse setup with multi-level blue metal racking, concrete flooring, overhead lighting. Incident progression: Initial frames show clean aisle, followed by product falling/tumbling, resulting in widespread dispersal of items across aisle floor. Hazard assessment: Creates immediate slip/trip hazard, blocks emergency egress path, potential damage to inventory. Area impact: Approximately 15-20 feet of aisle space affected. Facility type appears to be distribution center or storage warehouse. Multiple cardboard boxes visible on surrounding shelves potentially at risk from liquid damage." }

Backyard

{ "alert_level": 1, "timestamp": "2024-11-20T15:24:15Z", "reason": "Wildlife detected on property", "description": "Adult raccoon observed investigating porch/deck area with white railings. Night vision/IR camera provides clear footage of animal. Subject animal characteristics: medium-sized adult raccoon, distinctive facial markings clearly visible, healthy coat condition, normal movement patterns. Sequence shows animal approaching camera (15:42PM), investigating area near railing (15:43-15:44PM), with close facial examination (15:45PM). Final frame shows partial view as animal moves away. Environment: Location appears to be elevated deck/porch with white painted wooden railings and balusters. Lighting conditions: Nighttime, camera operating in infrared/night vision mode providing clear black and white footage. Animal behavior appears to be normal nocturnal exploration, no signs of aggression or disease." }

Home safety

{ "alert_level": 2, "timestamp": "2024-11-20T15:24:15Z", "reason": "Smoke/possible fire detected", "description": "Rapid development of white/grey smoke visible in living room area. Smoke appears to be originating from left side of frame, possibly near electronics/TV area. Room features: red/salmon colored walls, grey couch, illuminated aquarium, table lamps, framed artwork. Sequence shows progressive smoke accumulation over 4-second span (15:42PM – 15:46PM). Notable smoke density increase in upper left corner of frame with potential light diffusion indicating particulate matter in air. Smoke pattern suggests active fire development rather than residual smoke. Blue light from aquarium remains visible throughout sequence providing contrast reference for smoke density." }

Further extensions

In addition, you can extend the FM capabilities using the following methods:

  • Fine-tuning for specific monitoring contexts – Adapting the models to recognize domain-specific objects, behaviors, and scenarios
  • Refined prompts for specific use cases – Creating specialized instructions that optimize the agent’s performance for particular environments like industrial facilities, retail spaces, or residential settings

You can expand the agent’s ability to take action, for example:

  • Direct control of smart home and smart building systems – Integrating with Internet of Things (IoT) device APIs to control lights, locks, or alarm systems
  • Integration with security and safety protocols – Connecting to existing security infrastructure to follow established procedures
  • Automated response workflows – Creating multi-step action sequences that can be triggered by specific events

You can also consider enhancing the event memory system:

  • Long-term pattern recognition – Identifying recurring patterns over extended time periods
  • Cross-camera correlation – Linking observations from multiple cameras to track movement through a space
  • Anomaly detection based on historical patterns – Automatically identifying deviations from established baselines

Lastly, consider extending the monitoring capabilities beyond fixed cameras:

  • Monitoring for robotic vision systems – Applying the same intelligence to mobile robots that patrol or inspect areas
  • Drone-based surveillance – Processing aerial footage for comprehensive site monitoring
  • Mobile security applications – Extending the platform to process feeds from security personnel body cameras or mobile devices

These enhancements can transform the system from a passive monitoring tool into an active participant in security operations, with increasingly sophisticated understanding of normal patterns and anomalous events.

Conclusion

The approach of using agents as escalators represents a significant advancement in video monitoring, combining the contextual understanding of FMs with the action-oriented framework of Amazon Bedrock Agents. By filtering the signal from the noise, this solution addresses the critical problem of alert fatigue while enhancing security and safety monitoring capabilities. With this solution, you can:

  • Reduce false positives while maintaining high detection sensitivity
  • Provide human-readable descriptions and classifications of events
  • Maintain searchable records of all activity
  • Scale monitoring capabilities without proportional human resources

The combination of intelligent screening, graduated responses, and semantic memory enables a more effective and efficient monitoring system that enhances human capabilities rather than replacing them. Try the solution today and experience how Amazon Bedrock Agents can transform your video monitoring capabilities from simple motion detection to intelligent scene understanding.


About the authors

Kiowa Jackson is a Senior Machine Learning Engineer at AWS ProServe, specializing in computer vision and agentic systems for industrial applications. His work bridges classical machine learning approaches with generative AI to enhance industrial automation capabilities. His past work includes collaborations with Amazon Robotics, NFL, and Koch Georgia Pacific.

Piotr Chotkowski is a Senior Cloud Application Architect at AWS Generative AI Innovation Center. He has experience in hands-on software engineering as well as software architecture design. In his role at AWS, he helps customers design and build production grade generative AI applications in the cloud.



