Jobs & Careers

A Gentle Introduction to Context Engineering in LLMs

Published

4 days ago

August 7, 2025

Kanwal Mehreen

Image by Author | Canva

# Introduction

There is no doubt that large language models can do amazing things. But apart from their internal knowledge base, they heavily depend on the information (the context) you feed them. Context engineering is all about carefully designing that information so the model can succeed. This idea gained popularity when engineers realized that simply writing clever prompts is not enough for complex applications. If the model doesn’t know a fact that’s needed, it can’t guess it. So, we need to assemble every piece of relevant information so the model can truly understand the task at hand.

Part of the reason the term ‘context engineering’ gained attention was due to a widely shared tweet by Andrej Karpathy, who said:

+1 for ‘context engineering’ over ‘prompt engineering’. People associate prompts with short task descriptions you would give an LLM in your day-to-day use, whereas in every industrial-strength LLM app, context engineering is the delicate art and science of filling the context window with just the right information for the next step…

This article is going to be a bit theoretical, and I will try to keep things as simple and crisp as I can.

# What Is Context Engineering?

If I received a request that said, ‘Hey Kanwal, can you write an article about how LLMs work?’, that’s an instruction. I would write what I find suitable and would probably aim it at an audience with a medium level of expertise. Now, if my audience were beginners, they would hardly understand what’s happening. If they were experts, they might consider it too basic or out of context. I also need a set of instructions like audience expertise, article length, theoretical or practical focus, and writing style to write a piece that resonates with them.

Likewise, context engineering means giving the LLM everything from user preferences and example prompts to retrieved facts and tool outputs, so it fully understands the goal.

Here’s a visual that I created of the things that might go into the LLM’s context:

Context engineering includes instructions, user profile, history, tools, retrieved docs, and more | Image by Author

Each of these elements can be viewed as part of the context window of the model. Context engineering is the practice of deciding which of these to include, in what form, and in what order.

# How Is Context Engineering Different From Prompt Engineering?

I will not make this unnecessarily long. I hope you have grasped the idea so far. But for those who didn’t, let me put it briefly. Prompt engineering traditionally focuses on writing a single, self-contained prompt (the immediate question or instruction) to get a good answer. In contrast, context engineering is about the entire input environment around the LLM. If prompt engineering is ‘what do I ask the model?’, then context engineering is ‘what do I show the model, and how do I manage that content so it can do the task?’

# How Context Engineering Works

Context engineering works through a pipeline of three tightly connected components, each designed to help the model make better decisions by seeing the right information at the right time. Let’s take a look at the role of each of these:

// 1. Context Retrieval and Generation

In this step, all the relevant information is pulled in or generated to help the model understand the task better. This can include past messages, user instructions, external documents, API results, or even structured data. You might retrieve a company policy document for answering an HR query or generate a well-structured prompt using the CLEAR framework (Concise, Logical, Explicit, Adaptable, Reflective) for more effective reasoning.

// 2. Context Processing

This is where all the raw information is optimized for the model. This step includes long-context techniques like position interpolation or memory-efficient attention (e.g., grouped-query attention and models like Mamba), which help models handle ultra-long inputs. It also includes self-refinement, where the model is prompted to reflect and improve its own output iteratively. Some recent frameworks even allow models to generate their own feedback, judge their performance, and evolve autonomously by teaching themselves with examples they create and filter.

// 3. Context Management

This component handles how information is stored, updated, and used across interactions. This is especially important in applications like customer support or agents that operate over time. Techniques like long-term memory modules, memory compression, rolling buffer caches, and modular retrieval systems make it possible to maintain context across multiple sessions without overwhelming the model. It is not just about what context you put in but also about how you keep it efficient, relevant, and up-to-date.

# Challenges and Mitigations in Context Engineering

Designing the perfect context isn’t just about adding more data, but about balance, structure, and constraints. Let’s look at some of the key challenges you might encounter and their potential solutions:

Irrelevant or Noisy Context (Context Distraction): Feeding the model too much irrelevant information can confuse it. Use priority-based context assembly, relevance scoring, and retrieval filters to pull only the most useful chunks.
Latency and Resource Costs: Long, complex contexts increase compute time and memory use. Truncate irrelevant history or offload computation to retrieval systems or lightweight modules.
Tool and Knowledge Integration (Context Clash): When merging tool outputs or external data, conflicts can occur. Add schema instructions or meta-tags (like @tool_output) to avoid format issues. For source clashes, try attribution or let the model express uncertainty.
Maintaining Coherence Over Multiple Turns: In multi-turn conversations, models may hallucinate or lose track of facts. Track key information and selectively reintroduce it when needed.

Two other important issues: context poisoning and context confusion have been well explained by Drew Breunig, and I encourage you to check that out.

# Wrapping Up

Context engineering is no longer an optional skill. It is the backbone of how we make language models not just respond, but understand. In many ways, it is invisible to the end user, but it defines how useful and intelligent the output feels. This was meant to be a gentle introduction to what it is and how it works.

If you are interested in exploring further, here are two solid resources to go deeper:

### Items for Human Review:
* **Andrej Karpathy Tweet**: The article quotes a “widely shared tweet by Andrej Karpathy.” For credibility and reader convenience, it would be best to find the original tweet and link to it directly. The quoted text should also be checked against the original for accuracy.
* **External Links**: The article links to an article by Drew Breunig, an arXiv paper, and a deepwiki page. A human editor should verify these links are active, reputable, and point to the intended content before publication. The arXiv paper ID (2507.13334) appears to be a placeholder for a future publication and will need to be confirmed.

Kanwal Mehreen is a machine learning engineer and a technical writer with a profound passion for data science and the intersection of AI with medicine. She co-authored the ebook “Maximizing Productivity with ChatGPT”. As a Google Generation Scholar 2022 for APAC, she champions diversity and academic excellence. She’s also recognized as a Teradata Diversity in Tech Scholar, Mitacs Globalink Research Scholar, and Harvard WeCode Scholar. Kanwal is an ardent advocate for change, having founded FEMCodes to empower women in STEM fields.

Source link

Jobs & Careers

NVIDIA & AMD to Now Pay 15% of China Sale Revenue to the US

Published

31 minutes ago

August 11, 2025

Sanjana Gupta

US chipmakers NVIDIA and AMD recently received permission to sell their H20 and MI308 AI chips to China, respectively, amid tightened export controls. According to the latest reports, companies will now allocate 15% of their revenue from sales of these advanced chips to the US government.

The agreement follows the Donald Trump administration’s decision in April to halt H20 chip sales to China. By early 2025, Chinese firms reported severe shortages, and NVIDIA had warned of a potential $5.5 billion hit to its bottom line. Notably, last month, the administration allowed both NVIDIA and AMD to resume sales after obtaining export licences.

NVIDIA had also announced the RTX PRO GPU, a China-specific chip engineered to comply with US regulations. The RTX PRO joins the H20 and other variants, designed to maintain NVIDIA’s presence in China while adhering to legal boundaries.

China is a key market for both companies. In a visit to Beijing in April, NVIDIA CEO Jensen Huang said China was a critical market for NVIDIA and added, “We hope to continue to cooperate with China.”

Moreover, at the opening ceremony of the third China International Supply Chain Expo in Beijing this year, Huang called Chinese AI models “world-class”. This included DeepSeek, the one model that was ‘banned’ on multiple US government devices like those of NASA and the Navy.

In the same ceremony, he asserted that he has excellent relations with “just about’ every government”. “Anyone who discounts Huawei and China’s manufacturing capability is deeply naive. This is a formidable company, and I’ve seen the technologies they’ve created in the past,” he added.

Moreover, US commerce secretary Howard Lutnick said last month that the resumption of AI chip sales was part of negotiations with China to secure rare earths. He described the H20 as NVIDIA’s “fourth-best chip” and said it was in the US’ interests for Chinese firms to use American technology, even if the most advanced chips remained restricted.

The post NVIDIA & AMD to Now Pay 15% of China Sale Revenue to the US appeared first on Analytics India Magazine.

Source link

Jobs & Careers

Graas.ai Secures $9 Million to Expand AI Agent Foundry in India

Published

42 minutes ago

August 11, 2025

Merin Susan John

Email:
info@aimmediahouse.com

Our Offices

AIM India
1st Floor, Sakti Statesman, Marathahalli – Sarjapur Outer Ring Rd, Green Glen Layout, Bellandur, Bengaluru, Karnataka 560103

AIM Americas
166 Geary St STE 1500 Suite #634, San Francisco, California 94108, United States

Source link

Jobs & Careers

Agentic AI Hands-On in Python: A Video Tutorial

Published

1 hour ago

August 11, 2025

Matthew Mayo

Agentic AI Hands-On in Python: A Video Tutorial

Image by Editor | ChatGPT

# Introduction

Sometimes it feels like agentic AI is just AI that’s taken an improv class and now won’t stop making its own decisions. Trying to more accurately define agentic AI can feel like explaining jazz to someone who’s never heard music. It’s part autonomy, part orchestration, and 100% guaranteed to make you question who’s actually running the show.

Well, there’s no need to be confused by agentic AI any longer. This video, recently recorded from an ODSC talk and made broadly available by its creators, is a comprehensive four-hour workshop on agentic AI engineering, hosted by Jon Krohn of the Jon Krohn YouTube channel and Super Data Science podcast, and Edward Donner, co-founder and CTO of Nebula.

The video dives into the definition, design principles, and development of AI agents, emphasizing the unprecedented opportunity to derive business value from AI applications using agentic workflows in 2025 and beyond. It covers a range of frameworks and practical applications, showcasing how large language model (LLM) outputs can control complex workflows and achieve autonomy in tasks. The instructors highlight the rapid advancements in LLM capabilities and the potential for agentic systems to augment or fully automate business processes.

The workshop emphasizes the hands-on nature of the content, with an accompanying GitHub repository with all the code for viewers to replicate and experiment with. The instructors frequently stress the rapid evolution of the field and the importance of starting small with agentic projects to ensure success.

# What’s Covered?

Here are the more specific topics covered in the video:

Defining Agents: The video defines AI agents as programs where LLM outputs control complex workflows, emphasizing autonomy and distinguishing between simpler predefined workflows and dynamic agents proper.
The Case for Agentic AI: It highlights the unprecedented opportunity in 2025 to derive business value from agentic workflows, noting the rapid improvement of LLMs and their dramatic impact on benchmarks like Humanity’s Last Exam (HLE) when used within agentic frameworks.
Foundational Elements: Core concepts such as tools (enabling LLMs to perform actions) are explained, alongside inherent risks like unpredictability and cost, and strategies for monitoring and guardrails to mitigate them.
Implications of Agentic AI: The workshop also addresses the implications of Agentic AI, including workforce changes and strategies for future-proofing careers in data science, emphasizing skills like multi-agent orchestration and foundational knowledge.

Agentic AI frameworks, the tools of the agentic revolution, covered include:

Model Context Protocol (MCP): an open-source standard protocol for connecting agents with data sources and tools, often likened to a ‘USBC for agentic applications’
OpenAI Agents SDK: a lightweight, simple, and flexible framework, used for deep research
CrewAI: a heavier-weight framework specifically designed for multi-agent systems
More complex frameworks like LangGraph and Microsoft Autogen are also mentioned

Finally, the hands-on coding exercises in the video include:

Practical demonstrations include recreating OpenAI’s Deep Research functionality using the OpenAI Agents SDK, showcasing how agents can perform web searches and generate reports
Discussions on design principles for agentic systems cover five workflow design patterns: Prompt Chaining, Routing, Parallelization, Orchestrator Worker, and Evaluator Optimizer
Building an autonomous software engineering team with CrewAI is demonstrated, where agents collaborate to write and test Python code and even generate a user interface, highlighting CrewAI’s ‘batteries included’ features for safe code execution
The final project involves developing autonomous traders using MCP, demonstrating how agents can access real-time market data, leverage persistent knowledge graphs, and perform web searches to make simulated trading decisions

# Expected Takeaways

After watching this video, viewers will be able to:

Grasp the fundamental concepts of AI agents, including their definition, core components like tools and autonomy, and the distinction between constrained workflows and dynamic agent systems.
Implement agentic systems using popular frameworks such as those from OpenAI and CrewAI, gaining hands-on experience in setting up multi-agent collaborations and leveraging their unique features, like structured outputs or automated code execution.
Understand and apply the Model Context Protocol (MCP) for seamless integration of diverse tools and resources into agentic applications, including the ability to create simple custom MCP servers.
Develop practical agentic applications, as demonstrated by the recreation of deep research functionality and the construction of an autonomous software engineering team and simulated trading agents.
Recognize and mitigate risks associated with deploying agentic systems, such as unpredictability and cost management, through monitoring and guardrails.

If you’re looking for a resource to straighten out agentic AI for you and show you how you can leverage the burgeoning technology in your AI engineering exploits for this year and beyond, check out this great video by Jon Krohn and Edward Donner.

Matthew Mayo (@mattmayo13) holds a master’s degree in computer science and a graduate diploma in data mining. As managing editor of KDnuggets & Statology, and contributing editor at Machine Learning Mastery, Matthew aims to make complex data science concepts accessible. His professional interests include natural language processing, language models, machine learning algorithms, and exploring emerging AI. He is driven by a mission to democratize knowledge in the data science community. Matthew has been coding since he was 6 years old.

Source link