What Is Context Engineering?
Discover how to curate and manage the full information environment an AI model sees during inference, optimizing the quality of the context window so that multi-turn and agentic systems produce reliable, coherent responses. Understand key concepts such as context rot and attention budgets, and learn strategies for maintaining a high signal-to-noise ratio, including compaction, structured note-taking, and multi-agent architectures.
A developer builds a customer support agent. The system prompt is clean, specific, and well-structured. In the first few turns, the agent handles user requests with impressive precision. Then, about a dozen messages in, something shifts. It starts ignoring constraints it followed perfectly earlier. It repeats information the user already confirmed. It loses coherence entirely.
The prompt did not change. The model did not change. What changed was the surrounding information environment the model was operating within, and nobody was managing it.
This is the problem that context engineering exists to solve. As AI systems grow more capable and take on longer, more complex tasks, the quality of their outputs depends less on finding the perfect phrasing for a single instruction and more on controlling the full information environment the model operates within. Understanding this discipline is essential for anyone building reliable applications with large language models.
Defining context engineering
To define context engineering precisely, we first need to understand what context means in the technical sense. When an LLM generates a response, it has no persistent memory across sessions and no awareness of anything outside the current interaction. The only information available to it at any moment is the complete set of tokens passed to it during that inference call. That set of tokens is its context. It includes the system prompt, the conversation history, any data retrieved from external sources, tool outputs, and examples provided by the engineer.
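To make this concrete, here is a minimal sketch of how those pieces come together into the single set of tokens the model sees. The function name, message shapes, and roles are illustrative assumptions, not any particular provider's API:

```python
# Illustrative sketch: "context" is everything assembled into one
# inference call. Names and message format here are hypothetical.

def build_context(system_prompt, history, retrieved_docs, tool_outputs, examples):
    """Assemble every piece of information the model will see this call."""
    messages = [{"role": "system", "content": system_prompt}]
    # Few-shot examples provided by the engineer
    for ex in examples:
        messages.append({"role": "user", "content": ex["input"]})
        messages.append({"role": "assistant", "content": ex["output"]})
    # Prior conversation turns
    messages.extend(history)
    # Data retrieved from external sources (e.g. a knowledge base)
    for doc in retrieved_docs:
        messages.append({"role": "user", "content": f"[retrieved] {doc}"})
    # Outputs from tool calls made earlier in the task
    for out in tool_outputs:
        messages.append({"role": "tool", "content": out})
    return messages

context = build_context(
    system_prompt="You are a support agent.",
    history=[{"role": "user", "content": "My order is late."}],
    retrieved_docs=["Order #123 shipped Monday."],
    tool_outputs=["status: in_transit"],
    examples=[{"input": "Hi", "output": "Hello! How can I help?"}],
)
```

Everything in that returned list, and nothing outside it, is what the model reasons over for that one call.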
Context engineering is the discipline of deliberately curating, structuring, and managing that set of tokens to maximize the likelihood of a desired model output. Andrej Karpathy, a leading voice in applied AI, described it as “the art and science of filling the context window.” That framing captures something important: context engineering is both a technical practice and a craft that requires judgment.
Where prompt engineering focuses on how to write instructions, context engineering asks a broader question: given everything the model could potentially see, what is the optimal subset of information to place in front of it, in what form, and at what moment? ...
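One way to picture that selection problem is as a budgeted curation step: rank candidate pieces of information by relevance and admit them until the token budget is spent. The sketch below assumes a crude character-based token estimate and a precomputed relevance score; both are simplified stand-ins, not a prescribed method:

```python
# Hedged sketch: choosing the subset of candidate information that
# fits the context window. Greedy selection by relevance score under
# a token budget; scoring and token counting are simplified stand-ins.

def estimate_tokens(text: str) -> int:
    # Rough heuristic: roughly 4 characters per token
    return max(1, len(text) // 4)

def select_context(candidates, budget_tokens):
    """candidates: list of (relevance_score, text). Returns chosen texts."""
    chosen, used = [], 0
    for score, text in sorted(candidates, key=lambda c: c[0], reverse=True):
        cost = estimate_tokens(text)
        if used + cost <= budget_tokens:
            chosen.append(text)
            used += cost
    return chosen

items = [
    (0.9, "User confirmed shipping address yesterday."),
    (0.2, "Unrelated marketing copy about a sale."),
    (0.7, "Order #123 is currently in transit."),
]
picked = select_context(items, budget_tokens=20)
# The low-relevance marketing text no longer fits the budget.
```

Real systems replace the greedy score with retrieval rankings, recency weighting, or summarization, but the underlying question is the same: which tokens earn their place in the window.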