the shift from prompt engineering to context engineering
A few months ago I was using a single AI agent to generate project plans for all of my projects. The same agent. The same model. A standard prompt I had written that was supposed to produce a consistent plan no matter what project I fed it. The output was nothing like consistent. Every plan came out in a different format. Different sections. Different sequencing. Different information surfaced for one project and missing for the next. None of them matched the structure I had asked for. I would change one detail in the prompt and the next plan would come out organized completely differently from the previous five.
The agent was doing what I asked. It was not doing what I needed.
The version of AI use most people know is this version. One model. One prompt at a time. The work you get out is shaped by how well you wrote the prompt, the model's specific behavior that day, what training distribution it last sampled from, and a handful of other things you cannot see. The craft, as the conversation went last year, was prompt engineering. Write the perfect prompt and you get the perfect output.
I do not believe that anymore. The craft is somewhere else.
When I stopped using one general agent and started building specialists, the project plans came out the same way every time. The specialist's only job is to generate a project plan. It has a defined input shape, a defined output shape, and a defined behavior between them. It has access to a specific set of rules about what a project plan in my system looks like. It has access to memory about how prior project plans were structured. It has examples. It has constraints. The prompt I send it is almost incidental. What does the work is everything around the prompt.
This is the shift that is happening across the AI industry right now, and the people building serious agent systems are talking about it directly. It is called context engineering. The craft is no longer about writing the perfect prompt. The craft is about constructing the right context for the agent to operate inside. The right rules. The right examples. The right tools. The right memory. The right scope. When the context is correct, the prompt almost does not matter.
The reason this shift matters more than it sounds is that it changes what an agent system actually is. A general agent with a clever prompt is not a system. It is a query interface. You ask, it answers, the answer varies, you cannot test it, you cannot rely on it, you cannot build on top of it. A specialist with a constrained scope and a well-engineered context bundle is something you can build a system out of. The output is repeatable. The behavior is testable. The work it produces clears a defined bar or it does not.
The principle underneath both of those is a constraint principle. The way you make an agent reliable is by giving it less to do, not more. You narrow its scope. You define its inputs. You define its outputs. You give it the context it needs and nothing else. The freedom you take away is the reliability you get back.
This goes against the way most people are encouraged to use AI right now. The marketing is all about how the model can do everything. The advice is all about how to prompt it better so it does more for you. Both of those are pointing at the wrong thing if you are trying to build something that has to work the same way twice.
I run an agent system because I tried the one-agent version and it did not produce the reliability I needed to actually run a company on top of it. Every agent in the system has a narrow scope, a defined output, a context bundle that gets loaded when it is invoked, and a place in the pipeline where another specialist verifies its work. The system works because of those constraints, not despite them.
Prompt engineering was the right framing for a moment when most people were typing into a chat window. Context engineering is the right framing for a moment when people are building agents that have to do work no one is going to inspect prompt-by-prompt. The shift is real. The people in my world who are not making it yet are going to find their generalist agents producing the same kind of output mine were producing a few months ago. Wildly different every time. Beautiful in pieces. Useless in aggregate.
Get new posts by email
Friday digest, no filler. Drop your email below and I'll send what I publish.