I asked AI to fix 120 pages. The false positives were on me.

A 120-page document came back in five minutes, and the false positives were on me, not the model.

A 120-page document landed on my desk. Dense. Normally two to three days of work alongside my regular load. I handed it to AI. Five minutes. There were a few false positives. I read every flagged item myself, caught the bad ones, and wrapped the whole task in a few hours instead of days.

The hours I won were real. The flagged items I had to throw out were real too. I asked the AI to do a thing, and it did the thing. It did the wrong version of it. It flagged choices that were the customer's house style, not mistakes. It missed the rule book I had on my own laptop and never gave the model. The bill for those false positives was on me.

If you have ever asked AI for a clean answer and gotten a fluent-but-wrong one, you have already met the difference.

Here is the question the 120 pages raised, and the question that gets typed into Google every day. What is the difference between prompt engineering and context engineering, and why is everyone using the two phrases interchangeably? It is a fair question. The phrases sound like the same thing said twice. They are not. Context engineering vs prompt engineering is a real seam, and it is the seam that decides whether a five-minute turnaround comes back usable or comes back as homework.

I will give you the diagnostic next, the fix after that, and a small utility I built at the end.

Treat the AI like a new coworker on day one, not a search bar.

Picture a smart new joiner on their first morning. No badge yet. No client list. Has not seen a single SOP. Walk up to that person and say "summarise this transcript" and you will get a summary. It will be the summary anyone could write. The names will be flat, and the line your customer keeps repeating will not be the line the new joiner picks up.

Sit them down for two minutes first. Who you are, which interview, what the customer already said no to, what you intend to do with the output. The same person hands back something usable. That two-minute sit-down is the whole shape of the second discipline. The prompt is the verb you hand them. The context is everything you set up before the verb.

The two phrases mean different things, and the people who built the models say so out loud.

Anthropic's engineering write-up draws the line cleanly: "the primary focus of prompt engineering is how to write effective prompts, particularly system prompts." Read that twice. Prompt engineering lives on the task side of the boundary. The verb is "instruct." You sit at the cursor and write a better task description, a better system prompt, a better instruction. You are sharpening the ask.

In the same engineering blog, the team names the next move: "At Anthropic, we view context engineering as the natural progression of prompt engineering." The discipline grew because writing a clean prompt stopped being the bottleneck. Supplying the right surrounding tokens became the bottleneck instead. The prompt did not get worse. The job got bigger. Now you are not just writing the verb, you are also choosing the role the model is playing, the references it is allowed to consult, the rules it is supposed to follow, and the things it is supposed to leave alone.

The same write-up gives the test for whether the surrounding tokens are doing their job: "good context engineering means finding the smallest possible set of high-signal tokens that maximize the likelihood of some desired outcome." That is the line that explains why this saves tokens, time, and rework. You are not pasting your entire SharePoint into the context window. You pick the smallest set that does the job. Everything else is noise to the model's attention.
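One rough way to picture the smallest-set idea is a filter that keeps only the reference snippets that overlap with the task and fit a fixed budget. The sketch below is a toy under stated assumptions: `pick_context`, the keyword-overlap score, and the word budget are all hypothetical names and choices made for illustration, and none of it comes from Anthropic's write-up. Real relevance ranking would need something better than shared words.

```python
import re


def words(text: str) -> set[str]:
    """Lowercased word set, used here as a crude relevance signal."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def pick_context(task: str, snippets: list[str], budget_words: int) -> list[str]:
    """Greedily keep the snippets sharing the most words with the task,
    skipping any snippet that would push the total past the word budget."""
    task_words = words(task)
    scored = [(len(task_words & words(s)), s) for s in snippets]
    # Highest overlap first; snippets with no overlap are dropped outright.
    ranked = sorted((p for p in scored if p[0] > 0), key=lambda p: p[0], reverse=True)
    chosen: list[str] = []
    used = 0
    for _, snippet in ranked:
        cost = len(snippet.split())
        if used + cost <= budget_words:
            chosen.append(snippet)
            used += cost
    return chosen


task = "Flag terminology violations in the user guide"
snippets = [
    "Terminology guide for the user guide: write sign in, never log in.",
    "Office party is on Friday at five.",
    "Heading rule: sentence case.",
]
print(pick_context(task, snippets, budget_words=20))
```

The point of the budget is the discipline, not the number: anything that does not earn its place against the task stays out of the window.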

That is what was wrong with my 120-page run. The false positives were not a model failure. They were a context failure. The model was not told what counts as a problem and what counts as house style.

The strongest objection is that this whole vocabulary will be obsolete in two years.

The strongest counter is the HBR essay by Oguz Acar, who argues: "Despite the buzz surrounding it, the prominence of prompt engineering may be fleeting." I take that seriously. Half of it is right. As models get better at inferring intent from sparser instructions, a lot of today's prompt engineering tricks will fade. Special tokens. Magic phrases. Role-stuffing. Phrases people copy off LinkedIn as if they were spells.

The half that does not fade is the half this post is about. The model can get better at reading you. It cannot guess what it has never seen. Your customer's house style. The two regulated terms you cannot use. The reference document you want it to refer to. The file you want it to ignore. Better models make context engineering more important, not less. The bottleneck moves from prompting craft to what you feed in. That holds especially as AI agents start running tasks across multiple tools.

The fix was not a better prompt, it was a context block I now write before the prompt.

In August 2024, six months after I closed the tab on my first ChatGPT try in February 2024, I came back to it. The second time, I stopped searching and started talking. I began supplying role, background, and both sides of the coin before the task. Same model, same kind of work. The second answer told me which paragraph I needed to rewrite before showing the document to anyone.

I did not change the verb. I changed everything that came before the verb. I told the model who I was, what document was sitting in front of me, what I wanted kept, what I wanted left out, and which reference to use. Only then did I tell it what to do.

Replay the 120-page beat with that shape and the false positives shrink on the page. "You are an experienced reviewer of user guides with 15+ years of relevant experience. You are reviewing a user guide that I have to share with the customer. Ignore deviations from generic English style. Flag only items that violate the attached terminology guide. Do not flag a heading style I have used three times in the same document. Treat this as critical: any mistake can break customer trust." Same task. Same model. The list that comes back is the list I would have made by hand, on a different day, with three days to do it.
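The ordering described above, context first and the verb last, can be sketched as a small helper. This is a minimal illustration, not a fixed recipe: `build_review_prompt` and its parameter names are hypothetical, and the example values are paraphrased from the review prompt in this post.

```python
def build_review_prompt(role, document_note, flag_only, ignore, stakes, task):
    """Assemble the context lines first and place the task on the last line."""
    lines = [
        f"Role: {role}",
        f"Document: {document_note}",
        "Flag only:",
        *[f"- {rule}" for rule in flag_only],
        "Do not flag:",
        *[f"- {rule}" for rule in ignore],
        f"Stakes: {stakes}",
        "",
        f"Task: {task}",
    ]
    return "\n".join(lines)


prompt = build_review_prompt(
    role="Experienced reviewer of user guides",
    document_note="A user guide I have to share with the customer",
    flag_only=["Items that violate the attached terminology guide"],
    ignore=[
        "Deviations from generic English style",
        "A heading style used three times in the same document",
    ],
    stakes="Any mistake can break customer trust",
    task="Review the document and list the problems you find.",
)
print(prompt)
```

Keeping the task as the final argument makes it hard to write the verb before the setup, which is the habit the helper is meant to enforce.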

Context engineering is not an excuse to write a 600-word prompt for a one-line task.

The failure mode of this discipline is over-engineering. Stuffing the model with role, background, three reference documents, twelve don'ts, and an example block when the task is "rewrite this sentence in active voice." The smallest-set principle works in both directions. If a one-line ask got the right answer the first time, the one-line ask is the right tool.

When to use context engineering instead of prompt engineering is the question worth asking. Tasks that recur. Tasks with house style. Tasks where the cost of a wrong answer is rework you cannot afford. And one rule that does not bend: never paste sensitive customer data, IRB submission IDs, or regulated material into a cloud model just to fatten the context block. Non-negotiable.

I built a small utility that writes the context block for you, because I got tired of writing it from memory.

I now use AI fluently in my consulting workflow. I stay behind the wheel and keep my queries generalised. I built a Context Generator utility on my site that walks the user through the Role / Objective / Background / Data prompt structure.
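For readers who prefer to see the four-part structure as data, here is a minimal sketch. It is not the code behind the Context Generator: `ContextBlock`, `missing`, and `render` are hypothetical names, and the rule that every field must be filled before rendering is an assumption made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class ContextBlock:
    """The Role / Objective / Background / Data structure as four text fields."""
    role: str = ""
    objective: str = ""
    background: str = ""
    data: str = ""

    def missing(self) -> list[str]:
        """Names of fields that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self) -> str:
        """Return the block as labelled lines, refusing if anything is blank."""
        gaps = self.missing()
        if gaps:
            raise ValueError("Fill in before prompting: " + ", ".join(gaps))
        return "\n".join(
            f"{f.name.capitalize()}: {getattr(self, f.name)}" for f in fields(self)
        )


draft = ContextBlock(role="Reviewer of user guides")
print(draft.missing())
```

Checking for empty fields before rendering mirrors the walk-through idea: the structure prompts for what has not been written down yet instead of silently sending a thin block.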

Try it here: Context Generator on withsamipshah.com. The next question is no longer how to write a context block. It is what context you are willing to write down once so you never have to explain it to the AI again.

