
How Models Actually Work

Explore the mechanics behind AI language models including tokens, context windows, model tiers, extended thinking, and system prompts. Understand why AI sometimes forgets instructions or hallucinates, and learn how to effectively work with different AI models to build reliable features and avoid costly mistakes. This lesson prepares you to harness AI behavior for better debugging, iteration, and model selection in your app development.

PawPals has been going well. At this stage, you are comfortable with the iteration loop and have built real features across two different tools. Then three things happen in the same afternoon.

  1. First, you ask the AI to update the booking confirmation page, and it rewrites the navigation bar you told it never to touch. You already said “do not change the navigation bar” four messages ago. The AI seems to have forgotten.

  2. Second, you ask the AI to add a payment integration and it confidently references a Stripe function called createBookingCharge. That function does not exist, which you know because you looked it up, but the AI wrote it with the same confidence it writes everything else.

  3. Third, you switch to a different model because someone told you it was “better.” The response takes 45 seconds instead of 3, and the answer is no better than what you were getting before. You just waited fifteen times longer for the same result.

None of these are bugs, and none of them are random. They are predictable behaviors of how language models work, and once you understand the mechanics, you can work around each one. Six concepts explain almost everything you will encounter: tokens, the context window, model tiers, extended thinking, temperature, and system prompts.

What are tokens?

When you type a message to the AI, it does not read your words the way you do. It breaks your text into tokens, which are small chunks of text that the model processes as individual units.

A token is roughly three-quarters of a word:

  • Short common words like “the” or “is” → one token each

  • Longer words get split: “confirmation” → confirm + ation

  • Code is tokenized differently: const walkerName = "Sarah" → five or six tokens
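The three-quarters rule above can be turned into a quick back-of-the-envelope estimate. The sketch below is only an approximation: `estimateTokens`, `countWords`, and `WORDS_PER_TOKEN` are hypothetical names invented for this illustration, not part of any real tokenizer, and actual counts vary by model. The ratio also fits plain prose better than code, which tends to use more tokens per word.

```javascript
// Rough heuristic only: real tokenizers differ from model to model.
// Assumption: about 0.75 words per token, per the rule of thumb above.
const WORDS_PER_TOKEN = 0.75;

// Count whitespace-separated words; an empty or blank string has zero words.
function countWords(text) {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Estimate tokens by dividing the word count by the words-per-token ratio.
function estimateTokens(text) {
  return Math.round(countWords(text) / WORDS_PER_TOKEN);
}

// Six words divided by 0.75 gives an estimate of 8 tokens.
console.log(estimateTokens("do not change the navigation bar")); // 8
```

An estimate like this is good enough for a quick sense of how large a prompt is. When exact numbers matter, use the token counts reported by the model provider.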

Why does this matter? Because tokens are the currency of every interaction. Everything ...