How Does ChatGPT Work?

Explore the step-by-step process of how ChatGPT transforms user prompts into responses. Understand key concepts like tokenization, context windows, transformer attention, autoregressive generation, and streaming output to confidently explain ChatGPT's workings in AI interviews.

In AI and ML engineering interviews, a common question is, “Can you explain how ChatGPT works?” It probes your understanding of large language models and your ability to explain complex systems clearly. Interviewers want to see that you grasp the key components of a generative AI system like ChatGPT and can explain the inference-time process: what happens from the moment a user enters a prompt to the moment ChatGPT streams back a response.

Think of your answer as a guided tour of what happens when someone interacts with ChatGPT. In this lesson, we’ll walk through the core components of ChatGPT’s inference process step by step, with analogies and diagrams to help solidify your understanding. By the end, you should have a clear mental model that you can explain confidently in an interview.
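
Before the step-by-step tour, it may help to see the overall shape of the flow in a few lines of Python. This is only a toy sketch, not how ChatGPT is implemented: the whitespace tokenizer, the `TOY_NEXT_TOKEN` lookup table, the `CONTEXT_WINDOW` value, and the `generate` function are all hypothetical stand-ins chosen for illustration. A real model uses subword tokens and transformer attention over the whole context. What the sketch does show is the loop structure: tokenize the prompt, trim it to the context window, predict one token, feed it back in, and stream each token as soon as it exists.

```python
from typing import Dict, Iterator, List

# Hypothetical stand-in for the model: maps the last token to the next one.
# A real LLM runs transformer attention over the whole context instead.
TOY_NEXT_TOKEN: Dict[str, str] = {
    "how": "are",
    "are": "you",
    "you": "today",
    "today": "<eos>",
}

CONTEXT_WINDOW = 8  # assumed toy limit; real context windows are far larger


def tokenize(text: str) -> List[str]:
    """Toy tokenizer: lowercase and split on whitespace (real systems use subword tokens)."""
    return text.lower().split()


def generate(prompt: str, max_new_tokens: int = 10) -> Iterator[str]:
    """Yield one token at a time, feeding each new token back in as context."""
    context = tokenize(prompt)
    for _ in range(max_new_tokens):
        context = context[-CONTEXT_WINDOW:]  # keep only what fits in the window
        next_token = TOY_NEXT_TOKEN.get(context[-1], "<eos>") if context else "<eos>"
        if next_token == "<eos>":  # end-of-sequence marker stops generation
            break
        context.append(next_token)  # autoregressive: output becomes input
        yield next_token  # streaming: emit the token as soon as it is produced


if __name__ == "__main__":
    for token in generate("Hello how"):
        print(token, end=" ", flush=True)
    print()
```

Writing `generate` as a Python generator mirrors the streaming behavior: the caller receives each token while later tokens are still being produced, rather than waiting for the full response.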

What happens when a user sends a message to ChatGPT?

The first step in ChatGPT’s pipeline is tokenization, ...