When you ask ChatGPT a question and see that little “thinking…” cue, it’s pausing intentionally: considering possibilities and refining the reply instead of offering the first guess.
This approach, called inference-time computation, lets a model spend extra cycles when a query is tricky, trading milliseconds for better results. That raises a bigger question: can we train a model to perform this careful double-checking without special add-ons, and make it work across text, images, and more, using only ordinary unsupervised training?
In this piece, we’ll explore how that works and why Energy-Based Transformers (EBTs) may be the most exciting leap yet.
Imagine you type a question into your favorite chatbot.
A standard transformer whips through its layers once, streams out tokens, and is done in a few hundred milliseconds. With the new approach, the model still produces that first draft, but then it pauses for one (or many) additional internal passes. During those passes, it may self-edit, re-rank alternative completions, or search its latent space. The final answer is returned when the compute budget runs out or confidence is high enough.
The earliest versions were almost playful experiments. Researchers chained together prompt tricks such as “let’s think step by step” or “reflect and refine your plan.” When those prompts nudged models to output their chains of thought, performance on riddles, logic games, and algorithm puzzles improved significantly.
Soon, people began bolting on helper modules: separate verifier networks that assessed whether a candidate answer satisfied strict criteria (e.g., did the code compile? did the math proof balance?). If the verifier rejected the answer, the generator tried again. Other groups attached small search algorithms borrowed from reinforcement learning, running Monte-Carlo tree search over language tokens, for instance, or sampling thousands of proofs and scoring them for logical consistency.
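The generate-then-verify pattern those helper modules follow can be sketched in a few lines. Everything here is a hypothetical stand-in: the toy generator cycles through fixed candidates instead of sampling from an LLM, and the verifier simply checks an arithmetic claim rather than compiling code or validating a proof.

```python
import itertools

def make_generator():
    # Hypothetical stand-in for a language model: cycles through a few
    # fixed candidate answers instead of sampling real completions.
    candidates = itertools.cycle(["2 + 2 == 5", "2 + 2 == 22", "2 + 2 == 4"])
    return lambda prompt: next(candidates)

def verify(answer):
    # Toy verifier: accept only if the claimed equation actually holds.
    # A real system might compile code or check a proof instead.
    try:
        return bool(eval(answer, {"__builtins__": {}}))
    except Exception:
        return False

def generate_with_verifier(prompt, max_tries=10):
    # Generate-then-verify loop: resample until the verifier accepts
    # or the retry budget runs out.
    generate = make_generator()
    for _ in range(max_tries):
        candidate = generate(prompt)
        if verify(candidate):
            return candidate
    return None  # budget exhausted without an accepted answer

answer = generate_with_verifier("What is 2 + 2?")
print(answer)  # the third candidate is the first to pass verification
```

The key design point is the separation of roles: the generator proposes freely, and an external criterion decides what counts as done.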
Across benchmarks, the pattern was consistent: letting the network “think twice” gave better results. Proof-writing accuracy rose, math word-problem solvers tackled tougher grades, and coders fixed more unit tests. Vision researchers joined in by allowing diffusion models to refine images with extra denoising cycles or evaluate multiple caption hypotheses before emitting the best. Robotics teams inserted a planning loop that reevaluated candidate action sequences until a value function confirmed the safest path.
Two broad messages emerged:
Small amounts of extra compute can yield large accuracy gains. For example, a five-step reasoning loop often outperforms adding thirty layers of raw model capacity.
The pattern resembles human deliberation. Just as a chess grandmaster scans the board, evaluates options, double-checks tactics, and only then commits, these networks also began using internal feedback to avoid hasty errors.
By 2025, the research community had a range of “thinking-time” techniques, each ingenious in its own context but each with hidden costs attached.
Against that backdrop, the Energy-Based Transformer (EBT) team posed a bold question:
“Can we embed double-checking directly into the backbone architecture so the same mechanism works for any data type, any task, and learns from the same unsupervised objective already in use?”
To succeed, such a method would need three properties:
Modality agnostic: The loop should operate on discrete tokens, continuous pixels, audio spectrograms, or combined multimodal embeddings without rewriting code.
Problem agnostic: It should improve subjective text generation as well as complex numeric puzzles, even when no external grader is available.
Self-supervised: All parameters, including the “self-checker,” must learn during ordinary next-step prediction (or masked reconstruction for images) without additional labels.
If those hurdles were overcome, AI builders would gain a single, unified model that learns to think slowly whenever necessary while remaining as easy to train as today’s base transformers.
Let’s explore what EBTs really are and what sets them apart.
At the core of an Energy-Based Transformer (EBT) is an extra scalar output called energy. Think of energy as a reverse confidence score: high energy means the model feels unsure about its current guess, low energy means it fits the context. During pretraining, the network sees an input, hides a piece (a next token or a corrupted pixel), and learns to predict (a) the missing piece and (b) the energy of the full pair (input, prediction).
This energy head is trained alongside the usual logits. Gradients push correct predictions toward lower energy and incorrect ones toward higher energy. Because every pretraining example already contains the “right answer” (the true token or pixel), the model learns this quality meter without extra supervision.
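This training signal can be shown with a minimal sketch. It is not the paper’s implementation: the transformer backbone is swapped for a tiny MLP and the objective is a simple margin loss, but the idea is the same — correct (context, continuation) pairs are pushed toward low energy, corrupted pairs toward high energy.

```python
import torch
import torch.nn as nn

class TinyEnergyModel(nn.Module):
    """Illustrative energy head: scores (context, candidate) compatibility.

    A real EBT uses a full transformer backbone; this sketch replaces it
    with a small MLP so the training signal is easy to see."""
    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, context, candidate):
        # One scalar energy per pair: low = good fit, high = poor fit.
        return self.net(torch.cat([context, candidate], dim=-1)).squeeze(-1)

def contrastive_energy_loss(model, context, true_next, corrupted_next, margin=1.0):
    # Hinge loss: push true pairs below corrupted pairs by at least `margin`.
    e_pos = model(context, true_next)
    e_neg = model(context, corrupted_next)
    return torch.relu(margin + e_pos - e_neg).mean()

torch.manual_seed(0)
model = TinyEnergyModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
ctx = torch.randn(64, 16)
pos = ctx + 0.1 * torch.randn(64, 16)  # "true" continuation correlates with context
neg = torch.randn(64, 16)              # corrupted continuation is unrelated
for _ in range(200):
    opt.zero_grad()
    contrastive_energy_loss(model, ctx, pos, neg).backward()
    opt.step()

# After training, true pairs should sit at lower energy than corrupted ones.
gap = (model(ctx, neg).mean() - model(ctx, pos).mean()).item()
print(f"energy gap (corrupted - true): {gap:.3f}")
```

The quality meter falls out of the same examples used for prediction: every training pair already supplies both a positive (the real continuation) and, via corruption, a negative.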
Once the model has learned its energy landscape, producing an answer becomes an optimization problem: search for the lowest-energy completion. The paper does this with a few quick iterations of gradient descent in embedding space:
Draft: Start with a greedy or top-k guess, just like in a standard transformer.
Compute energy: Run (input, guess) through the model to obtain the energy score.
Take a gradient step: Nudge the guess in the direction that lowers energy.
Project back: Map the adjusted embedding to the nearest valid token or pixel.
Repeat: Stop when energy no longer improves or a small step limit is reached.
This loop is short, often three to eight steps, so latency stays reasonable, yet these tiny revisions help the model avoid many knee-jerk mistakes.
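The five steps above can be sketched as follows. The quadratic `energy` function is a stand-in for a learned EBT energy head, the embedding table is hypothetical, and projection back to the nearest token happens once at the end for simplicity.

```python
import torch

def energy(context, candidate):
    # Toy quadratic energy: low when the candidate sits near the context.
    # A trained EBT would compute this with its learned scalar energy head.
    return ((candidate - context) ** 2).sum(dim=-1)

def refine(context, vocab_embeddings, steps=8, lr=0.5):
    # Step 1 (Draft): greedy initial guess = vocab entry with lowest energy.
    with torch.no_grad():
        start = energy(context, vocab_embeddings).argmin()
    guess = vocab_embeddings[start].clone().requires_grad_(True)
    best = float("inf")
    for _ in range(steps):
        # Step 2 (Compute energy) of the current (context, guess) pair.
        e = energy(context, guess)
        # Step 5 (Repeat): stop early once energy no longer improves.
        if e.item() >= best:
            break
        best = e.item()
        # Step 3 (Gradient step): nudge the guess downhill in embedding space.
        (grad,) = torch.autograd.grad(e, guess)
        with torch.no_grad():
            guess -= lr * grad
    # Step 4 (Project back): snap the refined embedding to the nearest token.
    with torch.no_grad():
        dists = ((vocab_embeddings - guess) ** 2).sum(dim=-1)
        return int(dists.argmin())

torch.manual_seed(0)
vocab = torch.randn(100, 8)                  # hypothetical token embedding table
context = vocab[42] + 0.05 * torch.randn(8)  # context "points at" token 42
token = refine(context, vocab)
print(token)
```

Because the refinement happens on a continuous embedding before projection, the same loop applies whether the guess is a word vector or an image-patch embedding.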
This figure shows the model “thinking” on the prompt “The astronaut repaired the ____.” The colored surface is its energy landscape: lower areas mean a candidate word fits the context better. Each dot on the dashed path is one quick internal pass. The model drafts a first guess, measures its energy (how uncertain it is about that guess), nudges the guess to lower the energy, and repeats until the answer sits in a valley.
What’s happening at each step
Draft: A quick first guess lands on higher energy (e.g., “panel”).
Evaluate: The model checks the energy of (input, guess) and identifies a poor fit.
Revise: A tiny adjustment moves the guess downhill (“airlock” → “module” → “hull”).
Settle: The path reaches a low-energy basin (e.g., “spacecraft”) and the model stops.
Why this matters
Instead of outputting the first completion, the model takes a few extra, targeted steps to improve its answer. That optional “thinking loop” boosts quality, especially on tricky prompts, while letting you trade a little time for greater accuracy.
Gradient-based refinement in embedding space is modality-agnostic. Whether the thing being refined is a word vector or a patch embedding, the same loop applies. And because there is no external verifier, the approach still works when “correctness” is subjective: the model’s own energy head defines what is compatible with the input.
Ask a friend a tough question, and they might blurt a first guess, pause, cross-check details, adjust, and then answer with confidence. EBTs mimic that rhythm: draft → feel unease → revise → settle. The “how long to think” choice comes from the model’s energy landscape: it keeps improving until there is no lower energy to find.
So, do EBTs deliver? The short answer is yes: the experiments show consistent gains in learning efficiency and answer quality across text and images.
Faster learning in pretraining: Energy-Based Transformers reach the same loss with less data or compute, showing up to 35% higher scaling rates across data, batch size, parameters, FLOPs, and depth.
Bigger boosts when you let them think: With a small budget of extra inference-time computation, language performance improves about 29% more than baseline transformers.
Vision wins with far fewer passes: In Gaussian image denoising at σ = 0.1, EBTs reached 27.3 dB PSNR with a single forward pass, beating a Diffusion Transformer baseline at 26.6 dB that required 100 forward passes, i.e., about 99% fewer passes (1 vs. 100).
Stronger generalization: Gains are larger on out-of-distribution inputs, and they often win on downstream tasks even when pretraining loss is similar or worse, suggesting better real-world robustness.
In this experiment, models are matched in size and trained on the same data and compute. Transformer++ denotes a strong modern transformer baseline with standard architectural and training improvements. The downstream benchmarks are GSM8K (math word problems), SQuAD (reading comprehension), and the BIG-bench Math QA and Dyck tasks (structured reasoning); all results are reported as perplexity, where lower is better. Despite a slightly higher pretraining perplexity, Energy-Based Transformers (EBTs) usually achieve lower perplexity on downstream tasks than Transformer++, suggesting better generalization. Coupled with their stronger pretraining scaling, this points to EBTs overtaking Transformer++ at foundation-model scale. (BB = BIG-bench; ↓ means lower is better)
| Model | Pretrain PPL (↓) | GSM8K (↓) | SQuAD (↓) | BB Math QA (↓) | BB Dyck (↓) |
| --- | --- | --- | --- | --- | --- |
| Transformer++ | 31.36 | 49.6 | 52.3 | 79.8 | 131.5 |
| EBT | 33.43 | 43.3 | 53.1 | 72.6 | 125.3 |
Even with a modest disadvantage at pretraining (higher perplexity), EBTs win on three of four downstream evaluations, indicating they transfer their learning more effectively. Combined with their better scaling behavior, the evidence supports EBTs as a stronger path for large-scale, general-purpose language modeling.
Even with strong results, EBTs come with practical trade-offs; here is what to plan for next.
Sensitive optimization knobs: EBTs generate answers with a small optimization loop. That adds hyperparameters such as step size and number of steps. Wrong choices can cause unstable training or degrade quality. Plan for careful tuning, adaptive step sizes, and early stopping when energy does not improve.
Extra compute at train and serve time: Gradient-based refinement costs FLOPs. Compared with a single pass through a standard transformer, expect higher training bills and added latency at inference. Use budgeted step counts, dynamic think time, and caching.
Unproven beyond mid-scale: Results are strong up to about 800M parameters, but larger models are not yet validated due to resource limits. Scaling trends look favorable, yet true foundation-model regimes still need evidence.
Challenging multimodal distributions: EBTs can struggle with multimodal data, such as class-conditional image generation. This likely relates to training assumptions that encourage smoother or convex energy landscapes. Richer energy parameterizations or multi-basin objectives may help.
Operational complexity: Because quality depends on the inner loop, observability matters. Track energy trends, step counts, and convergence failures, and expect more MLOps plumbing than with a plain transformer.
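Some of the mitigations above, budgeted step counts and dynamic think time, amount to a simple policy. Here is a hypothetical sketch (the thresholds and step counts are illustrative, not from the paper) that spends more refinement steps only when the initial energy signals high uncertainty:

```python
def think_budget(initial_energy, base_steps=2, max_steps=8, threshold=1.0):
    # Hypothetical policy: easy inputs (low initial energy) get the base
    # budget; harder inputs earn extra steps, capped to bound latency.
    if initial_energy <= threshold:
        return base_steps
    extra = int(min(max_steps - base_steps, initial_energy - threshold))
    return base_steps + extra

# Easy, medium, and hard inputs receive increasing budgets.
print(think_budget(0.5), think_budget(3.2), think_budget(50.0))
```

A policy like this keeps average latency close to the fast path while reserving the full thinking loop for the prompts that need it.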
Energy-Based Transformers prove that a single architecture can draft, judge, and revise its own answers using information learned entirely from unsupervised data. They collapse the distinction between “generator” and “verifier” into one network, eliminate modality silos, and free builders from costly supervision loops. In the process, they recreate, inside silicon, the same two-step reasoning rhythm humans rely on every day: quick intuition followed by deliberate self-correction when the stakes or the difficulty spike.
Therefore, the success of EBTs offers an intriguing lesson: building a tiny critic inside the model may be more powerful and more general than stacking ever-larger layers of raw capacity.