Large language models (LLMs) are evolving at a rapid pace, and this past month brought some groundbreaking updates that are bound to change how we interact with AI, especially the announcements of GPT-5 and OpenAI's new open-source models.
In this newsletter, we'll cover the most exciting new features, benchmark results, and practical applications of these latest OpenAI releases, including GPT-5 and the open-source models.
OpenAI has taken a massive leap forward with GPT-5, the most advanced model in the GPT series. It builds on the strengths of previous models while introducing several new features and enhancements that noticeably improve the user experience. Before discussing the key features, let's look at the model variants GPT-5 offers.
GPT-5 is delivered as a unified flagship model but is available in optimized variants to suit performance, cost, and context needs. Rather than separate model families like GPT-3.5 and GPT-4, GPT-5 operates under a single architecture with adaptive reasoning. However, OpenAI exposes multiple configurations for specific workloads:
| Model Variant | Context Window | Optimized For | Ideal Use Cases |
| --- | --- | --- | --- |
| gpt-5-mini | 8K tokens | Low-latency responses with minimal compute cost | Quick Q&A, chatbots, and lightweight summarization |
| gpt-5-standard | 32K tokens | Balanced speed and reasoning depth | Coding, content creation, and moderate multi-turn conversations |
| gpt-5-pro | 128K tokens | Full deep-reasoning capability with maximum context retention | Research, large document analysis, and complex multi-step problem-solving |
| gpt-5-reasoning | 128K tokens | Extended chain-of-thought and higher reasoning fidelity for difficult problems | STEM problem solving, advanced planning, and logical/mathematical reasoning |
All variants share the same underlying improvements but differ in resource allocation and throughput. This tiered approach lets users choose between cost efficiency and maximum capability, without switching to a completely different model family.
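To make the trade-off concrete, here is a small sketch of choosing a variant by context need and calling it through the OpenAI Python SDK. The model IDs are the ones listed in the table above (treat them as illustrative, and check the API's published model list before relying on them), and the context sizes assume the usual power-of-two token counts:

```python
# Context windows from the variant table above (8K/32K/128K taken as
# 8192/32768/131072 tokens); the model IDs are illustrative.
VARIANTS = [
    ("gpt-5-mini", 8_192),
    ("gpt-5-standard", 32_768),
    ("gpt-5-pro", 131_072),
]

def pick_variant(tokens_needed: int) -> str:
    """Return the cheapest variant whose context window fits the request."""
    for name, window in VARIANTS:
        if tokens_needed <= window:
            return name
    raise ValueError(f"No variant supports {tokens_needed} tokens of context")

if __name__ == "__main__":
    # Hypothetical call via the OpenAI Python SDK (requires OPENAI_API_KEY).
    from openai import OpenAI

    client = OpenAI()
    resp = client.chat.completions.create(
        model=pick_variant(5_000),  # fits in gpt-5-mini's window
        messages=[{"role": "user", "content": "Summarize this note in one line."}],
    )
    print(resp.choices[0].message.content)
```

Routing by required context keeps costs down: you only pay for the larger variants when a request actually needs their window.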
Context window and cost structure: GPT-5 introduces dynamic context windows that vary depending on your subscription tier, providing different levels of context retention for various tasks:
8K tokens for free-tier users: Ideal for basic queries with limited context.
32K tokens for Plus-tier users ($20/month): For more complex conversations and multi-turn tasks.
128K tokens for Pro-tier users ($40/month): Designed for professional users with high-context needs and demanding tasks.
This context window flexibility ensures that GPT-5 can be tailored to fit the needs of casual users and those requiring advanced capabilities. The subscription tiers make it accessible to a wide range of users.
Customizable chat colors: A new personalization feature allows users to modify the chat interface’s color scheme. This is especially helpful for users who prefer a specific aesthetic or those with visual accessibility needs. It ensures that GPT-5 can cater to a wider range of user preferences.
Preset personalities: GPT-5 introduces preset personalities that adjust the assistant’s tone to match the user’s needs. You can now choose from different personality modes, whether you want GPT-5 to be concise, supportive, professional, or even slightly sarcastic. This feature allows users to have more control over the nature of interactions, making it more adaptable to various conversational contexts.
To access this feature, go to the “Personalization” section in “Settings,” click “Custom instructions,” and then select the personality you want by picking a preset:
Gmail and Google Calendar integration: For Plus, Pro, Team, and Enterprise users, GPT-5 now integrates seamlessly with Gmail and Google Calendar. This allows GPT-5 to assist with email drafting, scheduling, and event management directly from within the chat interface. Users can now ask GPT-5 to schedule meetings, draft professional emails, or summarize important calendar events, streamlining their workflow and enhancing productivity.
To use this feature, go to the Connectors section in Settings and follow the on-screen instructions to connect your Gmail and Google Calendar.
Performance and scalability: GPT-5 emphasizes efficiency and adaptability. Its ability to switch between fast responses and deep reasoning is useful for both multi-turn conversations and complex inquiries, making it a versatile tool for casual users looking for quick answers and for professionals who require detailed, accurate insights.
User-centric features: The integration with Gmail and Google Calendar is a game-changer for professionals, while the personalization features (such as customizable chat colors and preset personalities) make GPT-5 feel more intuitive and adaptable to individual needs. These changes elevate GPT-5 from a simple tool to an essential part of users’ workflows.
While GPT-5 has set a new benchmark for performance, OpenAI’s release of open-source GPT models has made powerful language tools more widely accessible. These models are part of OpenAI’s ongoing efforts to democratize AI and enable developers to build custom AI solutions without relying on costly API calls or proprietary systems.
In line with its open-source strategy, OpenAI has released GPT-OSS models, including versions like gpt-oss-120b and gpt-oss-20b, both available under an Apache 2.0 license. This means developers now have access to high-performance LLMs with full control over the models, enabling them to fine-tune and deploy these models in custom environments.
The move to open-source is significant because it lowers the barrier to entry for developers, researchers, and businesses. This allows them to experiment with cutting-edge AI without paying for API access. These models can be run locally, on cloud instances, or in hybrid environments, making them highly scalable and adaptable for various use cases.
OpenAI acknowledges that “open models matter” for accessibility and innovation. Open-sourcing these large models empowers developers, researchers, and even smaller companies to experiment without relying on expensive API calls. The models are optimized to run locally: gpt-oss-20b needs just ~16 GB of VRAM (released in a compressed MXFP4 format), so it can even run on a high-end laptop or edge device. Meanwhile, the 120B model can fit on a single server-grade GPU (no more TPU pod required for a 100B+ model). This lowers the barrier to entry for advanced AI: startups can fine-tune these models on proprietary data, and individuals can run them for privacy-sensitive tasks. OpenAI also put the weights on Hugging Face and partnered with many inference providers (Azure, AWS, Hugging Face, etc.) to simplify deployment for developers. In short, OpenAI is bridging the gap: you can choose their API for a fully managed, multimodal GPT, or grab GPT-OSS to run your own cost-efficient, customizable model.
Since the models are on Hugging Face, using them is straightforward. For instance, with the Python Transformers library:
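A minimal sketch of such a call, assuming the `transformers` text-generation pipeline and the `openai/gpt-oss-20b` model ID on Hugging Face (actually running it requires roughly 16 GB of GPU memory, so the heavy part sits behind a main guard):

```python
def build_chat(prompt: str) -> list[dict]:
    """Wrap a user prompt in the chat-message format the pipeline expects."""
    return [{"role": "user", "content": prompt}]

if __name__ == "__main__":
    # Downloads ~16 GB of weights on first run, hence the guard.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="openai/gpt-oss-20b",  # swap to openai/gpt-oss-120b with enough GPU memory
        torch_dtype="auto",
        device_map="auto",           # spread layers across available devices
    )
    result = generator(
        build_chat("Explain why open-weight models matter, in three sentences."),
        max_new_tokens=256,
    )
    # With chat input, generated_text holds the full message list; the last
    # entry is the assistant's reply.
    print(result[0]["generated_text"][-1]["content"])
```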
This snippet loads the 20B model and queries it in a chat format (the pipeline will automatically apply OpenAI’s Harmony chat template). The output is a coherent answer with reasoning, thanks to the model’s training on dialogue and explanation tasks. By adjusting the model ID to openai/gpt-oss-120b, one could run the larger 120B model if adequate GPU memory is available. The open-source nature of GPT‑OSS means developers can integrate advanced AI into their apps without calling an external API. They can even inspect or tweak the model’s behavior at will, which is a huge win for transparency and innovation.
We wanted to test GPT-5 on our own. Note that this is by no means a thorough evaluation of GPT-5. It’s a quick way to understand how it behaves in a typical chat setup.
We will run multiple experiments to evaluate the model in coding, logical reasoning, and STEM-based problem-solving.
Let’s start with a coding example. We want to create an interactive physics-based animation using JavaScript. The animation will simulate a galaxy of stars moving under the influence of gravity while incorporating dynamic behaviors such as merging, color blending, and supernova explosions.
The prompt is given below.
Prompt:
Generate a JavaScript animation that simulates a galaxy of stars moving in a gravitational field inside a container with the features mentioned below.
Randomly placed stars with different masses and colors (white, blue, yellow, green, and red).
Gravity simulation where stars attract each other based on a simple Newtonian gravity model.
Star merging, where if two stars get close enough, they merge into a larger star, blending their colors using additive color mixing.
Supernova effect, where a star that reaches a certain mass threshold explodes into multiple smaller stars.
Smooth physics updates with realistic-looking gravitational motion.
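As a reference for what the prompt asks the model to implement, here is a minimal Python sketch of the pairwise Newtonian attraction step (the constants, the dict-based star representation, and the softening term are our own illustrative choices, not taken from the generated code):

```python
import math

def gravity_step(stars, G=1.0, dt=0.1, softening=1e-3):
    """Advance star positions and velocities one step under pairwise Newtonian gravity.

    Each star is a dict with keys x, y, vx, vy, mass. The softening term
    avoids a division by zero when two stars nearly overlap.
    """
    n = len(stars)
    ax = [0.0] * n
    ay = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = stars[j]["x"] - stars[i]["x"]
            dy = stars[j]["y"] - stars[i]["y"]
            r2 = dx * dx + dy * dy + softening
            r = math.sqrt(r2)
            # Acceleration magnitude a = G * m_j / r^2, directed toward star j.
            a = G * stars[j]["mass"] / r2
            ax[i] += a * dx / r
            ay[i] += a * dy / r
    for i, s in enumerate(stars):
        s["vx"] += ax[i] * dt
        s["vy"] += ay[i] * dt
        s["x"] += s["vx"] * dt
        s["y"] += s["vy"] * dt
    return stars
```

The generated JavaScript would follow the same O(n²) pattern per frame, with merging and supernova checks layered on top.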
The first version had over 500 lines of code but failed to run on the first try. After a second attempt, the following animation was generated.
For comparison, the same prompt was given to the OpenAI o3 model, which resulted in the following animation:
It can be noted that the supernova explosions start almost immediately in the o3-generated code, whereas they take longer to appear in the GPT-5 code. However, the GPT-5 animation is more dramatic and visually more pleasing. GPT-5 also generated a control panel, which was absent from the o3 code.
There are some issues, however, in the code generated by GPT-5:
The colors of the stars eventually all turn to white, whereas in the code generated by o3, the colors change according to the specifications given in the prompt.
The paths of the smaller stars do not seem to change even when they are close enough to be affected by the gravity of a larger star. Once again, the code generated by the o3 model handled that correctly.
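The drift to white is a plausible side effect of additive color mixing: each merge sums channel values and clips at 255, so repeated merges saturate every channel toward white. A minimal sketch of the idea (the clipping behavior is our assumption about the generated code, not something we verified):

```python
def additive_blend(c1, c2):
    """Additively mix two RGB colors, clipping each channel at 255."""
    return tuple(min(255, a + b) for a, b in zip(c1, c2))

blue = (0, 0, 255)
yellow = (255, 255, 0)
# Complementary colors already sum to full white in one merge;
# repeated merges push any palette the same way.
print(additive_blend(blue, yellow))
```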
In this experiment, we gave the model a simple logic puzzle with elements of relational reasoning. It primarily tests deductive logic and an understanding of how ordering works.
Prompt: In a distant kingdom, three wise scholars, namely Xander, Yvette, and Zachary, are known for their intellect and wisdom. The royal records state that if Xander is older than Yvette, and Yvette is older than Zachary, then Zachary is the youngest. However, a new decree adds that if Yvette and Zachary were born in the same year, but Zachary’s birth occurred later in the year than Yvette’s, he would still be considered younger. Given that the royal scribe recorded Yvette’s birth before Zachary’s and that Xander is older than both, who holds the title of the youngest scholar?
The model responded within 4 seconds:
Zachary.
Reason: Xander is older than both, so the youngest is between Yvette and Zachary. The scribe recorded Yvette’s birth before Zachary’s; even if they share the same year, the decree says the one born later (Zachary) is younger. Hence, Zachary is the youngest.
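The decree's tie-breaking rule maps neatly onto lexicographic comparison: compare birth year first, then position within the year. A small sketch of that deduction (the dates are invented purely for illustration; only their ordering matters):

```python
# (birth_year, day_of_year): lexicographic tuple comparison encodes the decree.
# An earlier year means older; on a tied year, the earlier day wins.
scholars = {
    "Xander": (1000, 10),    # older than both, per the prompt
    "Yvette": (1005, 100),   # birth recorded before Zachary's
    "Zachary": (1005, 200),  # same year, born later in the year
}

# The youngest scholar has the "largest" (latest) birth tuple.
youngest = max(scholars, key=lambda name: scholars[name])
print(youngest)  # → Zachary
```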
For this experiment, we provided a STEM problem to test the models' skills in science, technology, engineering, and mathematics.
The following integration problem was given to both the models:
Prompt: Find the integral of
We are not showing the reply here as it's quite lengthy and filled with formulas and discussion of mathematical theories and concepts, but the problem was solved in under 5 seconds using integration by parts.
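We don't reproduce the specific integral above, so as an illustration of the technique the model used, here is a pure-Python sanity check of a textbook integration-by-parts result (∫ x·eˣ dx, a stand-in example, not the prompt's integral):

```python
import math

def antiderivative(x):
    # Integration by parts on ∫ x·e^x dx with u = x, dv = e^x dx gives:
    #   ∫ x·e^x dx = x·e^x - ∫ e^x dx = (x - 1)·e^x
    return (x - 1) * math.exp(x)

def integrand(x):
    return x * math.exp(x)

# Sanity check: the numerical derivative of the antiderivative should
# match the integrand (central difference with a small step h).
h = 1e-6
x0 = 1.3
numeric_derivative = (antiderivative(x0 + h) - antiderivative(x0 - h)) / (2 * h)
print(abs(numeric_derivative - integrand(x0)) < 1e-5)  # True
```

The same differentiate-and-compare check works for any claimed antiderivative, which makes it a handy way to verify a model's symbolic answer.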
Overall, the results were competent. However, we didn’t test its context window limitations, an area where some evaluations have flagged it as “not up to the mark.”
The latest updates from OpenAI mark a significant milestone in the LLM landscape.
GPT-5’s blend of adaptive reasoning, expanded context windows, and user-centric features like personalization and productivity integrations elevate it beyond a mere language model. Instead, this blend of features turns it into a versatile AI assistant for diverse use cases. Meanwhile, the release of GPT-OSS open-source models lowers the barrier to advanced AI. This empowers developers, researchers, and businesses to run, fine-tune, and deploy high-performance models on their own terms. Whether you value cutting-edge capabilities through the managed GPT-5 service or the flexibility of open-source deployment, OpenAI’s dual-track approach is incredibly useful. It signals a future where powerful AI is both more capable and more accessible than ever before.
If you want to learn more about large language models, consider taking the following course to learn the basics of LLMs.
In this course, you will learn how large language models work, what they are capable of, and where they are best applied. You will start with an introduction to LLM fundamentals, covering core components, basic architecture, model types, capabilities, limitations, and ethical considerations. You will then explore the inference and training journeys of LLMs. This includes how text is processed through tokenization, embeddings, positional encodings, and attention to produce outputs, as well as how models are trained for next-token prediction at scale. Finally, you will learn how to build with LLMs using a developer-focused toolkit. Topics include prompting, embeddings for semantic search, retrieval-augmented generation (RAG), tool and function calling, evaluation, and production considerations. By the end of this course, you will understand how LLMs actually work and apply them effectively in language-focused applications.