OpenAI Rolls Out GPT-5.4 Thinking After GPT-5.3 Instant

Key Takeaway

OpenAI introduced GPT-5.4 after releasing GPT-5.3 Instant, combining a tone and accuracy update for everyday ChatGPT use with a more advanced model for professional work, reasoning, web research, and agent workflows.

GPT-5.4 Thinking & GPT-5.3 Instant Launch – Key Points

The Story

OpenAI first rolled out GPT-5.3 Instant as an update to ChatGPT’s default model, focused on making conversations less awkward, less preachy, and more directly useful. The company later released GPT-5.4 in ChatGPT as GPT-5.4 Thinking, alongside GPT-5.4 Pro, positioning it as a more capable and efficient frontier model for professional and developer work. OpenAI says GPT-5.3 Instant improves tone, reduces unnecessary refusals, and lowers hallucination rates, while GPT-5.4 adds stronger reasoning, computer use, deep web research, and benchmark performance. The sequence highlights OpenAI’s effort to address both everyday user complaints and higher-end workflow demands.

The Facts

GPT-5.3 Instant updated ChatGPT’s default model
OpenAI rolled out GPT-5.3 Instant as the default ChatGPT model, with the update centered on smoother everyday conversations rather than entirely new capabilities. It is also available in the API as gpt-5.3-chat-latest.
OpenAI said GPT-5.3 Instant reduces overly cautious refusals, unnecessary disclaimers, defensive or moralizing preambles, and phrasing that users experienced as stiff, preachy, or awkward.
OpenAI said the model better balances web information with its own knowledge and reasoning, is less likely to overindex on search results, and performs better on imaginative and immersive writing tasks. The company also said tone and non-English naturalness, including in Japanese and Korean, remain ongoing areas of work.
OpenAI framed GPT-5.3 Instant as a direct response to user complaints
The company publicly described the update as “More accurate, less cringe” and said, “We heard your feedback loud and clear,” after feedback that earlier versions sometimes declined safe questions or buried answers under long caveats.
OpenAI reported lower hallucination rates for GPT-5.3 Instant
On OpenAI’s higher-stakes internal evaluation, hallucination rates fell by 26.8% with web use and 19.7% without web use. On an internal user-feedback evaluation based on de-identified conversations flagged as factual errors, hallucination rates fell by 22.5% with web use and 9.6% without web use.
Transition dates are now clearer
GPT-5.2 Instant remains available for paid users under Legacy Models and is scheduled to retire on June 3, 2026. GPT-5.2 Thinking will remain available for three months for paid users and is scheduled to retire on June 5, 2026.
OpenAI then released GPT-5.4, GPT-5.4 Thinking, and GPT-5.4 Pro
GPT-5.4 launched after GPT-5.3 Instant. In ChatGPT it appears as GPT-5.4 Thinking, while GPT-5.4 Pro is aimed at users who want maximum performance on complex tasks.
GPT-5.4 is positioned as OpenAI’s most capable and efficient frontier model for professional work
OpenAI said GPT-5.4 combines its recent advances in reasoning, coding, and agentic workflows into a single model, incorporating the coding strengths of GPT-5.3-Codex while improving work across spreadsheets, presentations, documents, tools, and software environments.
In ChatGPT, GPT-5.4 Thinking can provide an upfront plan for longer queries so users can adjust direction mid-response. OpenAI also said it improves deep web research, especially on highly specific questions, while maintaining context better over longer reasoning chains.
GPT-5.4 expands agent capabilities with computer use, tool search, and a 1M-token context window
OpenAI described GPT-5.4 as its first general-purpose model with native computer-use capabilities and said it supports up to 1 million tokens of context. The company also introduced tool search, which reduced token usage by 47% on 250 tasks from Scale’s MCP Atlas benchmark while maintaining the same accuracy.
OpenAI reported stronger benchmark performance for GPT-5.4
On GDPval, which tests well-specified knowledge work across 44 occupations, GPT-5.4 reached 83.0% versus 70.9% for GPT-5.2. OpenAI also reported 57.7% on SWE-Bench Pro, 75.0% on OSWorld-Verified versus 47.3% for GPT-5.2, 82.7% on BrowseComp versus 65.8% for GPT-5.2, and 54.6% on Toolathlon versus 45.7% for GPT-5.2.
Relative to GPT-5.2, OpenAI said GPT-5.4 is 33% less likely to make false claims and its full responses are 18% less likely to contain any errors on de-identified factual-error prompts. On internal spreadsheet modeling tasks, GPT-5.4 scored 87.3% versus 68.4% for GPT-5.2, and human raters preferred its presentations 68.0% of the time. OpenAI’s system card also says GPT-5.4 Thinking is the first general-purpose model in the series deployed with mitigations for High cyber capability. In the API, GPT-5.4 is priced above GPT-5.2 at $2.50 per million input tokens versus $1.75.
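For readers who want to put the pricing and benchmark figures above in proportion, the arithmetic can be sketched in a few lines of Python. This is only an illustration built from the numbers quoted in this article: the dictionary keys "gpt-5.4" and "gpt-5.2" are hypothetical labels rather than confirmed API model identifiers, the function names are invented for this example, and only input-token pricing is covered because output pricing is not cited here.

```python
# Input prices quoted in the article, in USD per million input tokens.
# The keys are illustrative labels, not confirmed API model identifiers.
INPUT_PRICE_PER_MILLION = {"gpt-5.4": 2.50, "gpt-5.2": 1.75}

# Benchmark scores quoted in the article as (GPT-5.4, GPT-5.2), in percent.
BENCHMARKS = {
    "GDPval": (83.0, 70.9),
    "OSWorld-Verified": (75.0, 47.3),
    "BrowseComp": (82.7, 65.8),
    "Toolathlon": (54.6, 45.7),
}


def input_cost(model: str, input_tokens: int) -> float:
    """Return the input-token cost in USD for the given token count."""
    return INPUT_PRICE_PER_MILLION[model] * input_tokens / 1_000_000


def price_increase_percent(new_price: float, old_price: float) -> float:
    """Return how much higher new_price is than old_price, in percent."""
    return (new_price / old_price - 1) * 100


def point_gains() -> dict[str, float]:
    """Return the GPT-5.4 minus GPT-5.2 gap per benchmark, in percentage points."""
    return {name: round(new - old, 1) for name, (new, old) in BENCHMARKS.items()}


if __name__ == "__main__":
    increase = price_increase_percent(
        INPUT_PRICE_PER_MILLION["gpt-5.4"], INPUT_PRICE_PER_MILLION["gpt-5.2"]
    )
    print(f"Input price increase: {increase:.1f}%")
    for name, gain in point_gains().items():
        print(f"{name}: +{gain} percentage points")
```

On these figures, the input price is roughly 43% higher for GPT-5.4, while the quoted benchmark gaps range from about 9 to about 28 percentage points. Whether that trade is worthwhile depends on workload details the article does not cover, such as output-token volume.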

Timeline / What Changed

March 3, 2026: OpenAI rolled out GPT-5.3 Instant as an update to ChatGPT’s default model.
Later: OpenAI released GPT-5.4, GPT-5.4 Thinking, and GPT-5.4 Pro across ChatGPT, Codex, and the API.
June 3, 2026: GPT-5.2 Instant is scheduled to retire after its transition period for paid users.
June 5, 2026: GPT-5.2 Thinking is scheduled to retire after its three-month legacy period for paid users.

Background / Context

GPT-5.3 Instant was designed to fix a usability problem: users had complained that ChatGPT could feel overly cautious, preachy, or emotionally off in routine conversations. GPT-5.4 shifts the focus toward professional output, reasoning efficiency, deep research, computer use, and measurable gains on knowledge-work benchmarks, especially in spreadsheets, presentations, and document-heavy tasks. Together, the two releases show OpenAI splitting its product story between everyday chat quality and higher-end work performance.

Why This Matters

The sequence of releases shows that AI competition is no longer just about headline capability. OpenAI is now optimizing for two pressures at once: whether ChatGPT feels natural and useful in daily conversation, and whether newer models can complete more complex work accurately, efficiently, and with less supervision. That matters because product adoption increasingly depends on both.

This article was drafted with the assistance of generative AI. All facts and details were reviewed and confirmed by an editor prior to publication.