The initial rollout is limited to a small group of trusted U.S.-based partners, following restrictions imposed by the Trump administration. The company has expressed reservations about this government access process becoming permanent, stating it keeps the best tools from developers and defenders who need them. Broader availability through ChatGPT, Codex, and the API is expected in the coming weeks.
What Sol Actually Does Better
GPT-5.6 Sol introduces two major capability upgrades. A new “max reasoning” mode gives the model more time to work through complex problems with greater accuracy. An “ultra mode” goes further, employing subagents to handle sophisticated workflows that typically exceed the capabilities of a single AI agent.
Performance improvements are most significant in three domains:
- Coding: Sol achieved state-of-the-art results on TerminalBench 2.1, a benchmark for command-line workflows that require planning and iteration
- Biology: Sol scored 53.5% on virology tests (up from 44.5% for GPT-5.5), with improvements across molecular biology and pathogen research
- Cybersecurity: Sol matched the performance of competing frontier models on ExploitBench while using roughly one-third of the output tokens
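To put the reported figures in perspective, the short Python sketch below works through the arithmetic. It uses only the numbers quoted above (44.5% and 53.5% on virology tests, and roughly one-third of the output tokens on ExploitBench). The function and variable names are illustrative assumptions, not part of any OpenAI tooling.

```python
def percentage_point_gain(old_pct: float, new_pct: float) -> float:
    """Absolute change between two scores, in percentage points."""
    return new_pct - old_pct


def relative_gain(old_pct: float, new_pct: float) -> float:
    """Change relative to the old score, expressed as a percentage."""
    return (new_pct - old_pct) / old_pct * 100


# Virology scores as reported in the article.
gpt_5_5_score = 44.5
sol_score = 53.5

abs_gain = percentage_point_gain(gpt_5_5_score, sol_score)
rel_gain = relative_gain(gpt_5_5_score, sol_score)

# "Roughly one-third of the output tokens" treated as exactly 1/3 here,
# which is an approximation of the article's wording.
token_fraction = 1 / 3
token_reduction = (1 - token_fraction) * 100

print(f"Virology: +{abs_gain:.1f} points ({rel_gain:.1f}% relative improvement)")
print(f"ExploitBench: about {token_reduction:.1f}% fewer output tokens")
```

In other words, the virology result is a 9-point absolute gain, or about a 20% improvement relative to GPT-5.5, and the ExploitBench result implies roughly two-thirds fewer output tokens for comparable performance.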
The Safety Framework
All three models incorporate OpenAI’s most robust safety stack to date. Under OpenAI’s Preparedness Framework, Sol and Terra are classified as “High capability” in cybersecurity and biological/chemical risk, but do not reach the “Critical” threshold for AI self-improvement.
Tailored safeguards have been implemented for each model’s profile. OpenAI dedicated over 700,000 A100 GPU hours to automatically searching for universal jailbreaks. Testing showed that while Sol and Terra can identify vulnerabilities and individual exploit components, they were unable to perform autonomous end-to-end attacks against hardened targets. Automated red teaming will run continuously during deployment.
The company also uses activation classifiers for sensitive domains that can intervene to stop unsafe answers during generation, plus real-time conversation scanning to block unsafe outputs and automated systems to identify unsafe patterns across conversations.
The Government Factor
The Trump administration is currently limiting a widespread launch of GPT-5.6. This follows a June 2 executive order mandating a new process for benchmarking and assessing AI models before public release. Previously, the administration compelled Anthropic to remove access to Claude Fable 5 and Mythos 5 despite the company’s adherence to voluntary government review processes.
OpenAI said it is taking this short-term step because it believes this is the strongest path to broader availability while it works with the administration to develop the cyber Executive Order framework and a repeatable process for future releases.
Three Models for Different Needs
Sol (flagship): Strongest capabilities for complex agentic workflows, coding, scientific research, and defensive security operations. Best for enterprises and developers pushing the boundaries of what AI can do.
Terra (balanced): Performance comparable to GPT-5.5 at half the cost. Ideal for production applications where capability and cost need to be balanced.
Luna (speed-focused): Fastest and most affordable option in the family. Designed for applications prioritizing quick responses over maximum capability.
Currently, the GPT-5.6 models are accessible through the API and Codex for select OpenAI partners and organizations. Wider availability in ChatGPT, Codex, and the API is planned for “soon,” with public access anticipated in the coming weeks once the government review process concludes.
The phased rollout reflects OpenAI’s approach to managing both capability and risk as frontier AI becomes more powerful and more widely deployed.