Key Update Details
- Developer: OpenAI
- Feature: WebSocket-based execution mode for Responses API
- Purpose: Reduce latency and improve throughput in agentic workflows
- Status: Released in alpha to selected partners after a two-month development cycle
- Impact: Up to 40% latency reduction in early production use
OpenAI Boosts AI Agent Speed with WebSocket Mode
The new WebSocket mode addresses a critical bottleneck in modern AI agent development. Agentic workflows, which involve multiple steps like tool calls, intermediate reasoning, and follow-up queries, previously suffered from repeated network round-trip times due to the stateless nature of HTTP. As AI inference speeds have improved, these network delays became a dominant source of latency and operational complexity.
By establishing a long-lived, bidirectional connection, the WebSocket-based execution mode allows for continuous data exchange without the overhead of repeated handshakes. This architectural change supports streaming responses and faster tool execution, streamlining the coordination of multi-step workflows. OpenAI noted that early production use has demonstrated up to a 40% latency reduction and enhanced throughput in high-concurrency scenarios.
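The latency arithmetic behind this change can be sketched with a back-of-the-envelope model: under stateless HTTP, every workflow step pays connection setup plus a round trip, while a persistent WebSocket pays setup once. The timing constants below are illustrative assumptions, not OpenAI's measured figures:

```python
# Back-of-the-envelope latency model for a multi-step agent workflow.
# All timings are illustrative assumptions, not measured OpenAI figures.

HANDSHAKE_MS = 150   # assumed TCP + TLS setup cost per new connection
RTT_MS = 50          # assumed network round trip per request/response
STEPS = 10           # tool calls + follow-up queries in one agent run

def http_total(steps: int) -> int:
    """Stateless HTTP: every step pays handshake + round trip."""
    return steps * (HANDSHAKE_MS + RTT_MS)

def websocket_total(steps: int) -> int:
    """Persistent WebSocket: one handshake, then a round trip per step."""
    return HANDSHAKE_MS + steps * RTT_MS

print(http_total(STEPS))       # 2000 ms across ten steps
print(websocket_total(STEPS))  # 650 ms across the same ten steps
```

The exact savings depend on real handshake and round-trip costs, but the shape of the model explains why the benefit grows with the number of steps in a workflow.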
Addressing the Latency Challenge in Multi-Step AI Workflows
This update reflects a broader industry focus on optimizing the transport layer in agentic systems, where communication patterns significantly influence overall performance. The approach aligns with event-driven design principles common in distributed systems, where maintaining state across interactions boosts responsiveness and throughput. Ofek Shaked, a Vibe Coder, praised the change, stating, "WebSockets for agent state is such an obvious but huge win. No more cold starts killing your multi-tool chains."
OpenAI’s early production results highlight sustained throughput of approximately 1,000 transactions per second (TPS), with bursts reaching up to 4,000 TPS. Developer tooling and coding agent platforms have quickly adopted the new mode. Vercel, for instance, integrated the WebSocket mode into its AI SDK and reported a 40% latency reduction. Similarly, Cline observed a 39% improvement in multi-file workflows, while Cursor saw gains of up to 30%. These figures underscore the impact of system-level optimizations beyond model improvements. Gabriel Chua, a DX Engineer at OpenAI, confirmed the feature’s flexibility, noting, "You can warm up the connection by sending your system prompt and tool definitions first. It’s Zero Data Retention (ZDR) compatible."
A Return to Stateful Connections for Next-Gen AI
From an implementation standpoint, developers can now replace numerous HTTP calls with a single persistent session, simplifying orchestration logic and reducing connection setup overhead. This also enhances support for streaming applications, such as incremental code generation and interactive reasoning. Kevin Cho, an engineer at Microsoft, commented that this approach signals "Going back to the original software stack problems. websockets and stateful connections."
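The single-persistent-session pattern described above can be sketched as an async context manager that holds one connection for the whole workflow. The names here (`AgentSession`, `send_step`) are invented for illustration, and the transport is stubbed out; the actual Responses API client surface may differ:

```python
import asyncio

class AgentSession:
    """Illustrative persistent-session wrapper: one connection, many steps.
    The transport is a stub; a real client would hold an open WebSocket."""

    def __init__(self):
        self.connected = False
        self.steps_sent = 0

    async def __aenter__(self):
        # Connection setup happens exactly once per session.
        await asyncio.sleep(0)  # stand-in for the WebSocket handshake
        self.connected = True
        return self

    async def __aexit__(self, *exc):
        self.connected = False

    async def send_step(self, payload: dict) -> dict:
        # Each workflow step reuses the open connection: no new handshake.
        assert self.connected, "session must be open"
        self.steps_sent += 1
        await asyncio.sleep(0)  # stand-in for one network round trip
        return {"step": self.steps_sent, "echo": payload}

async def run_workflow() -> int:
    # Orchestration collapses to one session instead of N separate HTTP calls.
    async with AgentSession() as session:
        for name in ("plan", "tool_call", "final_answer"):
            await session.send_step({"action": name})
        return session.steps_sent

print(asyncio.run(run_workflow()))  # 3
```

The context-manager shape keeps connection lifecycle explicit: setup and teardown live in one place, and every intermediate step is just a message on the existing channel.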
The shift introduces new design considerations, including connection lifecycle management and backpressure under high concurrency, aligning with established patterns for stateful distributed systems. OpenAI launched the feature in alpha, with partners like Codex already migrating most of their Responses API traffic to the WebSocket mode, signaling its readiness for broader adoption.
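One of those design considerations, backpressure, is commonly handled with a bounded queue between the message producer and consumer, so a slow consumer throttles the producer instead of letting buffers grow without limit. This is a generic asyncio pattern, not an OpenAI API:

```python
import asyncio

async def producer(queue: asyncio.Queue, n: int):
    # A bounded queue makes `put` block when the consumer falls behind,
    # propagating backpressure instead of buffering unboundedly.
    for i in range(n):
        await queue.put(i)
    await queue.put(None)  # sentinel: no more messages

async def consumer(queue: asyncio.Queue) -> int:
    handled = 0
    while True:
        item = await queue.get()
        if item is None:
            break
        handled += 1
    return handled

async def main() -> int:
    queue = asyncio.Queue(maxsize=4)  # small buffer forces backpressure
    _, handled = await asyncio.gather(producer(queue, 20), consumer(queue))
    return handled

print(asyncio.run(main()))  # 20
```

Under high concurrency, the same idea applies per connection: cap in-flight messages so a stalled WebSocket peer slows its sender rather than exhausting memory.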