OpenAI Launches WebSocket Mode for Responses API to Reduce Latency by 40%

Rohan Chaubey

3mo ago · 1 min read · en · Product

Score: 38 · Type: press release · Sentiment: positive

Summary

OpenAI has introduced WebSocket Mode for its Responses API, which keeps a persistent connection open to cut end-to-end latency by up to 40% on agent workflows with heavy tool calling. Instead of resending the full context on every agent turn, the client sends only that turn's incremental inputs.
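To see why resending the full context "compounds", here is a toy model in Python. It is only a sketch: it counts how many context items go over the wire and does not call the OpenAI API. The function names and the one-item-per-turn workload are assumptions made up for illustration, and item counts are a rough proxy for overhead, not a measurement of latency.

```python
# Toy model only: counts how many context items cross the wire.
# It does not call the OpenAI API; all names here are hypothetical.

def items_sent_full_resend(turn_inputs):
    """Each turn resends every earlier input plus the new ones."""
    sent = 0
    history = []
    for new_items in turn_inputs:
        history.extend(new_items)
        sent += len(history)  # the whole context goes out again
    return sent


def items_sent_incremental(turn_inputs):
    """Each turn sends only that turn's new inputs."""
    return sum(len(new_items) for new_items in turn_inputs)


# Assumed workload: ten agent turns, each adding one tool-call result.
turns = [[f"tool_result_{i}"] for i in range(10)]
full = items_sent_full_resend(turns)
incremental = items_sent_incremental(turns)
print(full, incremental)  # 55 10
```

In this model the full-resend total grows quadratically with the number of turns (1 + 2 + ... + 10 = 55) while the incremental total grows linearly (10). The "up to 40%" figure comes from the announcement, not from this sketch.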

Key quotes

Every agent turn, you're resending the full context. Again. That overhead compounds fast.

WebSocket Mode for the Responses API keeps a persistent connection, sends only incremental inputs, and cuts end-to-end latency by up to 40% on heavy tool-call workflows.
