Introduction to OneBudd STS

    Learn about OneBudd's Speech-to-Speech infrastructure and how it powers real-time voice AI applications.

    What is OneBudd STS?

    OneBudd STS (Speech-to-Speech) is a unified voice AI infrastructure platform that enables real-time, bidirectional voice conversations with AI. Our platform orchestrates the complete voice pipeline — from speech recognition through LLM processing to natural speech synthesis — all through a single WebSocket connection.

    Whether you're building a customer service agent, a voice assistant, or an interactive AI companion, OneBudd STS provides the low-latency infrastructure to make conversations feel natural and responsive.
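    The pipeline described above (speech recognition, then LLM processing, then speech synthesis) can be pictured as three stages composed in sequence. The sketch below is illustrative only: the stage functions are stand-ins, not the OneBudd SDK.

```python
# Illustrative sketch of the STS pipeline; these functions are
# stand-ins for the real stages, not part of the OneBudd SDK.

def speech_to_text(audio_chunk: bytes) -> str:
    """Stand-in STT stage: transcribe an audio chunk to text."""
    return "hello there"  # a real system would run ASR here

def llm_respond(transcript: str) -> str:
    """Stand-in LLM stage: generate a reply to the transcript."""
    return f"You said: {transcript}"

def text_to_speech(reply: str) -> bytes:
    """Stand-in TTS stage: synthesize the reply as audio bytes."""
    return reply.encode("utf-8")  # placeholder for synthesized audio

def run_pipeline(audio_chunk: bytes) -> bytes:
    """STT -> LLM -> TTS: the flow a single WebSocket session carries."""
    transcript = speech_to_text(audio_chunk)
    reply = llm_respond(transcript)
    return text_to_speech(reply)

print(run_pipeline(b"\x00\x01"))  # → b'You said: hello there'
```

    In production the stages stream into one another rather than running strictly in sequence, which is what makes the latency targets below achievable.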

    Key Features

    Ultra-Low Latency

    Sub-500ms end-to-end response times for natural, real-time conversations.

    Multi-Language

    Support for multiple languages with automatic detection and switching.

    Barge-In Support

    Users can interrupt the AI mid-response for natural turn-taking.

    Streaming TTS

    Real-time audio streaming with immediate playback as speech is generated.

    LLM Orchestration

    Built-in integration with leading language models for intelligent responses.

    Multi-SDK Support

    Official SDKs for JavaScript, Python, and Go with consistent APIs.
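    Barge-in, for example, means the client must stop playback the moment user speech is detected mid-response. A toy simulation of that turn-taking logic (the event name `user_speech` is invented here for illustration):

```python
# Toy simulation of barge-in: playback of the AI response stops as
# soon as a user speech event arrives. Event names are illustrative.

def play_response(chunks, events):
    """Play audio chunks, stopping if a 'user_speech' event interrupts.

    `chunks` and `events` are parallel lists: events[i] arrives while
    chunks[i] is about to play. Returns the chunks actually played.
    """
    played = []
    for chunk, event in zip(chunks, events):
        if event == "user_speech":   # user barged in: stop immediately
            break
        played.append(chunk)
    return played

chunks = ["chunk-1", "chunk-2", "chunk-3", "chunk-4"]
events = [None, None, "user_speech", None]
print(play_response(chunks, events))  # → ['chunk-1', 'chunk-2']
```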

    Architecture Overview

    OneBudd STS uses a WebSocket-based architecture for real-time bidirectional communication:

    Client (Browser/App)
          ↓ WebSocket ↑
    OneBudd STS Server
      STT → LLM → TTS
          ↓
    Audio Response
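    A client session over that WebSocket might look like the sketch below, written against the third-party `websockets` package. The endpoint URL and the session-start message schema are assumptions made for illustration; consult the SDK reference for the real protocol.

```python
import asyncio
import json

def build_session_config(language: str = "en", sample_rate: int = 16000) -> str:
    """Build an illustrative session-start message.

    This schema is assumed for the sketch, not the documented
    OneBudd protocol.
    """
    return json.dumps({
        "type": "session.start",
        "language": language,
        "audio": {"encoding": "pcm16", "sample_rate": sample_rate},
    })

async def run_session(url: str, audio_chunks) -> None:
    """Open the WebSocket, send config and audio upstream, and read
    downstream messages (synthesized audio and events) as they arrive."""
    import websockets  # third-party: pip install websockets

    async with websockets.connect(url) as ws:
        await ws.send(build_session_config())
        for chunk in audio_chunks:
            await ws.send(chunk)       # upstream: raw audio frames
        async for message in ws:       # downstream: audio / event messages
            print("received", len(message), "bytes")

# Usage (hypothetical endpoint; take the real URL from your dashboard):
#   asyncio.run(run_session("wss://sts.onebudd.example/v1/stream", chunks))
```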

    Latency Targets

    • STT Processing: ~100-150ms per audio chunk
    • LLM Response: ~200-300ms to first token
    • TTS Generation: ~100-150ms to first audio
    • Total End-to-End: <500ms for natural conversation flow
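    As a quick sanity check, the per-stage targets can be summed: the best cases fit comfortably under the 500ms goal, while the worst cases only do if stages overlap (for example, TTS starting on the LLM's first tokens rather than waiting for the full response).

```python
# Stage latency ranges from the targets above, in milliseconds.
stages = {
    "stt_chunk": (100, 150),        # STT processing per audio chunk
    "llm_first_token": (200, 300),  # LLM time to first token
    "tts_first_audio": (100, 150),  # TTS time to first audio
}

best = sum(lo for lo, hi in stages.values())
worst = sum(hi for lo, hi in stages.values())
print(best, worst)  # → 400 600
```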

    Next Steps

    Ready to get started? Follow our Quick Start Guide to set up OneBudd STS and create your first voice AI application in minutes.