Introduction to OneBudd STS

    Learn about OneBudd's Speech-to-Speech infrastructure and how it powers real-time voice AI applications.

    What is OneBudd STS?

    OneBudd STS (Speech-to-Speech) is a unified voice AI infrastructure platform that enables real-time, bidirectional voice conversations with AI. Our platform orchestrates the complete voice pipeline — from speech recognition through LLM processing to natural speech synthesis — all through a single WebSocket connection.

    Whether you're building a customer service agent, a voice assistant, or an interactive AI companion, OneBudd STS provides the low-latency infrastructure to make conversations feel natural and responsive.
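    The pipeline described above (speech recognition, then LLM processing, then speech synthesis) can be pictured as three stages composed in sequence. The sketch below is illustrative only: the stage functions are stand-ins, not the OneBudd SDK.

```python
# Illustrative sketch of the STS pipeline; these functions are
# stand-ins for the real stages, not part of the OneBudd SDK.

def speech_to_text(audio_chunk: bytes) -> str:
    """Stand-in STT stage: transcribe an audio chunk to text."""
    return "hello there"  # a real system would run ASR here

def llm_respond(transcript: str) -> str:
    """Stand-in LLM stage: generate a reply to the transcript."""
    return f"You said: {transcript}"

def text_to_speech(reply: str) -> bytes:
    """Stand-in TTS stage: synthesize the reply as audio bytes."""
    return reply.encode("utf-8")  # placeholder for synthesized audio

def run_pipeline(audio_chunk: bytes) -> bytes:
    """STT -> LLM -> TTS: the flow a single WebSocket session carries."""
    transcript = speech_to_text(audio_chunk)
    reply = llm_respond(transcript)
    return text_to_speech(reply)

print(run_pipeline(b"\x00\x01"))  # → b'You said: hello there'
```

    In production the stages stream into one another rather than running strictly in sequence, which is what makes the latency targets below achievable.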

    Key Features

    Ultra-Low Latency

    Sub-500ms end-to-end response times for natural, real-time conversations.

    Multi-Language

    Support for multiple languages with automatic detection and switching.

    Barge-In Support

    Users can interrupt the AI mid-response for natural turn-taking.

    Streaming TTS

    Real-time audio streaming with immediate playback as speech is generated.

    LLM Orchestration

    Built-in integration with leading language models for intelligent responses.

    Multi-SDK Support

    Official SDKs for JavaScript, Python, and Go with consistent APIs.
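    Barge-in, for example, means the client must stop playback the moment user speech is detected mid-response. A toy simulation of that turn-taking logic (the event name `user_speech` is invented here for illustration):

```python
# Toy simulation of barge-in: playback of the AI response stops as
# soon as a user speech event arrives. Event names are illustrative.

def play_response(chunks, events):
    """Play audio chunks, stopping if a 'user_speech' event interrupts.

    `chunks` and `events` are parallel lists: events[i] arrives while
    chunks[i] is about to play. Returns the chunks actually played.
    """
    played = []
    for chunk, event in zip(chunks, events):
        if event == "user_speech":   # user barged in: stop immediately
            break
        played.append(chunk)
    return played

chunks = ["chunk-1", "chunk-2", "chunk-3", "chunk-4"]
events = [None, None, "user_speech", None]
print(play_response(chunks, events))  # → ['chunk-1', 'chunk-2']
```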

    Architecture Overview

    OneBudd STS uses a WebSocket-based architecture for real-time bidirectional communication:

    Client (Browser/App)
          ↓ WebSocket ↑
    OneBudd STS Server
      STT → LLM → TTS
          ↓
    Audio Response
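    A client session over that WebSocket might look like the sketch below, written against the third-party `websockets` package. The endpoint URL and the session-start message schema are assumptions made for illustration; consult the SDK reference for the real protocol.

```python
import asyncio
import json

def build_session_config(language: str = "en", sample_rate: int = 16000) -> str:
    """Build an illustrative session-start message.

    This schema is assumed for the sketch, not the documented
    OneBudd protocol.
    """
    return json.dumps({
        "type": "session.start",
        "language": language,
        "audio": {"encoding": "pcm16", "sample_rate": sample_rate},
    })

async def run_session(url: str, audio_chunks) -> None:
    """Open the WebSocket, send config and audio upstream, and read
    downstream messages (synthesized audio and events) as they arrive."""
    import websockets  # third-party: pip install websockets

    async with websockets.connect(url) as ws:
        await ws.send(build_session_config())
        for chunk in audio_chunks:
            await ws.send(chunk)       # upstream: raw audio frames
        async for message in ws:       # downstream: audio / event messages
            print("received", len(message), "bytes")

# Usage (hypothetical endpoint; take the real URL from your dashboard):
#   asyncio.run(run_session("wss://sts.onebudd.example/v1/stream", chunks))
```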

    Latency Targets

    • STT Processing: ~100-150ms per audio chunk
    • LLM Response: ~200-300ms to first token
    • TTS Generation: ~100-150ms to first audio
    • Total End-to-End: <500ms for natural conversation flow
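    As a quick sanity check, the per-stage targets can be summed: the best cases fit comfortably under the 500ms goal, while the worst cases only do if stages overlap (for example, TTS starting on the LLM's first tokens rather than waiting for the full response).

```python
# Stage latency ranges from the targets above, in milliseconds.
stages = {
    "stt_chunk": (100, 150),        # STT processing per audio chunk
    "llm_first_token": (200, 300),  # LLM time to first token
    "tts_first_audio": (100, 150),  # TTS time to first audio
}

best = sum(lo for lo, hi in stages.values())
worst = sum(hi for lo, hi in stages.values())
print(best, worst)  # → 400 600
```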

    Next Steps

    Ready to get started? Follow our Quick Start Guide to set up OneBudd STS and create your first voice AI application in minutes.