I built a sub-500ms latency voice agent from scratch
Outperformed Vapi 2× on latency by treating voice as turn-taking, not transcription.

Multi-speaker voice model with natural interruption handling and barge-in prevention, genuinely different from turn-taking chatbots.
Enterprise meeting platforms, customer service teams, sales call centers
ChatGPT Realtime · Google Gemini Live · Recall.ai
Replicates Thinking Machines' multimodal demo on a CPU laptop with commodity models.
The repo actually solves the messy plumbing of live voice agents: modular ASR→LLM→TTS adapters plus an optional PersonaPlex speech-to-speech path, per-agent env overrides, and a Playwright-driven Jitsi bot for room joining. It's a useful MVP for anyone prototyping AI co-hosts, though mixing backends is still manual and PersonaPlex demands extra infra, so it's more pragmatic experiment than turnkey product.
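As a rough illustration of what modular ASR→LLM→TTS adapters with per-agent env overrides could look like, here is a minimal sketch. It is not taken from the repo: the `AGENT_<NAME>_<KEY>` variable naming, the function names, and the stub backends are all assumptions made for the example.

```python
import os
from typing import Callable, Mapping

# Hypothetical stage signatures; the repo's real adapter interfaces may differ.
ASR = Callable[[bytes], str]   # audio in, transcript out
LLM = Callable[[str], str]     # transcript in, reply text out
TTS = Callable[[str], bytes]   # reply text in, audio out


def agent_setting(agent: str, key: str, default: str,
                  env: Mapping[str, str] = os.environ) -> str:
    """Resolve a setting: AGENT_<NAME>_<KEY> first, then global <KEY>, then default.

    The AGENT_<NAME>_<KEY> convention is an assumption for this sketch.
    """
    per_agent_key = f"AGENT_{agent.upper()}_{key}"
    return env.get(per_agent_key, env.get(key, default))


def build_pipeline(asr: ASR, llm: LLM, tts: TTS) -> Callable[[bytes], bytes]:
    """Compose three swappable stages into one audio-in, audio-out turn handler."""
    def handle_turn(audio: bytes) -> bytes:
        return tts(llm(asr(audio)))
    return handle_turn


# Stub backends so the sketch runs without any speech or language model installed.
def echo_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def upper_llm(text: str) -> str:
    return text.upper()


def bytes_tts(text: str) -> bytes:
    return text.encode("utf-8")
```

Swapping a backend then means passing a different callable to `build_pipeline`, while `agent_setting` lets one agent in a room use a different backend name than the global default.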
Another AI sales researcher when Clay and Apollo already dominate this space.
Voice-controlled desktop agent with memory and real GUI control, but needs more examples than a warning.
Yet another website chatbot, but with a mascot skin and voice.