GitHub Repository

An open-source AI Voice Agent that integrates with Asterisk/FreePBX using AudioSocket/RTP technology

1,063 stars · Python

Ava – AI Voice Agent for Traditional Phone Systems (Python + Asterisk/ARI)

Author: hkjarral

by hkjarral·Mar 12, 2026·7 points·1 comment


AI Analysis

●● Solid · Niche Gem · Big Brain

Adds AI voice to legacy Asterisk systems without ripping out existing telephony.

Strengths

•Modular pipeline architecture lets you mix STT, LLM, and TTS providers freely.
•Five production-ready golden baselines validated for enterprise deployment.
•987 GitHub stars with Docker setup and admin UI running in 2 minutes.

Weaknesses

•Requires existing Asterisk/FreePBX infrastructure, limiting total addressable market.
•AI voice agent space increasingly crowded with Retell, Vapi, Bland AI.

Post Description

Hi HN, I'm the creator of AVA, an AI voice agent for Asterisk.

My repo was shared here once before by someone else, so I wanted to follow up on the progress since then:

https://news.ycombinator.com/item?id=46380399

I've been working with Asterisk/FreePBX systems for years. I wanted to add AI voice capabilities to legacy phone systems without paying per-minute SaaS fees or ripping out the entire telephony stack.

So I built AVA, a self-hosted AI voice agent that can integrate into any traditional phone system. Most solutions demand an expensive migration to a cloud-only provider; AVA instead connects AI agents to the phone system you already have, which keeps call data on your own infrastructure and lowers operating costs.

AVA is a Dockerized Python app that sits alongside your Asterisk server. It connects via ARI (Asterisk REST Interface) and routes call audio to AI providers — OpenAI Realtime, Deepgram, Google Live API, ElevenLabs, Telnyx, or fully local models (Vosk + llama.cpp + Piper). You can mix and match STT/LLM/TTS in a modular pipeline, or use a single provider end-to-end.
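To illustrate the mix-and-match idea, here is a minimal sketch of a pipeline with swappable stages. The `SpeechToText`, `LanguageModel`, `TextToSpeech`, and `Pipeline` names, and the toy stand-in providers, are hypothetical and not taken from the AVA codebase; a real adapter would wrap Deepgram, llama.cpp, Piper, and so on behind the same methods.

```python
from dataclasses import dataclass
from typing import Protocol


class SpeechToText(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LanguageModel(Protocol):
    def reply(self, text: str) -> str: ...


class TextToSpeech(Protocol):
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class Pipeline:
    """One conversational turn: caller audio in, agent audio out."""
    stt: SpeechToText
    llm: LanguageModel
    tts: TextToSpeech

    def handle_turn(self, caller_audio: bytes) -> bytes:
        transcript = self.stt.transcribe(caller_audio)
        answer = self.llm.reply(transcript)
        return self.tts.synthesize(answer)


# Toy stand-ins so the sketch runs without any external service.
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("ascii")


class UpperLLM:
    def reply(self, text: str) -> str:
        return text.upper()


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("ascii")


pipeline = Pipeline(stt=EchoSTT(), llm=UpperLLM(), tts=BytesTTS())
agent_audio = pipeline.handle_turn(b"hello")
```

Because each stage only has to satisfy a small interface, swapping one provider does not touch the other two. A single end-to-end provider would bypass this class entirely.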

Two audio transport paths: We support both AudioSocket (low-latency TCP with TLV framing) and ExternalMedia RTP (UDP, better for NAT). A transport orchestrator auto-negotiates sample rates and codecs between what Asterisk sends on the wire and what each AI provider expects — so you can run 8kHz ulaw from Asterisk into a provider that wants 24kHz linear16 without manual config.
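As a rough illustration of the 8kHz ulaw to 24kHz linear16 case, the sketch below decodes G.711 mu-law bytes to 16-bit PCM and upsamples by a factor of three using plain linear interpolation. This is an assumption-laden toy, not the code in transport_orchestrator.py: the function names are hypothetical, and a production resampler would normally apply proper low-pass filtering rather than interpolating between neighbours.

```python
import struct


def ulaw_byte_to_pcm16(u: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit linear sample."""
    u = ~u & 0xFF                       # mu-law bytes are stored inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def upsample_linear(samples: list, factor: int) -> list:
    """Naive upsampling: insert linearly interpolated points between samples."""
    if not samples:
        return []
    out = []
    # Pair each sample with its successor; the last sample is paired with itself.
    for cur, nxt in zip(samples, samples[1:] + samples[-1:]):
        for k in range(factor):
            out.append(cur + (nxt - cur) * k // factor)
    return out


def ulaw8k_to_linear16_24k(payload: bytes) -> bytes:
    """Convert 8 kHz mu-law audio to 24 kHz little-endian linear16."""
    pcm = [ulaw_byte_to_pcm16(b) for b in payload]
    up = upsample_linear(pcm, 3)        # 24000 / 8000 = 3
    return struct.pack(f"<{len(up)}h", *up)


# A 20 ms frame at 8 kHz is 160 mu-law bytes; 0xFF is mu-law silence.
frame_24k = ulaw8k_to_linear16_24k(bytes([0xFF]) * 160)
```

A 160-byte input frame becomes 480 samples, or 960 bytes of linear16, which is the kind of size bookkeeping an orchestrator has to get right on both the inbound and outbound legs.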

Session lifecycle: A typed session store tracks every call from StasisStart through hangup — audio diagnostics, barge-in counts, provider state, conversation turns. Every call is fully observable and debuggable after the fact.
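A typed session store along these lines could look like the sketch below. The `CallSession` fields and `SessionStore` method names are hypothetical and chosen only to mirror the items listed above; the real store tracks more state (audio diagnostics, provider state) than this example does.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class CallSession:
    channel_id: str
    provider: str
    started_at: float = field(default_factory=time.monotonic)
    ended_at: Optional[float] = None
    barge_in_count: int = 0
    # Each turn is (speaker, text), e.g. ("caller", "hello").
    turns: List[Tuple[str, str]] = field(default_factory=list)


class SessionStore:
    """Keeps sessions after hangup so calls can be inspected later."""

    def __init__(self) -> None:
        self._sessions: Dict[str, CallSession] = {}

    def on_stasis_start(self, channel_id: str, provider: str) -> CallSession:
        session = CallSession(channel_id=channel_id, provider=provider)
        self._sessions[channel_id] = session
        return session

    def record_turn(self, channel_id: str, speaker: str, text: str) -> None:
        self._sessions[channel_id].turns.append((speaker, text))

    def record_barge_in(self, channel_id: str) -> None:
        self._sessions[channel_id].barge_in_count += 1

    def on_hangup(self, channel_id: str) -> CallSession:
        session = self._sessions[channel_id]
        session.ended_at = time.monotonic()
        return session

    def get(self, channel_id: str) -> Optional[CallSession]:
        return self._sessions.get(channel_id)


store = SessionStore()
store.on_stasis_start("chan-1", provider="deepgram")
store.record_turn("chan-1", "caller", "hi")
store.record_barge_in("chan-1")
finished = store.on_hangup("chan-1")
```

Keeping the finished session in the store, rather than deleting it on hangup, is what makes after-the-fact debugging possible in this sketch; a long-running service would also need an eviction policy.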

Barge-in and VAD were the hardest problems. We use a dual-mode VAD — WebRTC VAD combined with energy-based RMS detection, scored into a single confidence value (40% WebRTC weight, 40% energy ratio, 20% agreement bonus). Frame smoothing prevents single-frame glitches from triggering false interrupts. When barge-in fires, we kill active playback (both streaming and file-based) via ARI, flush provider audio buffers, release conversation gating tokens, and optionally suppress provider output for a configurable window to prevent pre-barge audio from re-queuing. The system supports three interrupt sources: local VAD, Asterisk's native talk detection events, and provider-side interruption signals.
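The 40/40/20 scoring and the frame smoothing can be sketched roughly as follows. Only the three weights come from the description above; everything else is an assumption made for illustration: the energy ratio is taken as frame RMS over a hypothetical `speech_rms` reference clamped to 1, the agreement bonus is granted when both detectors indicate speech, and smoothing is modelled as requiring several consecutive frames above a threshold. The WebRTC decision is passed in as a boolean so the example needs no third-party package.

```python
import math
import struct
from collections import deque


def rms_pcm16(frame: bytes) -> float:
    """Root-mean-square level of a little-endian 16-bit PCM frame."""
    n = len(frame) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", frame[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)


def vad_confidence(webrtc_speech: bool, rms: float, speech_rms: float) -> float:
    """Blend both detectors: 40% WebRTC, 40% energy ratio, 20% agreement."""
    energy_score = min(1.0, rms / speech_rms) if speech_rms > 0 else 0.0
    # Assumed definition: both detectors must lean towards speech.
    agree = 1.0 if (webrtc_speech and energy_score >= 0.5) else 0.0
    return 0.4 * float(webrtc_speech) + 0.4 * energy_score + 0.2 * agree


class FrameSmoother:
    """Fire only after `window` consecutive frames reach `threshold`."""

    def __init__(self, window: int = 3, threshold: float = 0.6) -> None:
        self.threshold = threshold
        self._recent = deque(maxlen=window)

    def update(self, confidence: float) -> bool:
        self._recent.append(confidence >= self.threshold)
        return len(self._recent) == self._recent.maxlen and all(self._recent)


smoother = FrameSmoother(window=3, threshold=0.6)
# A single low-confidence frame in the middle delays the trigger.
decisions = [smoother.update(c) for c in (0.9, 0.1, 0.9, 0.9, 0.9)]
```

With a window of three, one glitchy frame cannot trigger an interrupt on its own, at the cost of a few frames of added detection delay; that trade-off is what the window size tunes.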

The hardest latency challenge was bridging legacy SIP/RTP with modern WebSocket streams. We use a two-container architecture: a lightweight orchestrator for ARI state management and an optional heavier container for local model inference. There are 6 pre-validated golden baseline configs if you just want something working out of the box, plus an Admin UI for visual setup.

Try the live demo: (925)-736-6718. Choose option 5 for Google, 6 for Deepgram, 7 for OpenAI Realtime, 8 for local hybrid, and 9 for ElevenLabs.

Code is MIT. I'd love feedback on the transport layer (src/core/transport_orchestrator.py) and the VAD tuning (src/core/vad_manager.py).

Similar Projects

Developer Tools · ●● Solid

Scitex-notification – Give AI agents a voice: TTS, phone calls, SMS

Seven unanswered audio alerts trigger a phone call — works through iPhone Silent Mode.

Ship It · Niche Gem

ywatanabe1989

113mo ago

Security · ●● Solid

VoiceGoat – A vulnerable voice agent for practicing LLM attacks

CTF-style flags for voice prompt injection make learning LLM security actually fun.

Niche Gem · Rabbit Hole

xmhatx

1411mo ago

AI/ML · ●●● Banger

Sanna – OpenClaw for your phone. Open-source voice AI agent for Android

Voice agent that actually reads WhatsApp and controls Android—OpenClaw for your pocket.

Zero to One · Solve My Problem · Ship It

sannabot

103mo ago

AI/ML · ● Mid

WordPress for Voice Agents – Unpod.ai

Voice agent orchestration with no-code studio, but orchestrates off-the-shelf APIs like everyone else.

Ship It

parvbhullar

11103mo ago

AI/ML · ●●● Banger

Cheap-IM – CPU-only voice agent approximating Thinking Machines' demo

Runs real-time vision-keyed voice agents on a laptop CPU without custom silicon or training.

Big Brain · Wizardry · Dark Horse

mrkn1

4028d ago

SaaS · ● Mid

Talentpluto, a voice AI agent connecting GTM talent and startups

Polished product, but recruiter bots and warm intro networks already solve this.

Slick · Crowd Pleaser

pipervw

203mo ago