Nexus Gateway – Reduce LLM API Costs Using Semantic Caching

by Sunnyanand_dev · Mar 5, 2026 · 2 points · 1 comment

AI Analysis

Mid · Ship It · Solve My Problem

Semantic caching for LLM APIs already exists (Anthropic prompt caching, LangChain, Miniplex, vLLM); gateway routing is table stakes.

Strengths
  • Multi-provider BYOK (Bring Your Own Key) removes vendor lock-in—genuine customer control
  • Vector-based semantic cache with configurable thresholds is a sound technical approach; claims 40–70% cost reduction
  • Sub-millisecond routing overhead and SOC2 Type II certification signal production-readiness
Weaknesses
  • Semantic caching itself is not novel—Anthropic's native prompt caching, vLLM, and smaller competitors (Miniplex, Helicone) already ship this
  • 'Universal router' for 200+ models sounds marketing-first; actual routing logic and failover strategy undefined in public docs
Target Audience

Application developers using multiple LLM providers, cost-conscious AI teams, infrastructure engineers

Similar To

Anthropic prompt caching · vLLM · Helicone

Post Description

Hi HN,

I'm building Nexus Gateway, an AI gateway that helps developers reduce LLM API costs.

Problem: Many applications send repeated or semantically similar prompts to LLMs, which leads to unnecessary API calls and higher costs.

Solution: Nexus Gateway uses semantic caching to detect similar prompts and serve cached responses instead of calling the LLM again.
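
To make the idea concrete, here is a minimal sketch of a semantic cache lookup. It is not Nexus Gateway's implementation, which the post does not describe. The names `toy_embed`, `SemanticCache`, and `cached_completion` are hypothetical, the bag-of-words vector stands in for a real embedding model, and the 0.8 threshold is an arbitrary illustrative value.

```python
import math
import re
from collections import Counter
from typing import Callable, Optional


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical cache: linear scan over stored prompt vectors."""

    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold  # minimum similarity that counts as a hit
        self.entries: list[tuple[Counter, str]] = []

    def lookup(self, prompt: str) -> Optional[str]:
        """Return the response of the most similar stored prompt, or None."""
        query = toy_embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def store(self, prompt: str, response: str) -> None:
        self.entries.append((toy_embed(prompt), response))


def cached_completion(
    cache: SemanticCache, prompt: str, call_llm: Callable[[str], str]
) -> str:
    """Serve from cache on a hit; otherwise call the LLM and store the result."""
    hit = cache.lookup(prompt)
    if hit is not None:
        return hit
    response = call_llm(prompt)
    cache.store(prompt, response)
    return response
```

In this sketch, `call_llm` is whatever function sends the prompt to a provider. A production system would presumably replace the linear scan with a vector index and tune the threshold: set it too low and users receive answers to questions they did not ask.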

Features:
  • Semantic caching to reduce repeated API calls
  • Multi-model support (OpenAI, Gemini, Llama, Anthropic)
  • BYOK support
  • PII protection and sovereign AI layer (in progress)

Goal: Reduce LLM costs by 40–70% while improving latency.
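
As a rough back-of-the-envelope check on that goal, the sketch below estimates savings from a cache hit rate. It assumes every request pays a fixed lookup cost and only misses pay the LLM cost; the function name `estimated_savings` and the per-request cost model are illustrative assumptions, not figures from Nexus Gateway.

```python
def estimated_savings(
    hit_rate: float, llm_cost: float, lookup_cost: float = 0.0
) -> float:
    """Fraction of spend saved when `hit_rate` of requests are served from cache.

    Assumed model: every request pays `lookup_cost`; only misses pay `llm_cost`.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    if llm_cost <= 0:
        raise ValueError("llm_cost must be positive")
    baseline = llm_cost
    with_cache = lookup_cost + (1.0 - hit_rate) * llm_cost
    return (baseline - with_cache) / baseline
```

Under this model the savings equal the hit rate minus the ratio of lookup cost to LLM cost, so a 40–70% reduction requires a cache hit rate of at least roughly 40–70% of traffic.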

I’d really appreciate feedback from the community.

Website: https://www.nexus-gateway.org

Similar Projects

Infrastructure · ●● Solid

LLM-Gateway – Zero-Trust LLM Gateway

Zero-trust networking via zrok beats LiteLLM when your GPUs sit behind NAT.

Big Brain · Solve My Problem
michaelquigley
7 · 12mo ago