Back to browse
GitHub Repository

Unified management and routing for llama.cpp, MLX and vLLM models with web dashboard.

130 starsGo

Llamactl – Self-hosted LLM manager with OpenAI-compatible routing

by lordmathis·Mar 17, 2026·2 points·0 comments

AI Analysis

●●SolidSolve My ProblemNiche Gem

Multi-backend LLM manager when Ollama and LM Studio already handle this.

Strengths
  • Single dashboard controls llama.cpp, MLX, and vLLM instances across different hardware
  • Built-in HuggingFace downloader with in-browser preset configuration editor
  • Multi-node support distributes inference workloads across multiple hosts
Weaknesses
  • Crowded category with Ollama, LM Studio, and Continue offering similar management
  • Requires manual backend installation before llamactl can manage anything
Category
Target Audience

Developers running self-hosted LLM infrastructure

Similar To

Ollama · LM Studio · LocalAI

Post Description

Llamactl is a unified management system for running local LLMs across llama.cpp, MLX, and vLLM backends, with a web dashboard and OpenAI-compatible API.

I originally built this because I got tired of constantly SSHing to my server to edit a config just try out a new model. It's grown a lot since then.

What it does:

Web UI for creating and managing LLM instances from your browser

Full llama.cpp model lifecycle - download from HuggingFace, create preset.ini configs with an in-browser editor, load/unload models via router mode

Automatic idle timeout, LRU eviction, and instance limits

llama.cpp, mlx_lm and vllm backends

OpenAI and Anthropic API compatible endpoints (backend-dependent)

Multi-node support for distributing instances across hosts

Inference API keys with per-instance access control

docs: https://llamactl.org/stable/

Similar Projects

Infrastructure●●Solid

LLM-Gateway – Zero-Trust LLM Gateway

Zero-trust networking via zrok beats LiteLLM when your GPUs sit behind NAT.

Big BrainSolve My Problem
michaelquigley
712mo ago
AI/MLMid

Mega LLMs – Universal AI chat client for any OpenAI-compatible API

Plug any OpenAI-compatible provider into a single UI, switch models mid-session, and run side-by-side comparisons while tracking usage — everything you'd expect from a multi-model chat client. The design is eye-catching and the web/desktop split suggests a real app, but this is a crowded niche; the product will live or die on stability of provider integrations, context/memory handling, and clear privacy controls.

SlickSolve My Problem
p32929
204mo ago