GitHub Repository

Self-hosted mem0 MCP server for Claude Code. Run a complete memory server against self-hosted Qdrant + Neo4j + Ollama while using Claude as the main LLM.

93 stars · Python

Persistent memory for Claude Code with self-hosted Qdrant and Ollama

by elvismdev·Feb 17, 2026·8 points·0 comments

AI Analysis

●● Solid · Niche Gem · Solve My Problem · Big Brain

Claude Code memory server that auto-reads OAT tokens and routes LLM ops to local or cloud models.

Strengths
  • Zero-config auth: auto-detects and reads Claude's OAT token, eliminating manual API key setup friction.
  • Smart model routing: splits entity extraction to Gemini (85.4% accurate) and contradiction detection to Claude (100%), minimizing cost.
  • Well-documented setup with global and per-project MCP installation, clear integration instructions via CLAUDE.md.
Weaknesses
  • Solves a real but narrow problem: only useful to Claude Code users; doesn't apply to Cursor, Windsurf, or other IDEs.
  • High operational overhead: requires running Qdrant, Ollama, and optionally Neo4j; introduces significant deployment complexity.
Target Audience

Claude Code users who want session-persistent memory; developers running self-hosted AI infrastructure

Similar To

mem0ai (the library being wrapped) · Continue.dev context caching · Aider session context management

Post Description

I built an MCP server that gives Claude Code long-term memory across sessions, backed by infrastructure you control.

Every Claude Code session starts from zero, with no memory of previous sessions. This server uses mem0ai as a library and exposes 11 MCP tools for storing, searching, and managing memories. Qdrant handles vector storage, Ollama runs embeddings locally (bge-m3), and Neo4j optionally builds a knowledge graph.
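To make the stack above concrete, here is a minimal sketch of how the three backends might be wired into a mem0-style config dict. This is not taken from the repository: the helper name `build_mem0_config`, the collection name, and the default hosts and ports are assumptions for illustration, and the LLM section is left out entirely. The only firm detail is that bge-m3 produces 1024-dimensional dense vectors, so the Qdrant collection has to match.

```python
from typing import Any, Optional


def build_mem0_config(
    qdrant_host: str = "localhost",
    qdrant_port: int = 6333,
    ollama_url: str = "http://localhost:11434",
    neo4j_url: Optional[str] = None,
    neo4j_user: str = "neo4j",
    neo4j_password: str = "",
) -> dict[str, Any]:
    """Build a config dict for Qdrant + Ollama, with Neo4j only if a URL is given.

    Hypothetical helper: the key names follow mem0's documented config layout
    as I understand it, but check them against the mem0 version you install.
    """
    config: dict[str, Any] = {
        "vector_store": {
            "provider": "qdrant",
            "config": {
                "host": qdrant_host,
                "port": qdrant_port,
                "collection_name": "claude_code_memories",  # assumed name
                "embedding_model_dims": 1024,  # bge-m3 dense vector size
            },
        },
        "embedder": {
            "provider": "ollama",
            "config": {"model": "bge-m3", "ollama_base_url": ollama_url},
        },
    }
    # The knowledge graph is optional, so only add it when configured.
    if neo4j_url is not None:
        config["graph_store"] = {
            "provider": "neo4j",
            "config": {
                "url": neo4j_url,
                "username": neo4j_user,
                "password": neo4j_password,
            },
        }
    return config


# Assumed usage with mem0 installed (an LLM section would also be needed):
#   from mem0 import Memory
#   memory = Memory.from_config(build_mem0_config())
```

Keeping the graph store behind an explicit URL mirrors the "optionally" in the description: a vector-only setup needs just Qdrant and Ollama running.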

Some engineering details HN might find interesting:

- Zero-config auth: auto-reads Claude Code's OAT token from ~/.claude/.credentials.json, detects token type (OAT vs API key), and configures the SDK accordingly. No separate API key needed.
- Graph LLM ops (3 calls per add_memory) can be routed to Ollama (free/local), Gemini 2.5 Flash Lite (near-free), or a split-model where Gemini handles entity extraction (85.4% accuracy) and Claude handles contradiction detection (100% accuracy).
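A rough sketch of both mechanisms, under stated assumptions rather than as the project's actual code: the token prefixes, the JSON layout of the credentials file, the operation names, the mode names, and the fallback for operations the post does not mention are all guesses made for illustration.

```python
import json
from pathlib import Path
from typing import Optional

# Assumed prefixes; verify against the tokens you actually hold.
OAT_PREFIX = "sk-ant-oat"
API_KEY_PREFIX = "sk-ant-api"


def detect_token_type(token: str) -> str:
    """Classify a credential as "oat", "api_key", or "unknown" by its prefix."""
    if token.startswith(OAT_PREFIX):
        return "oat"
    if token.startswith(API_KEY_PREFIX):
        return "api_key"
    return "unknown"


def read_access_token(path: Path) -> Optional[str]:
    """Return the access token from a credentials file, or None if unavailable.

    Assumed layout: {"claudeAiOauth": {"accessToken": "..."}}.
    """
    try:
        data = json.loads(path.read_text())
    except (OSError, json.JSONDecodeError):
        return None
    if not isinstance(data, dict):
        return None
    oauth = data.get("claudeAiOauth")
    if not isinstance(oauth, dict):
        return None
    token = oauth.get("accessToken")
    return token if isinstance(token, str) else None


# Split mode as described in the post: Gemini extracts entities,
# Claude detects contradictions. Operation names are hypothetical.
SPLIT_ROUTES = {
    "entity_extraction": "gemini",
    "contradiction_detection": "claude",
}


def route_graph_op(op: str, mode: str) -> str:
    """Pick a provider for one graph LLM call given a routing mode."""
    if mode in ("ollama", "gemini"):
        return mode  # single-provider modes send every call to one place
    if mode == "split":
        # Assumption: ops not named in the post fall back to the cheaper model.
        return SPLIT_ROUTES.get(op, "gemini")
    raise ValueError(f"unknown routing mode: {mode!r}")
```

With these helpers, `read_access_token(Path.home() / ".claude" / ".credentials.json")` followed by `detect_token_type(...)` would decide how to configure the SDK, and `route_graph_op` would be consulted once per graph call.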

Python, MIT licensed, one-command install via uvx.

https://github.com/elvismdev/mem0-mcp-selfhosted

Similar Projects

CogmemAi – Persistent Memory for Claude Code via MCP

Runs extraction and search server-side so your local MCP is a tiny HTTP client — no local DBs, no giant RAM leaks, and an easy npx install and .mcp.json or global MCP registration. It exposes clear tools (save_memory, recall_memories, extract_memories, get_project_context) and adds project-scoped + global preferences — a pragmatic fix for Claude Code's tiny flat-file memory. The tradeoff is obvious: usefulness depends on the hosted API (privacy, uptime, cost), and the repo looks early-stage with minimal commits and docs beyond the quickstart.

Niche Gem · Ship It · Solve My Problem
hifriendbot
204mo ago