GitHub Repository

Okapi is an observability stack. It ingests telemetry using OTLP, exposes queries via PromQL, and stores traces and spans. Okapi can be used to build dashboards, view metrics, and store and analyze traces in distributed systems.

16 stars · Java

Okapi yet Another Observability Thing


by kushal2048·Mar 12, 2026·2 points·0 comments


AI Analysis

● Mid · Ship It

Yet another observability stack when Grafana and Honeycomb already dominate the market.

Strengths

•OTLP-native ingestion works with any OpenTelemetry collector
•PromQL compatibility means existing Grafana dashboards work unchanged
•YAML dashboards as code enables GitOps workflows

Weaknesses

•Observability market is extremely crowded with established players
•AI SRE agent doesn't meaningfully differentiate from existing solutions

Post Description

Hi, I just wanted to share Okapi with the community here. It's my stab at the observability problem: how to debug when something breaks in prod. Of observability's big three modalities (metrics, logs, traces), Okapi focuses on two: it is all about metrics and traces. The idea is that metrics and traces usually carry the most signal, and that log analysis is so fragmented that one might spend ages on logs alone.

Features:

- OTel everywhere: it's Okapi's preferred and only ingestion mechanism :) Currently Okapi supports ingestion via protobuf-over-HTTP. Here's a sample config (https://github.com/okapi-core/okapi?tab=readme-ov-file#examp...)

- Dashboards via both clicks and code: the Okapi UI has a dashboard designer, hopefully with autocomplete everywhere, so users don't have to guess metric paths. However, if you're not a fan of clicks and/or love GitOps, all Okapi dashboards can be expressed as YAML templates.

- Out-of-the-box service health: for an application instrumented per OTel conventions, Okapi treats RED metrics (rate, errors, duration) as a first-class concept. Service health pages have RED breakdowns for the service, its sub-operations, and dependent paths. The calculations depend on applications being instrumented, but hopefully following a convention makes things easy.

- And of course AI: Okapi has a limited-capability AI SRE agent affectionately called Oscar (supposed to be an okapi, but no mascot yet). Calling it a full-blown SRE is a stretch since it's a tough job. You can ask Oscar questions in natural language as you would any chatbot, and it will try its best to answer. At least in integration tests, Oscar can fetch metrics, find traces matching given criteria, and do multi-step debugging that links query latencies with high CPU usage on hosts.
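For readers unfamiliar with the RED breakdown mentioned in the service health feature, here is a minimal sketch of how rate, error ratio, and duration could be derived from a batch of spans. It is not Okapi's implementation: the `Span` record, the `red_summary` helper, and the choice of the median as the duration statistic are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Span:
    """Hypothetical, simplified span record (not Okapi's data model)."""
    service: str
    duration_ms: float
    is_error: bool


def red_summary(spans, window_seconds):
    """Summarize a batch of spans observed over window_seconds as RED values."""
    if not spans:
        return {"rate_per_s": 0.0, "error_ratio": 0.0, "p50_ms": None}
    error_count = sum(1 for s in spans if s.is_error)
    return {
        # Rate: requests per second over the observation window.
        "rate_per_s": len(spans) / window_seconds,
        # Errors: fraction of spans marked as failed.
        "error_ratio": error_count / len(spans),
        # Duration: median latency; a real system would likely track p95/p99 too.
        "p50_ms": median(s.duration_ms for s in spans),
    }


# Four spans for one service, seen within a 60-second window.
sample = [
    Span("checkout", 10.0, False),
    Span("checkout", 20.0, False),
    Span("checkout", 30.0, True),
    Span("checkout", 40.0, False),
]
summary = red_summary(sample, window_seconds=60)
```

Grouping the same computation by operation name or downstream dependency would give the kind of per-operation and per-path breakdowns the service health pages describe.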

I am curious to hear feedback from the community, so check it out.

TLDR: https://github.com/okapi-core/okapi?tab=readme-ov-file#quick...