A tool to create and evaluate document processing pipelines for RAG

Author: martimchaves

by martimchaves·Mar 27, 2026·2 points·0 comments

AI Analysis

●● Solid · Solve My Problem · Slick

LLM-as-judge metrics beat guessing chunk sizes, but Ragas and LangSmith already exist.

Strengths

•Configurable pipeline stages (OCR → chunking → embedding) for systematic testing
•Transparent about LLM-as-judge limitations — calls it a compass, not GPS
•Side-by-side dataset comparison with precision, recall, MRR metrics
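For readers unfamiliar with the metrics named above, here is a minimal sketch of how precision, recall, and MRR are commonly computed for retrieval results. The function names and the chunk-id representation are assumptions made for illustration; this is not ragbandit's implementation, and its exact definitions may differ.

```python
from typing import List, Sequence, Set, Tuple


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved chunk ids that are relevant.

    Divides by the number of chunks actually returned (at most k), so a
    retriever that returns fewer than k chunks is not penalised twice.
    """
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    return sum(1 for chunk_id in top_k if chunk_id in relevant) / len(top_k)


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant chunk ids that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant chunk, or 0.0 if none was retrieved."""
    for rank, chunk_id in enumerate(retrieved, start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ret, rel) for ret, rel in runs) / len(runs)
```

Precision and recall describe how much of the result set is useful and how much of the useful material was found, while MRR rewards placing the first relevant chunk near the top.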

Weaknesses

•RAG evaluation is crowded (Ragas, Arize Phoenix, LangSmith all compete here)
•API integration details unclear — how do you actually lock and query datasets?

Post Description

Hey HN, I built [ragbandit](https://ragbandit.com), a tool to help you evaluate different document processing pipelines for the retrieval stage of your RAG systems.

I was a bit overwhelmed with the different ways that you can process documents to create embeddings for RAG, so I wanted to create a tool to experiment with different OCR models, refining the OCR results, different chunking methods, and different embedding models.
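To give a sense of how quickly those choices multiply, here is a rough sketch that enumerates every combination of the four stages mentioned above. The `PipelineConfig` class, the `pipeline_grid` function, and all option names are hypothetical placeholders, not part of ragbandit or ragbandit-core.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterator, Sequence


@dataclass(frozen=True)
class PipelineConfig:
    """One candidate document processing pipeline (hypothetical structure)."""
    ocr_model: str
    refine_ocr: bool
    chunker: str
    embedding_model: str


def pipeline_grid(
    ocr_models: Sequence[str],
    refine_options: Sequence[bool],
    chunkers: Sequence[str],
    embedding_models: Sequence[str],
) -> Iterator[PipelineConfig]:
    """Yield every combination of the given stage options."""
    for ocr, refine, chunker, embedding in product(
        ocr_models, refine_options, chunkers, embedding_models
    ):
        yield PipelineConfig(ocr, refine, chunker, embedding)


if __name__ == "__main__":
    # Placeholder option names; substitute whatever models you actually use.
    configs = list(
        pipeline_grid(
            ocr_models=["ocr-a", "ocr-b"],
            refine_options=[True, False],
            chunkers=["fixed-size", "semantic"],
            embedding_models=["embed-small", "embed-large"],
        )
    )
    print(f"{len(configs)} candidate pipelines to evaluate")
```

Even with two options per stage the grid already holds sixteen pipelines, which is why comparing them by hand gets overwhelming.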

You can:

- search processed documents in the playground
- evaluate the retrieval results using an LLM-as-judge (not perfect, but can be a useful signal)
- compare different datasets (using aggregate metrics or a side-by-side comparison in the playground)
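As a rough illustration of the evaluate-and-compare idea, the sketch below scores the chunks returned for each query with a judge function and averages the result per dataset. Everything here is an assumption for illustration: `Judge`, `judged_precision`, and `compare_datasets` are hypothetical names, and `keyword_judge` is a trivial stand-in so the example runs offline. In practice the judge would wrap an LLM call, and ragbandit's own scoring may work differently.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence

# A judge takes (query, chunk_text) and says whether the chunk is relevant.
Judge = Callable[[str, str], bool]
# A retriever takes a query and returns the text of the retrieved chunks.
Retriever = Callable[[str], List[str]]


def judged_precision(query: str, chunks: Sequence[str], judge: Judge) -> float:
    """Share of retrieved chunks that the judge marks as relevant."""
    if not chunks:
        return 0.0
    return sum(1 for chunk in chunks if judge(query, chunk)) / len(chunks)


def compare_datasets(
    queries: Sequence[str],
    retrievers: Dict[str, Retriever],
    judge: Judge,
) -> Dict[str, float]:
    """Mean judged precision per dataset over the same set of queries."""
    if not queries:
        return {name: 0.0 for name in retrievers}
    return {
        name: mean(judged_precision(q, retrieve(q), judge) for q in queries)
        for name, retrieve in retrievers.items()
    }


def keyword_judge(query: str, chunk: str) -> bool:
    """Stand-in judge (NOT an LLM): relevant if any query word is in the chunk."""
    chunk_lower = chunk.lower()
    return any(word.lower() in chunk_lower for word in query.split())
```

Because the judge is just a callable, the same comparison loop works whether the relevance signal comes from an LLM, a keyword heuristic, or human labels.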

You can also manually inspect the results of each query, and of each intermediate document processing result.

To get a better idea, check out one of the use cases: https://ragbandit.com/use-cases/optimizing-insurance-documen...

To be completely fair, I haven't added that many options for the different stages of the document processing pipeline yet! There are tons of features that I'd like to add, but I've already spent quite a bit of time on this, so I'd really appreciate it if you could let me know whether this is something you'd find useful or interesting. Would you use something like this?

Tech stack: Postgres (with pgvector), FastAPI, [ragbandit-core](https://github.com/MartimChaves/ragbandit-core) (the document processing core is open source), TypeScript with React, and Celery for background tasks (with Redis as the broker).

It's currently a credits-based subscription with optional top-ups. You can get 1000 credits to try it out (I ask for card info for these 1000 credits as a spam filter).

Thanks, Martim