GitHub Repository

Retrieval system over the arXiv corpus

3 stars · Jupyter Notebook

ArXiv Scholar – An Open-Source RAG System for AI Research Papers

by dubeyaayush07·Jun 16, 2026·2 points·0 comments

AI Analysis

●● Solid · Niche Gem · Big Brain

Hybrid search over 5,600 papers, in a space where Elicit and Semantic Scholar already exist.

Strengths
  • No LangChain abstraction means transparent failure modes and full architectural control.
  • ML query router decomposes complex queries into sub-queries with metadata filters.
  • Crash-safe ingestion with JSON cursor for resumable batch processing across arXiv folders.
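
The crash-safe ingestion point can be illustrated with a minimal sketch. This is not the project's actual code: the cursor layout (`{"done": [...]}`), the function names, and the per-folder granularity are all assumptions made for illustration. The idea is to record each folder in a JSON cursor only after it has been fully processed, and to write that cursor atomically so a crash never leaves a half-written file.

```python
import json
import os
import tempfile
from pathlib import Path


def load_cursor(cursor_path: Path) -> set[str]:
    """Return the folder names already ingested (empty set if no cursor exists)."""
    if not cursor_path.exists():
        return set()
    return set(json.loads(cursor_path.read_text())["done"])


def save_cursor(cursor_path: Path, done: set[str]) -> None:
    """Write the cursor atomically: temp file in the same directory, then rename."""
    fd, tmp_name = tempfile.mkstemp(dir=cursor_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump({"done": sorted(done)}, f)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes are on disk before renaming
    os.replace(tmp_name, cursor_path)  # atomic swap over the old cursor


def ingest_all(folders, cursor_path, process_folder):
    """Process each folder once, skipping any that the cursor marks as done.

    `process_folder` is a hypothetical callback standing in for the real
    parse/embed/index step.
    """
    done = load_cursor(cursor_path)
    for folder in folders:
        if folder in done:
            continue  # finished in an earlier run
        process_folder(folder)  # may raise; the cursor is untouched if it does
        done.add(folder)
        save_cursor(cursor_path, done)  # checkpoint after every folder
    return done
```

Calling `ingest_all(["2301", "2302"], Path("cursor.json"), my_step)` again after a crash resumes at the first folder missing from the cursor. Under this scheme a folder interrupted midway is reprocessed from its start, so the processing step needs to be idempotent.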
Weaknesses
  • Reranker disabled by default due to performance degradation on current corpus size.
  • Academic paper search is crowded with Elicit, Consensus, and Semantic Scholar.
Target Audience

AI researchers and engineers searching academic papers

Similar To

Elicit · Semantic Scholar · Consensus

Post Description

Try Search: https://ethereal-agents.space/search.html

Technical Blog: https://ethereal-agents.space/blog/launching-arxiv-scholar.h...

We'd love feedback on the retrieval quality, user experience, and overall approach.

Similar Projects