Smelt – Extract structured data from PDFs and HTML using an LLM
The LLM infers the schema once, then Go code runs the 10k-row extraction, so tokens are not spent on every row.

Deterministic Rust AST parsing for offline media extraction—verifiable via SHA-256 for legal discovery.
Digital forensics consultants, legal teams, SOC/SOAR automation engineers, air-gapped labs
Shodan (reconnaissance) · OSINT Framework (collection) · Burp Suite (offline analysis features)
Per-span confidence scores let you review uncertain OCR before trusting 200k-page runs.
Tree-sitter interface extraction cuts token usage by 6x, but chat context window optimization is becoming table stakes.
Deterministic verification loop makes 3.8B models match 7x larger ones for structured extraction.
Clean API design but JinaAI and Firecrawl already dominate this crowded scraping category.
Invoice extraction with real ERP integrations, but OCR+sync already has competition.