GitHub Repository

Review-oriented DOCX extraction toolkit for Rust

4 stars · Rust

by nistuley·Jun 4, 2026·2 points·0 comments

AI Analysis

●● Solid · Niche Gem · Solve My Problem

Extracts tracked changes and comment threads when most DOCX parsers only grab text.

Strengths
  • Parses tracked changes with insert, delete, move, and format change detection
  • Comment threading and resolved state from commentsExtended.xml
  • Both CLI and Rust crate with JSONL streaming output
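
To make the first and third strengths concrete, here is a minimal std-only sketch of the kind of extraction involved. It is not the toolkit's API: `Change`, `scan_changes`, and `to_jsonl` are hypothetical names invented for illustration, and the hand-rolled tag scanner stands in for a real XML parser. The element names (`w:ins`, `w:del`, `w:moveFrom`, `w:moveTo`) and the `w:author` attribute are the standard WordprocessingML markup for tracked changes in `word/document.xml`.

```rust
// Hypothetical sketch: none of these names come from the toolkit itself.
#[derive(Debug, PartialEq)]
struct Change {
    kind: &'static str,
    author: String,
}

// WordprocessingML revision elements mapped to illustrative labels.
const KINDS: [(&str, &str); 4] = [
    ("w:ins", "insert"),
    ("w:del", "delete"),
    ("w:moveFrom", "move_from"),
    ("w:moveTo", "move_to"),
];

// Pull one attribute value out of the raw text of an opening tag.
fn attr(tag: &str, name: &str) -> Option<String> {
    let needle = format!(" {name}=\"");
    let start = tag.find(&needle)? + needle.len();
    let end = tag[start..].find('"')? + start;
    Some(tag[start..end].to_string())
}

// Walk every tag and keep those whose element name is a revision marker.
// Exact name matching keeps `w:delText` from being mistaken for `w:del`.
fn scan_changes(xml: &str) -> Vec<Change> {
    let mut out = Vec::new();
    let mut rest = xml;
    while let Some(lt) = rest.find('<') {
        rest = &rest[lt + 1..];
        let end = match rest.find('>') {
            Some(e) => e,
            None => break,
        };
        let tag = &rest[..end];
        // Closing tags start with '/', so their first segment is empty.
        let name = tag
            .split(|c: char| c.is_whitespace() || c == '/')
            .next()
            .unwrap_or("");
        if let Some((_, kind)) = KINDS.iter().find(|(n, _)| *n == name) {
            out.push(Change {
                kind: *kind,
                author: attr(tag, "w:author").unwrap_or_default(),
            });
        }
        rest = &rest[end + 1..];
    }
    out
}

// Minimal JSON string escaping (backslash and quote only; a real
// implementation would also handle control characters).
fn escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

// One JSON object per line, so a consumer can stream records.
fn to_jsonl(changes: &[Change]) -> String {
    changes
        .iter()
        .map(|c| format!("{{\"kind\":\"{}\",\"author\":\"{}\"}}\n", c.kind, escape(&c.author)))
        .collect()
}

fn main() {
    let xml = r#"<w:p><w:ins w:id="1" w:author="Ana"><w:r><w:t>new</w:t></w:r></w:ins><w:del w:id="2" w:author="Ben"><w:r><w:delText>old</w:delText></w:r></w:del></w:p>"#;
    print!("{}", to_jsonl(&scan_changes(xml)));
}
```

A plain text extractor would return only the run text here and lose the fact that "new" was inserted and "old" deleted, which is the gap the listing describes.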
Weaknesses
  • Niche audience limits broader adoption beyond legal and review workflows
  • Established parsers like python-docx could add these features
Target Audience

Rust developers building document review automation or legal tech workflows

Similar To

python-docx · docx4j · mammoth.js

Similar Projects

AI/ML · ●● Solid

DocMason – Agent Knowledge Base for local complex office files

Provenance-first RAG beats anonymous text chunks, but Cursor and Continue already own this space.

Big Brain · Niche Gem
Jet_Xu
1102mo ago
AI/ML · ●● Solid

ProofPudding – Document Extraction API with Citations (PDF/Docx)

ProofPudding returns extraction results with explicit links back to the exact page and source text, supports native and scanned PDFs plus DOCX and images, and ships Python and TypeScript SDKs, which is handy for agents that need auditable facts. It’s a pragmatic product (per-extraction pricing and confidence scores are nice), but the market is crowded; I want clarity on the underlying models, real-world accuracy numbers, and how it compares to Document AI and Textract in edge cases.

Solve My Problem · Slick
garai
104mo ago