I made a tool to find invoices in my email and match them to payments
Yet another AI bookkeeping tool in a space QuickBooks already owns.
AI-enabled insights from emails, calendars, contacts, files, Slack, databases, web... Fast, private and local. Launching soon!
Local LLM email parsing when Plaid and receipt scanners already exist.
Privacy-conscious users tracking personal finances
Plaid · Receipt Bank · Hubdoc
I initially started with Google Gemini 3 Flash but switched to Ollama + Ministral 3:3b. The extraction is not exhaustive and there is much to improve, but it works.
dwata runs locally: it runs a web backend, and the GUI runs in the browser. It connects to your email accounts and downloads the messages. Then we can run the financial template detection. It looks for similar-looking emails, grouped by sender, and sends a sample from each cluster to an LLM agent. The LLM is asked to find the parts of the text that look like the data we are looking for. dwata then searches the email for the values the LLM returned and creates a template by replacing that data with template tags. The template is saved to the DB. When extracting data, dwata parses each email against the saved template.
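The template step above can be sketched roughly as follows. This is a minimal illustration, not dwata's actual code: the function names (`build_template`, `template_to_regex`, `parse_email`), the `{{field}}` tag syntax, and the assumption that the LLM returns a plain field-to-value mapping are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical tag syntax: {{field_name}}
TAG = re.compile(r"\{\{(\w+)\}\}")


def build_template(email_text: str, values: dict[str, str]) -> str:
    """Replace each value the LLM reported with a {{field}} tag."""
    template = email_text
    # Replace longer values first so a short value cannot clobber
    # part of a longer one.
    for field, value in sorted(values.items(), key=lambda kv: -len(kv[1])):
        if value and value in template:
            # Only the first occurrence, so each field appears once.
            template = template.replace(value, "{{" + field + "}}", 1)
    return template


def template_to_regex(template: str) -> re.Pattern:
    """Turn a tagged template into a regex with one named group per tag."""
    parts = []
    pos = 0
    for match in TAG.finditer(template):
        # Literal text between tags must match exactly.
        parts.append(re.escape(template[pos:match.start()]))
        # Each tag becomes a non-greedy named capture group.
        parts.append(f"(?P<{match.group(1)}>.+?)")
        pos = match.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("".join(parts), re.DOTALL)


def parse_email(template: str, email_text: str) -> Optional[dict[str, str]]:
    """Extract field values from an email, or None if it does not fit."""
    match = template_to_regex(template).fullmatch(email_text)
    return match.groupdict() if match else None


# Sample email from one sender cluster, plus the values an LLM
# might report for it (made-up data for illustration).
sample = "Invoice INV-1001 from Acme Corp\nAmount due: $42.50\nDue date: 2024-03-01\n"
llm_values = {
    "invoice_id": "INV-1001",
    "vendor": "Acme Corp",
    "amount": "$42.50",
    "due_date": "2024-03-01",
}
template = build_template(sample, llm_values)

# Another email from the same cluster is parsed without calling the LLM.
other = "Invoice INV-2002 from Globex LLC\nAmount due: $199.00\nDue date: 2024-04-15\n"
print(parse_email(template, other))
```

The point of this design is that the LLM is only needed once per cluster. After the template exists, every similar email can be parsed with plain string matching, which is cheap enough for a small local model setup. An exact-match template like this one breaks as soon as a sender changes their wording, so a real implementation would likely need fuzzier matching.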
Roadmap: There is a long way to go, and the extractor needs to work much, much better. dwata will also work on files soon (bank/CC statements).
I want to extract vendors, businesses, contacts, events, places, etc., connect to different APIs, and process everything locally.
dwata will be able to download and process data from the Hacker News API too (or other similar sources) and extract the entities you care about.
Eventually, dwata should use only Ollama/llama.cpp with models that fit on 6-8GB graphics cards or in 16GB of unified memory!
Automates invoice chasing but costs $19/month in a crowded market.
Extracts recipes from TikToks/YouTube locally—no cloud, no subscription, just yours.
Tree-sitter extraction cuts LLM context 50-tokens-to-8 tokens. Cursor and Cody ignore this.
OTP auto-extraction is useful, but Mailtrap and TempMail APIs already exist.
Tree-sitter interface extraction cuts token usage by 6x, but chat context window optimization is becoming table stakes.