
Project 02 · Planned

Email → Invoice → Journal Entry Bot

Read a tax invoice, draft the journal entry — and, more importantly, flag the entries it shouldn't post on its own.


For: SME accountants and Big-4 audit teams · Planned — Months 2–3.

The problem


Indian SMEs spend roughly six hours per accountant per week downloading invoices from email and keying them into Tally. It is repetitive, and the repetition is exactly where errors creep in.

What it does


Upload a sample invoice PDF and watch it parse vendor, GSTIN, line items and the GST split, then emit a Tally-importable XML and an Excel JE template. Image-only invoices fall back to a vision model.
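As a rough illustration of the GSTIN and GST-split step, here is a minimal Python sketch. It is not the project's implementation. The function names `is_plausible_gstin` and `split_gst` are hypothetical, and the sample GSTINs are illustrative values used for their format only. The sketch relies on two facts: a GSTIN is 15 characters and begins with a two-digit state code, and an intra-state supply is taxed as CGST plus SGST at half the rate each while an inter-state supply is taxed as IGST. Comparing the two state codes is a simplification of the place-of-supply rules, and the regex checks shape only, not the check digit.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Shape check only: 2-digit state code, 10-character PAN, entity code, 'Z', check character.
GSTIN_PATTERN = re.compile(r"^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]$")
PAISA = Decimal("0.01")


def is_plausible_gstin(gstin: str) -> bool:
    """Return True if the string has the 15-character GSTIN shape (check digit not verified)."""
    return bool(GSTIN_PATTERN.match(gstin.strip().upper()))


def split_gst(taxable_value: str, rate_percent: str, vendor_gstin: str, buyer_gstin: str) -> dict:
    """Split GST into CGST/SGST or IGST by comparing state codes (a simplification)."""
    value = Decimal(taxable_value)
    rate = Decimal(rate_percent)
    zero = Decimal("0.00")
    if vendor_gstin[:2] == buyer_gstin[:2]:
        # Same state: CGST and SGST are each levied at half the rate.
        half = (value * rate / Decimal(200)).quantize(PAISA, rounding=ROUND_HALF_UP)
        return {"CGST": half, "SGST": half, "IGST": zero}
    # Different states: the full rate is levied as IGST.
    full = (value * rate / Decimal(100)).quantize(PAISA, rounding=ROUND_HALF_UP)
    return {"CGST": zero, "SGST": zero, "IGST": full}


if __name__ == "__main__":
    vendor = "27AAPFU0939F1ZV"  # illustrative value, format only
    buyer = "29AAPFU0939F1ZV"   # illustrative value, format only
    print(is_plausible_gstin(vendor))
    print(split_gst("1000.00", "18", vendor, buyer))
```

Using `Decimal` rather than floats keeps the paisa rounding explicit, which matters when the drafted amounts are later compared against the invoice totals.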

The interesting part is not the entries it drafts but the ones it refuses to draft — the cases it routes to a human because it isn't sure of the ledger or the tax treatment.

  • ≥90% accuracy target on 50 synthetic invoices
  • Claude Haiku vision for image-only PDFs
  • Architecture diagram included; the same ledger mapping extends to SAP coding blocks
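The routing idea described above (draft when sure, hand off to a human when not) could be sketched roughly as follows. Everything here is an assumption made for illustration: the `DraftEntry` fields, the `route` function, and the 0.85 threshold are hypothetical and do not come from the project.

```python
from dataclasses import dataclass

# Assumed cut-off for illustration; the project does not state a threshold.
REVIEW_THRESHOLD = 0.85


@dataclass
class DraftEntry:
    """Hypothetical shape of a drafted journal entry with per-decision confidence."""
    vendor: str
    ledger: str
    ledger_confidence: float  # confidence in the chosen ledger, 0.0 to 1.0
    tax_confidence: float     # confidence in the tax treatment, 0.0 to 1.0


def route(entry: DraftEntry, threshold: float = REVIEW_THRESHOLD) -> tuple[str, list[str]]:
    """Return ('auto_draft', []) or ('needs_review', reasons) for one entry."""
    reasons = []
    if entry.ledger_confidence < threshold:
        reasons.append("ledger uncertain")
    if entry.tax_confidence < threshold:
        reasons.append("tax treatment uncertain")
    if reasons:
        return "needs_review", reasons
    return "auto_draft", reasons


if __name__ == "__main__":
    sure = DraftEntry("Acme Supplies", "Office Expenses", 0.97, 0.95)
    unsure = DraftEntry("Unknown Vendor", "Misc Expenses", 0.55, 0.91)
    print(route(sure))
    print(route(unsure))
```

Returning the reasons alongside the decision means the reviewer sees why an entry was held back, not just that it was.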

Design choices


A model that books to the wrong ledger with total confidence is a liability on month-end data. So accuracy is measured against fifty synthetic invoices with a stated target of at least 90%, and the architecture makes the confidence boundary explicit rather than hiding it.
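One way such a measurement could look is sketched below, under stated assumptions. The project does not define how accuracy is scored, so this sketch assumes an invoice counts as correct when the predicted ledger equals the expected one. The `evaluate` function and the tuple layout are hypothetical. It also counts entries that were wrong but not routed to a human, since those are the confident errors the design is meant to avoid.

```python
def evaluate(results, target=0.90):
    """Score a labelled run.

    results: list of (predicted_ledger, expected_ledger, routed_to_human) tuples.
    Returns overall accuracy, whether the target is met, and the number of
    wrong entries that were NOT routed to a human (confident errors).
    """
    total = len(results)
    correct = sum(1 for predicted, expected, _ in results if predicted == expected)
    confident_wrong = sum(
        1
        for predicted, expected, routed in results
        if predicted != expected and not routed
    )
    accuracy = correct / total if total else 0.0
    return {
        "accuracy": accuracy,
        "meets_target": accuracy >= target,
        "confident_wrong": confident_wrong,
    }


if __name__ == "__main__":
    # Ten synthetic results: nine correct, one wrong but routed to a human.
    run = [("Office Expenses", "Office Expenses", False)] * 9
    run.append(("Misc Expenses", "Rent", True))
    print(evaluate(run))
```

Tracking confident errors separately keeps a high headline accuracy from hiding the failure mode that matters most at month-end.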

The same pattern that maps to a Tally ledger maps to an SAP coding block — which is the bridge from the SME story to the MNC FP&A one.

Stack


Python · pdfplumber · Claude Haiku · Streamlit

Built and demonstrated on synthetic data only.