Smart IDP / Invoice Analyser
AI-powered invoice processor using Claude Sonnet + AWS Textract with template-driven OCR and Excel export.
The Challenge
Finance teams at mid-sized companies were spending 3–4 hours per day manually keying data from supplier invoices (PDFs, scanned images, and email attachments) into spreadsheets. With no structured pipeline, a single mis-keyed figure could propagate through month-end reconciliation undetected for weeks.
The problem compounded at scale: the client processed 800–1,200 invoices monthly across 12 supplier formats, each with different field layouts, currency conventions, and line-item structures. Existing OCR tools extracted raw text but had no understanding of context: they could not distinguish a PO number from an invoice number, or correctly parse multi-line itemised bills. Compliance audits also required a full extraction trail, which manual entry simply could not provide.
What We Built
We architected a two-stage extraction pipeline: AWS Textract handles the low-level document parsing (bounding boxes, table detection, confidence scores), and Claude 3.5 Sonnet sits on top as an intelligent field-mapping layer. Each supplier format is captured once as a JSON template (field names, positional hints, validation rules), and the AI uses these templates to map Textract output to a canonical invoice schema regardless of layout variation.
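To make the template idea concrete, here is a minimal sketch of what a supplier template and the prompt assembly could look like. It is an illustration under assumptions, not the production code: the type names (FieldTemplate, SupplierTemplate, OcrPair), the helper buildMappingPrompt, the supplier "acme-ltd", and every field name and pattern are hypothetical, and OcrPair is a simplified stand-in for Textract's key-value output.

```typescript
// Hypothetical shapes: the write-up only says a template holds field names,
// positional hints, and validation rules. Every name below is an assumption.
interface FieldTemplate {
  canonicalName: string; // key in the canonical invoice schema
  labelHints: string[]; // labels the supplier prints next to the value
  positionHint?: "header" | "footer" | "table";
  pattern?: string; // validation rule, stored as a regular expression source
}

interface SupplierTemplate {
  supplierId: string;
  currency: string;
  fields: FieldTemplate[];
}

// Simplified stand-in for one OCR key-value pair with its confidence score.
interface OcrPair {
  key: string;
  value: string;
  confidence: number; // 0-100
}

const acmeTemplate: SupplierTemplate = {
  supplierId: "acme-ltd",
  currency: "GBP",
  fields: [
    { canonicalName: "invoiceNumber", labelHints: ["Invoice No", "Inv #"], positionHint: "header", pattern: "^INV-\\d+$" },
    { canonicalName: "poNumber", labelHints: ["PO", "Purchase Order"], positionHint: "header", pattern: "^PO-\\d+$" },
    { canonicalName: "total", labelHints: ["Total Due", "Amount Due"], positionHint: "footer", pattern: "^\\d+(\\.\\d{2})?$" },
  ],
};

// Build the text prompt that pairs the template hints with the raw OCR output.
function buildMappingPrompt(template: SupplierTemplate, pairs: OcrPair[]): string {
  const fieldLines = template.fields.map(
    (f) =>
      `- ${f.canonicalName}: labels ${JSON.stringify(f.labelHints)}, ` +
      `region ${f.positionHint ?? "any"}, must match ${f.pattern ?? "(no rule)"}`
  );
  const ocrLines = pairs.map(
    (p) => `${p.key} => ${p.value} (confidence ${p.confidence.toFixed(1)})`
  );
  return [
    `Map the OCR output below to the canonical invoice schema for supplier ${template.supplierId}.`,
    "Return JSON with exactly these keys: " + template.fields.map((f) => f.canonicalName).join(", "),
    "Field hints:",
    ...fieldLines,
    "OCR output:",
    ...ocrLines,
  ].join("\n");
}

const prompt = buildMappingPrompt(acmeTemplate, [
  { key: "Inv #", value: "INV-1042", confidence: 99.1 },
  { key: "Purchase Order", value: "PO-7731", confidence: 97.4 },
  { key: "Total Due", value: "1250.00", confidence: 98.8 },
]);
console.log(prompt);
```

Keeping the hints in data rather than in code is what would let a new supplier format be onboarded by adding a template instead of changing the pipeline.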
The frontend is a React + TypeScript SPA backed by Supabase for auth, database, and Edge Functions. Edge Functions orchestrate the extraction jobs: they call Textract, pass the raw output and template to Claude via the Anthropic API with structured output prompting, validate the result against business rules, and persist the enriched record. Users can review AI confidence scores field-by-field, override incorrect values, and trigger a re-extraction with corrected hints that feed back into template refinement.
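The validation and review step might look roughly like the sketch below. The two rules shown (line items must sum to the total, the date must parse), the 0.9 review threshold, and all type and function names are assumptions chosen for illustration; the actual business rules are not listed in this write-up.

```typescript
// All names, thresholds, and rules here are illustrative assumptions.
interface ExtractedField<T> {
  value: T;
  confidence: number; // 0-1, as reported by the extraction step
}

interface ExtractedInvoice {
  invoiceNumber: ExtractedField<string>;
  invoiceDate: ExtractedField<string>; // expected as an ISO date, e.g. "2024-03-31"
  total: ExtractedField<number>;
  lineItems: { description: string; amount: number }[];
}

interface ValidationResult {
  errors: string[]; // business-rule violations
  needsReview: string[]; // fields whose confidence is below the threshold
}

const REVIEW_THRESHOLD = 0.9; // hypothetical cut-off for manual review

function validateInvoice(inv: ExtractedInvoice): ValidationResult {
  const errors: string[] = [];
  const needsReview: string[] = [];

  // Rule 1: line items must sum to the stated total.
  // Compare in integer cents to avoid floating-point drift.
  const sumCents = inv.lineItems.reduce((acc, li) => acc + Math.round(li.amount * 100), 0);
  if (sumCents !== Math.round(inv.total.value * 100)) {
    errors.push(
      `line items sum to ${(sumCents / 100).toFixed(2)} but total is ${inv.total.value.toFixed(2)}`
    );
  }

  // Rule 2: the invoice date must at least parse as a date.
  if (Number.isNaN(Date.parse(inv.invoiceDate.value))) {
    errors.push(`invoiceDate "${inv.invoiceDate.value}" is not a valid date`);
  }

  // Flag low-confidence fields so a reviewer can confirm or override them.
  const scored: [string, number][] = [
    ["invoiceNumber", inv.invoiceNumber.confidence],
    ["invoiceDate", inv.invoiceDate.confidence],
    ["total", inv.total.confidence],
  ];
  for (const [name, confidence] of scored) {
    if (confidence < REVIEW_THRESHOLD) needsReview.push(name);
  }

  return { errors, needsReview };
}

const result = validateInvoice({
  invoiceNumber: { value: "INV-1042", confidence: 0.99 },
  invoiceDate: { value: "2024-03-31", confidence: 0.84 },
  total: { value: 1250.0, confidence: 0.97 },
  lineItems: [
    { description: "Widgets", amount: 1000.0 },
    { description: "Freight", amount: 200.0 },
  ],
});
console.log(result);
```

In this example the line items add up to 1200.00 against a stated total of 1250.00, so one rule violation is reported, and invoiceDate is flagged for review because its confidence falls below the assumed threshold.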
One-click Excel export was built using SheetJS. The canonical schema maps directly to the client's existing reporting columns, so the downloaded file drops straight into their ERP import workflow with zero reformatting.
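The schema-to-column mapping could be expressed as a small table of column definitions, as in the sketch below. The CanonicalInvoice keys, the column headers, and the helper toSheetRows are hypothetical, since the client's actual reporting columns are not listed here. The sketch stops at a plain array of rows so that it runs without any dependency.

```typescript
// Column headers and schema keys are hypothetical; the write-up does not list them.
interface CanonicalInvoice {
  vendorName: string;
  invoiceNumber: string;
  poNumber: string;
  invoiceDate: string;
  currency: string;
  total: number;
}

type Cell = string | number;

// One entry per spreadsheet column, in the order the report expects.
const COLUMNS: { header: string; pick: (inv: CanonicalInvoice) => Cell }[] = [
  { header: "Vendor", pick: (inv) => inv.vendorName },
  { header: "Invoice No", pick: (inv) => inv.invoiceNumber },
  { header: "PO Ref", pick: (inv) => inv.poNumber },
  { header: "Invoice Date", pick: (inv) => inv.invoiceDate },
  { header: "Currency", pick: (inv) => inv.currency },
  { header: "Total", pick: (inv) => inv.total },
];

// Convert canonical records to an array of arrays: a header row plus one row per invoice.
function toSheetRows(invoices: CanonicalInvoice[]): Cell[][] {
  const headerRow: Cell[] = COLUMNS.map((c) => c.header);
  const dataRows = invoices.map((inv) => COLUMNS.map((c) => c.pick(inv)));
  return [headerRow, ...dataRows];
}

const rows = toSheetRows([
  {
    vendorName: "Acme Ltd",
    invoiceNumber: "INV-1042",
    poNumber: "PO-7731",
    invoiceDate: "2024-03-31",
    currency: "GBP",
    total: 1250.0,
  },
]);
console.log(rows);
```

With SheetJS, an array of arrays like this can be turned into a worksheet with XLSX.utils.aoa_to_sheet(rows), appended to a workbook created by XLSX.utils.book_new() via XLSX.utils.book_append_sheet, and saved with XLSX.writeFile. Whether the project used that exact route is an assumption.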
The Outcome
Invoice processing time dropped from an average of 4.2 minutes per document (manual) to under 25 seconds end-to-end, a 90%+ reduction. Across 1,000 monthly invoices, that freed approximately 60 staff-hours per month, which were reallocated to exception handling and vendor relationship work.
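The arithmetic behind these figures can be checked with a few lines. The inputs are the numbers quoted above, with 25 seconds taken as the upper bound for the automated path; the variable names are only for illustration.

```typescript
// Inputs taken from the figures quoted in this section.
const manualSecondsPerInvoice = 4.2 * 60; // 4.2 minutes of manual keying
const automatedSecondsPerInvoice = 25; // "under 25 seconds", used as an upper bound
const invoicesPerMonth = 1000;

// Fractional reduction in per-document processing time.
const reduction = 1 - automatedSecondsPerInvoice / manualSecondsPerInvoice;

// Staff time freed per month, converted from seconds to hours.
const hoursSavedPerMonth =
  ((manualSecondsPerInvoice - automatedSecondsPerInvoice) * invoicesPerMonth) / 3600;

console.log(`reduction: ${(reduction * 100).toFixed(1)}%`); // reduction: 90.1%
console.log(`hours saved per month: ${hoursSavedPerMonth.toFixed(1)}`); // hours saved per month: 63.1
```

Because 25 seconds is an upper bound, these are floor values: a reduction of at least 90.1% and roughly 63 hours per month, which is consistent with the approximate 60 staff-hours reported.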
Keying error rate fell to effectively zero for structured fields (vendor name, totals, dates, PO references), a measurable improvement over the previous ~2.3% manual error rate that had been causing downstream reconciliation failures. The audit trail built into every extraction record also allowed the finance team to pass their first ISO compliance review without a single finding related to invoice processing.