Automating Bank Deposit Slip Processing with Transformer-Based OCR
How we designed an OCR pipeline that turns handwritten bank deposit slips and investment forms into structured records automatically, without changing frontline workflows.

The problem
Banks still process handwritten paperwork every day. Deposit slips and investment forms land on operations desks and get read, interpreted, and entered into downstream systems by hand.
That creates three problems at once: processing is slow, labor costs stay high, and small reading mistakes turn into operational risk.
We wanted a better path, one that doesn't force a workflow reset. Replacing forms, retraining staff, or asking customers to change behavior creates more friction than value. The goal was automation that fits how banks already operate.
The solution
We built a transformer-based OCR pipeline that reads handwritten deposit slips and investment forms automatically, extracts the required fields, and outputs clean structured data ready for any downstream system.
The system accepts uploads one image at a time or in bulk. Once a form is submitted, the pipeline performs recognition, validates the extracted fields, and passes the result downstream without a manual data-entry step in the middle.
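The flow described above (recognize, validate, hand off) can be sketched roughly as follows. This is an illustration under stated assumptions, not the production code: the `Recognizer` callable, the field names, and the validation rules (8 to 16 digit account numbers, positive amounts) are all hypothetical stand-ins.

```python
import re
from dataclasses import dataclass, field
from decimal import Decimal, InvalidOperation
from typing import Callable, Iterable

# Hypothetical recognizer signature: image bytes in, raw field text out.
# In a real deployment this would wrap the self-hosted OCR model.
Recognizer = Callable[[bytes], dict[str, str]]


@dataclass
class SlipResult:
    fields: dict[str, str]
    errors: list[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.errors


def validate_deposit_fields(raw: dict[str, str]) -> SlipResult:
    """Normalize and check the fields of one deposit slip (illustrative rules)."""
    cleaned: dict[str, str] = {}
    errors: list[str] = []

    # Assumed rule: account numbers are 8 to 16 digits once separators are removed.
    account = re.sub(r"[\s-]", "", raw.get("account_number", ""))
    if re.fullmatch(r"\d{8,16}", account):
        cleaned["account_number"] = account
    else:
        errors.append("account_number: expected 8 to 16 digits")

    # Amounts must parse as a finite, positive decimal; thousands separators are dropped.
    amount_text = raw.get("amount", "").replace(",", "").strip()
    try:
        amount = Decimal(amount_text)
        if not amount.is_finite() or amount <= 0:
            raise InvalidOperation
        cleaned["amount"] = f"{amount:.2f}"
    except InvalidOperation:
        errors.append("amount: expected a positive number")

    return SlipResult(cleaned, errors)


def process_batch(images: Iterable[bytes], recognize: Recognizer) -> list[SlipResult]:
    """Run recognition, then validation. A single upload is a batch of one."""
    return [validate_deposit_fields(recognize(image)) for image in images]
```

Records with an empty `errors` list can be passed downstream as-is. What happens to the rest (for example, routing to a review queue) is a design choice this sketch leaves open.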
Why a self-hosted transformer-based OCR model instead of a hosted LLM? Privacy and security. Financial documents contain sensitive customer data: account numbers, personal identifiers, transaction amounts. Sending that to a third-party LLM API is a non-starter for most institutions. With a self-hosted transformer model, all document data stays inside the organization's own infrastructure. No external calls. No data leaving the perimeter.
This matters because the win is not just higher automation. It's automation that fits how financial institutions actually operate — including their security posture.
Why the IAM dataset made sense
Real bank forms aren't available for model training — governance around sensitive financial data is tight, and that's not going to change.
Instead of waiting for ideal conditions, we fine-tuned the model on the IAM Handwriting Database (commonly called the IAM dataset), a well-established public benchmark for handwriting recognition built from English text written by many different writers.
That was an engineering decision, not a compromise. The dataset includes broad handwriting variation, which helps the model generalize across different handwriting styles better than a narrow internal sample likely would.
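One way to check whether a model fine-tuned on public data generalizes is to measure character error rate (CER) on a held-out set of transcribed samples. CER is a common handwriting-recognition metric, but using it here is an assumption on our part for illustration; the sketch below reports no results and the function names are hypothetical.

```python
from typing import Sequence


def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance computed with a single rolling row."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def character_error_rate(pairs: Sequence[tuple[str, str]]) -> float:
    """Total edits divided by total reference characters.

    `pairs` holds (reference, hypothesis) strings: the ground-truth
    transcription and the model output for the same sample.
    """
    edits = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    length = sum(len(ref) for ref, _ in pairs)
    if length == 0:
        raise ValueError("reference text is empty")
    return edits / length
```

A lower CER on samples from writers the model never saw during fine-tuning is the signal that matters for this use case.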
What changed
Key outcomes
- Manual data entry eliminated from both deposit and investment form workflows.
- Bulk processing absorbs high-volume days without adding headcount.
- The OCR model handles varied handwriting styles without forcing any handwriting standardization upstream.
- Production-ready pipeline with a single maintainable architecture rather than fragmented form-specific logic.
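The last point, one architecture instead of form-specific logic, can be illustrated by treating each form type as data that a single checking routine consumes. This is a sketch under assumptions: `FieldSpec`, `FORM_SPECS`, the field names, and the patterns are hypothetical and not taken from the production system.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str
    pattern: str           # regex the cleaned value must fully match
    required: bool = True


# Hypothetical form definitions: adding a form type means adding an entry here,
# not writing a new code path.
FORM_SPECS = {
    "deposit_slip": (
        FieldSpec("account_number", r"\d{8,16}"),
        FieldSpec("amount", r"\d+(\.\d{1,2})?"),
        FieldSpec("depositor_name", r"[A-Za-z .'-]+", required=False),
    ),
    "investment_form": (
        FieldSpec("account_number", r"\d{8,16}"),
        FieldSpec("fund_code", r"[A-Z0-9]{3,10}"),
        FieldSpec("amount", r"\d+(\.\d{1,2})?"),
    ),
}


def check_form(form_type: str, fields: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    for spec in FORM_SPECS[form_type]:
        value = fields.get(spec.name, "").strip()
        if not value:
            if spec.required:
                problems.append(f"{spec.name}: missing")
            continue
        if not re.fullmatch(spec.pattern, value):
            problems.append(f"{spec.name}: unexpected format")
    return problems
```

Both form types run through the same `check_form` function, so supporting a new form is a configuration change rather than a new branch of logic.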
Reflection
The hard part here was not just building OCR. It was building OCR that works credibly without access to proprietary training data.
Using a public benchmark dataset, then engineering the pipeline around validation and structured extraction, proved the broader point: sensitive workflows do not always require sensitive data to unlock meaningful automation.
The result is a cleaner operating model and a simpler technical surface area to maintain over time.
If your team is still keying data in by hand, the bottleneck is no longer operational. It's architectural.
Part of Inventokit's Thinking series — what we see, what we believe, and what we've learned.