Nordlake Shipping AI Document Extraction
A Hamburg freight forwarder extracts and validates roughly 13,400 shipping documents per week with a tuned OCR and LLM pipeline.
Impact
What changed.
Throughput
93% of inbound documents now commit to the TMS without human intervention. Operations team capacity expanded by 4.5x without adding headcount.
Error reduction
Downstream compliance flags from dangerous-goods and certificate-of-origin discrepancies fell 86%. Customs broker rejection rates dropped to near-zero on automated lanes.
Customer experience
Customer queries about shipment details, which used to take 20 minutes of operator time, now resolve instantly through the portal. Customer NPS rose 22 points within the year.
The challenge
Before
Nordlake Shipping is a Hamburg-headquartered freight forwarder handling roughly 700,000 shipping documents a year — bills of lading, commercial invoices, packing lists, certificates of origin, dangerous-goods declarations. A 22-person operations team was keying data from these documents into the firm's TMS, with quality varying by document language, scan quality, and time pressure. Customer queries about shipment detail required pulling original documents and re-reading them under deadline. Compliance auditing depended on the same overworked team.
- 700,000 annual shipping documents keyed manually into the TMS
- 22-person operations team with quality varying by language and scan quality
- Customer shipment queries requiring original document pull and re-reading
- Bills of lading in eight languages with inconsistent layouts
- Dangerous-goods declaration errors triggering compliance flags downstream
- Certificate-of-origin discrepancies caught only at customs broker stage
- Audit sampling done manually with no systematic confidence signal
- New customer onboarding slow because document-template variation overwhelmed staff
The solution
What we built
We built a document extraction pipeline that combines a tuned OCR layer, layout-aware parsing, and language-model post-processing. Documents arrive in eight languages via email, EDI, the customer portal, and partner feeds. The pipeline classifies the document type, extracts every structured field with a confidence score, and validates the result against business rules and reference data (HS codes, port codes, dangerous-goods classifications). High-confidence, validated documents commit to the TMS automatically. Documents with lower confidence or business-rule exceptions go to a human operator queue with the extracted fields pre-filled. Operators correct and confirm, and those corrections feed back into model tuning.
Customer-facing shipment detail is now searchable instantly because every document is indexed structurally. Compliance review runs continuously against confidence and exception signals rather than as a sampling exercise.
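The routing decision described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the production code: the class names, the `route_document` function, and the 0.95 threshold are all hypothetical, and a real deployment would likely tune thresholds per field and per lane.

```python
from dataclasses import dataclass, field
from enum import Enum


class Route(Enum):
    AUTO_COMMIT = "auto_commit"
    OPERATOR_QUEUE = "operator_queue"


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the extraction step


@dataclass
class ExtractedDocument:
    doc_type: str
    fields: list[ExtractedField]
    # Business-rule or reference-data failures found during validation.
    rule_violations: list[str] = field(default_factory=list)


# Hypothetical threshold, chosen only for illustration.
AUTO_COMMIT_THRESHOLD = 0.95


def route_document(
    doc: ExtractedDocument, threshold: float = AUTO_COMMIT_THRESHOLD
) -> tuple[Route, list[str]]:
    """Return the route plus the reasons a document needs human review."""
    reasons = []
    if not doc.fields:
        reasons.append("no fields extracted")
    reasons += [
        f"low confidence on {f.name} ({f.confidence:.2f})"
        for f in doc.fields
        if f.confidence < threshold
    ]
    reasons += [f"rule violation: {v}" for v in doc.rule_violations]
    # Auto-commit only when every field is confident and no rule failed.
    route = Route.OPERATOR_QUEUE if reasons else Route.AUTO_COMMIT
    return route, reasons
```

Returning the reasons alongside the route means the operator queue can show why a document was held back, and the same list can feed a continuous compliance view.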
Core workflow connections
How the system flows.
- Document Intake (email / EDI / portal / partner) → Type Classification
- OCR + Layout Parse → Language Detection → Field Extraction
- Confidence Scoring → Business Rule Validation → Reference Data Check
- High Confidence → Auto-commit to TMS → Shipment Updated
- Lower Confidence or Exception → Human Operator Queue → Correction → Confirmed
- Model Tuning Loop → Operator Corrections → Continuous Improvement
- Customer Shipment Query → Structured Search → Document Detail Surfaced
- Compliance Watch → Continuous Confidence + Exception Signals
- Multilingual extraction across eight document languages
- Dangerous-goods and certificate-of-origin validation against reference data
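To make the reference data check concrete, here is a minimal Python sketch of validating extracted fields against reference data. The field names, the `validate_fields` function, and the tiny lookup tables are assumptions for illustration. Real reference data would come from maintained sources, and the HS code check here covers format only (6, 8, or 10 digits), not whether the code exists in a tariff schedule.

```python
import re

# Tiny stand-in reference tables, for illustration only.
KNOWN_PORT_CODES = {"DEHAM", "NLRTM", "CNSHA"}  # UN/LOCODE values
KNOWN_UN_NUMBERS = {"1203": "3", "1950": "2"}   # UN number -> hazard class

HS_CODE_RE = re.compile(r"^\d{6}(\d{2}){0,2}$")    # 6, 8 or 10 digits
LOCODE_RE = re.compile(r"^[A-Z]{2}[A-Z2-9]{3}$")   # country + 3-char location
UN_NUMBER_RE = re.compile(r"^(?:UN)?\s?(\d{4})$")  # "UN1203", "UN 1203", "1203"


def validate_fields(fields: dict[str, str]) -> list[str]:
    """Return a list of violations; an empty list means all checks passed."""
    violations = []

    raw_hs = fields.get("hs_code", "")
    hs = re.sub(r"[.\s]", "", raw_hs)  # tolerate "8471.30" style formatting
    if not HS_CODE_RE.match(hs):
        violations.append(f"hs_code: {raw_hs!r} is not 6, 8 or 10 digits")

    port = fields.get("port_of_loading", "").strip().upper()
    if not LOCODE_RE.match(port):
        violations.append(f"port_of_loading: {port!r} is not a valid code format")
    elif port not in KNOWN_PORT_CODES:
        violations.append(f"port_of_loading: {port!r} not found in reference data")

    raw_un = fields.get("un_number")
    if raw_un is not None:  # only dangerous-goods documents carry a UN number
        match = UN_NUMBER_RE.match(raw_un.strip().upper())
        if not match or match.group(1) not in KNOWN_UN_NUMBERS:
            violations.append(f"un_number: {raw_un!r} not found in reference data")
        elif fields.get("hazard_class") != KNOWN_UN_NUMBERS[match.group(1)]:
            violations.append("hazard_class: does not match the declared UN number")

    return violations
```

Any violations returned here would count as business-rule exceptions, so the document would go to the operator queue even if every field was extracted with high confidence.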
Start a project
Drowning in multilingual shipping documents?
We build document extraction pipelines that respect the variation, languages, and validation rules your operators have been carrying in their heads.
No retainer lock-in · Month-to-month · Full transparency