Mistral Launches OCR 4 with 170-Language Support, New Document Analysis Features

Mistral has released the latest version of its optical character recognition tool, OCR 4, bringing support for 170 languages and a set of new analysis features aimed at improving how documents are processed at scale. The update is now available for developers and enterprise users.

What's new in OCR 4

The new version adds three core capabilities: bounding boxes, block classification, and confidence scores. Bounding boxes let developers pinpoint exactly where text appears on a page — useful for forms, tables, and mixed-layout documents. Block classification automatically identifies different sections of a document, separating headings, body text, captions, and other structural elements. Confidence scores assign a reliability rating to each extracted character or word, making it easier to flag uncertain readings for human review.
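To make the three capabilities concrete, here is a minimal Python sketch of how a developer might work with per-block output. Mistral's actual response format is not described in this article, so the `Block` record, its field names, the coordinate convention, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical record for one extracted block. The field names and the
# (x0, y0, x1, y1) top-left/bottom-right convention are assumptions,
# not Mistral's documented schema.
@dataclass(frozen=True)
class Block:
    text: str
    block_type: str                          # e.g. "heading", "body", "caption"
    bbox: Tuple[float, float, float, float]  # (x0, y0, x1, y1) in page units
    confidence: float                        # assumed to lie in [0.0, 1.0]


def blocks_of_type(blocks: List[Block], block_type: str) -> List[Block]:
    """Return only the blocks classified as `block_type`."""
    return [b for b in blocks if b.block_type == block_type]


def blocks_in_region(
    blocks: List[Block], region: Tuple[float, float, float, float]
) -> List[Block]:
    """Return blocks whose bounding box lies entirely inside `region`."""
    rx0, ry0, rx1, ry1 = region
    return [
        b for b in blocks
        if b.bbox[0] >= rx0 and b.bbox[1] >= ry0
        and b.bbox[2] <= rx1 and b.bbox[3] <= ry1
    ]


# Made-up sample page, for illustration only.
page = [
    Block("Quarterly Report", "heading", (50, 40, 400, 80), 0.99),
    Block("Revenue rose in Q3.", "body", (50, 100, 550, 140), 0.97),
    Block("Figure 1: Revenue", "caption", (50, 600, 300, 620), 0.81),
]

headings = blocks_of_type(page, "heading")
top_half = blocks_in_region(page, (0, 0, 600, 400))
```

In this sketch, `headings` holds the single heading block and `top_half` holds the heading and body blocks, since the caption sits lower on the page.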

Wide language coverage

OCR 4 supports 170 languages, covering most major writing systems, from Latin and Cyrillic to Arabic, Devanagari, and CJK scripts. That expands on earlier versions, which focused primarily on a smaller set of European and Asian languages. The company says the broader support lets global teams use a single tool for multilingual document workflows, from scanned contracts to historical archives.

Target users and use cases

The update is aimed at developers building document-processing pipelines, as well as enterprises that need to digitize large volumes of printed material. Bounding boxes and block classification help automate layout analysis, so a system can tell where a table ends and a footnote begins. Confidence scores give downstream applications a way to decide whether to accept a result or flag it for manual verification — critical for regulated industries like finance and healthcare.
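The accept-or-flag decision described above amounts to a threshold check. The Python sketch below shows one way a downstream application might route extracted words. The `(text, confidence)` pair format, the 0-to-1 score range, and the 0.90 cutoff are assumptions chosen for illustration, not values published by Mistral.

```python
from typing import Dict, List, Tuple

# Hypothetical cutoff. A real pipeline would tune this per document type
# and per the cost of an uncaught error.
REVIEW_THRESHOLD = 0.90


def route_words(
    words: List[Tuple[str, float]], threshold: float = REVIEW_THRESHOLD
) -> Dict[str, List[str]]:
    """Split (text, confidence) pairs into accepted and needs-review lists."""
    routed: Dict[str, List[str]] = {"accepted": [], "needs_review": []}
    for text, confidence in words:
        key = "accepted" if confidence >= threshold else "needs_review"
        routed[key].append(text)
    return routed


# Made-up sample: the second word is a plausible misread of "1024".
result = route_words([("Invoice", 0.99), ("1O24", 0.62), ("USD", 0.95)])
```

Here `result["needs_review"]` contains only the low-confidence `"1O24"`, which would go to a human reviewer while the other words pass through. A regulated workflow might set the threshold higher and accept more manual review in exchange for fewer uncaught errors.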

Mistral has not published benchmarks or pricing changes for OCR 4. The tool is available through the company's API and for on-premises deployment. Users of previous versions can upgrade through their existing channels.