Mistral AI Releases OCR 3, a Smaller Optical Character Recognition Model for Structured Document AI

Mistral OCR 3 is the latest version of Mistral AI’s optical character recognition service, which powers the company’s Document AI stack. The model, named mistral-ocr-2512, is priced at $2 per 1,000 pages, with a 50% discount when it is used through the Batch API.

What Is Mistral OCR 3 Optimized For?

Mistral OCR 3 targets typical enterprise document workloads. It is optimized for forms, scanned documents, complex tables, and handwriting. On internal benchmarks drawn from real-world business cases, it achieved a 74% win rate over Mistral OCR 2.

When table formatting is enabled, the model enriches its output with HTML-based table representations. Downstream systems therefore receive both the content and the structural information they need for analytics, agent workflows, and retrieval pipelines.

Role in Mistral Document AI

OCR 3 is part of Mistral Document AI, the company’s document processing capability that combines OCR, structured data extraction, and Document QnA.

It now powers the Document AI Playground in Mistral AI Studio. In this interface, users can upload PDFs or images and get back either cleanly transcribed text or structured JSON. The public API uses the same OCR pipeline as the interactive interface, so teams can move from exploration to production without changing the core model.

Structure, Inputs and Outputs

The OCR processor accepts multiple document formats through a single API. The document field can point to:

document_url: for PDFs, pptx, docx, and similar files.
image_url: for images such as png, jpeg, or avif.
Uploaded or base64-encoded PDFs and images, passed through the same schema.

These options are documented in the OCR processor section of Mistral’s Document AI docs.
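To make the input schema concrete, here is a minimal sketch of a helper that builds a JSON body for the /v1/ocr endpoint. The names document_url, image_url, table_format, and mistral-ocr-2512 come from the article. The "type" discriminator key, the overall body layout, and the helper names build_ocr_request and to_data_url are assumptions made for illustration, so check them against the official API reference before use.

```python
import base64
import os
from typing import Optional
from urllib.parse import urlparse

# Image extensions named in the article; other files are treated as documents.
IMAGE_EXTENSIONS = {".png", ".jpeg", ".jpg", ".avif"}


def to_data_url(data: bytes, mime: str) -> str:
    """Encode raw bytes as a base64 data URL (one way to inline a file)."""
    return f"data:{mime};base64,{base64.b64encode(data).decode('ascii')}"


def build_ocr_request(source: str,
                      model: str = "mistral-ocr-2512",
                      table_format: Optional[str] = None) -> dict:
    """Build a request body for POST /v1/ocr.

    ASSUMPTION: the body layout ({"model", "document": {"type", ...}}) is
    illustrative and not confirmed by the article.
    """
    if source.startswith("data:image/"):
        kind = "image_url"
    else:
        suffix = os.path.splitext(urlparse(source).path)[1].lower()
        kind = "image_url" if suffix in IMAGE_EXTENSIONS else "document_url"

    body = {"model": model, "document": {"type": kind, kind: source}}
    if table_format is not None:
        body["table_format"] = table_format  # e.g. "html" for HTML tables
    return body


request = build_ocr_request("https://example.com/invoice.pdf", table_format="html")
```

The same function covers all three input routes: a hosted PDF, a hosted image, or inline bytes wrapped with to_data_url.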

The response is a JSON object with a pages array. Each page contains an index, a markdown string, a list of images, a list of tables when table_format="html" is used, detected hyperlinks, optional header and footer fields when header or footer extraction is enabled, and a dimensions object with the page size. The response also includes a document_annotation field for structured annotations and a usage_info block for accounting information.

The markdown contains placeholders such as ![img-0.jpeg](img-0.jpeg) and [tbl-3.html](tbl-3.html). These placeholders are mapped back to the actual content through the images and tables arrays, which simplifies downstream reconstruction.
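The placeholder mapping can be sketched in a few lines of Python. The snippet below swaps each placeholder in a page's markdown for the matching entry from the tables or images array. The per-item keys "id", "content", and "image_base64", as well as the sample page, are assumptions made for illustration; the article only states that the arrays exist.

```python
import re

# Matches both image placeholders ![name](target) and table links [name](target).
PLACEHOLDER = re.compile(r"!?\[([^\]]+)\]\(([^)]+)\)")


def resolve_placeholders(page: dict) -> str:
    """Inline table HTML and image data into a page's markdown.

    ASSUMPTION: each table has "id" and "content" keys and each image has
    "id" and "image_base64" keys. Verify against a real response.
    """
    tables = {t["id"]: t["content"] for t in page.get("tables", [])}
    images = {i["id"]: i.get("image_base64") for i in page.get("images", [])}

    def substitute(match: re.Match) -> str:
        target = match.group(2)
        if target in tables:
            return tables[target]
        if images.get(target):
            return f'<img alt="{target}" src="{images[target]}">'
        return match.group(0)  # leave unknown links untouched

    return PLACEHOLDER.sub(substitute, page["markdown"])


# Hypothetical page, shaped like the response described above.
sample_page = {
    "index": 0,
    "markdown": "Totals\n\n[tbl-3.html](tbl-3.html)\n\n![img-0.jpeg](img-0.jpeg)",
    "tables": [{"id": "tbl-3.html", "content": "<table><tr><td>42</td></tr></table>"}],
    "images": [{"id": "img-0.jpeg", "image_base64": "data:image/jpeg;base64,AAAA"}],
}
resolved = resolve_placeholders(sample_page)
```

Links that match neither array are left as they are, so ordinary hyperlinks in the markdown survive the pass.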

Upgrades Over Mistral OCR 2

Mistral OCR 3 brings several tangible improvements over OCR 2. The public release notes highlight four key areas.

Handwriting: Mistral OCR 3 more accurately interprets handwritten text, mixed-content annotations, and handwriting placed on top of printed templates.
Forms: The model is better at detecting boxes, labels, and handwritten entries in documents with dense layouts, such as invoices and receipts.
Scanned and complex documents: The model is more robust to compression artifacts, skew, distortion, low DPI, and background noise in scanned pages.
Complex tables: The model reconstructs table structure, including headers and merged cells, and can return HTML tables with proper colspan and rowspan tags so that the layout is preserved.
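Because merged cells arrive as colspan and rowspan attributes, a consumer that wants a plain rectangular grid has to expand them. The following is a minimal sketch using only the Python standard library. The sample table is hypothetical, and the approach (copying a merged cell's text into every position it spans) is one possible design choice rather than anything prescribed by Mistral.

```python
from html.parser import HTMLParser
from typing import List


class TableCells(HTMLParser):
    """Collect (text, colspan, rowspan) for every cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None  # [text parts, colspan, rowspan] of the open cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            if not self.rows:  # tolerate a cell that appears before any <tr>
                self.rows.append([])
            a = dict(attrs)
            self._cell = [[], int(a.get("colspan") or 1), int(a.get("rowspan") or 1)]

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[0].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            parts, colspan, rowspan = self._cell
            self.rows[-1].append(("".join(parts).strip(), colspan, rowspan))
            self._cell = None


def expand_table(html: str) -> List[List[str]]:
    """Expand merged cells so every row has the same number of columns."""
    parser = TableCells()
    parser.feed(html)
    grid = {}  # (row, column) -> cell text
    for r, row in enumerate(parser.rows):
        c = 0
        for text, colspan, rowspan in row:
            while (r, c) in grid:  # skip slots filled by a rowspan from above
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
    if not grid:
        return []
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


sample_html = (
    '<table><tr><th rowspan="2">Item</th><th colspan="2">Price</th></tr>'
    "<tr><th>Net</th><th>Gross</th></tr>"
    "<tr><td>Pen</td><td>1.00</td><td>1.20</td></tr></table>"
)
grid = expand_table(sample_html)
```

The resulting grid can be loaded into a dataframe or CSV writer without any further handling of merged cells.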

https://mistral.ai/news/mistral-ocr-3

Pricing, Batch Inference, and Annotations

The OCR 3 model card lists pricing at $2 per 1,000 pages for standard OCR and $3 per 1,000 pages for structured annotations.

OCR 3 is also exposed through Mistral’s Batch Inference API at /v1/batch, which is described in the batching section of the platform documentation. Jobs that run through the batch pipeline receive a 50% discount, which halves the price of OCR.
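The pricing arithmetic is simple enough to sketch as a small estimator. The rates and the 50% batch discount are the figures quoted in the article. The function name estimate_cost is hypothetical, and applying the batch discount to annotated pages is an assumption, since the article only spells out the discounted price for standard OCR.

```python
from decimal import Decimal

# USD per 1,000 pages, as quoted in the article.
PRICE_PER_1000 = {"standard": Decimal("2"), "annotated": Decimal("3")}
BATCH_DISCOUNT = Decimal("0.5")  # 50% off for jobs run through the Batch API


def estimate_cost(pages: int, kind: str = "standard", batch: bool = False) -> Decimal:
    """Estimate the OCR cost in USD for a given page count.

    ASSUMPTION: the batch discount is applied to whichever rate is selected;
    the article only confirms it for standard OCR.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    cost = PRICE_PER_1000[kind] * pages / 1000
    if batch:
        cost *= 1 - BATCH_DISCOUNT
    return cost


standard = estimate_cost(250_000)             # 250,000 pages at $2 per 1,000
batched = estimate_cost(250_000, batch=True)  # same job through the Batch API
```

For a 250,000-page archive this works out to $500 at the standard rate and $250 through the batch pipeline.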

On the same endpoint, the model integrates with two important features: structured annotations and BBox extraction. They let developers attach schema-driven labels to specific regions of a document and retrieve bounding boxes for text and other elements. This is useful when mapping content to downstream systems or UI overlays.

Key Takeaways

Model and role: Mistral OCR 3, named mistral-ocr-2512, is the new OCR service that powers Mistral’s Document AI stack for page-level document understanding.
Accuracy gains: On internal benchmarks covering forms, scanned documents, complex tables, and handwriting, OCR 3 achieves a 74% win rate over Mistral OCR 2. Mistral positions it as state of the art against both traditional OCR and AI-based systems.
Structured outputs for RAG: The service extracts interleaved text and images and returns markdown with reconstructed HTML tables that preserve each table’s layout. The outputs can be fed into RAG, agent, and search pipelines with little extra parsing.
API and document formats: OCR 3 is available to developers through the /v1/ocr endpoint or the SDK. PDFs can be passed as document_url and images such as jpeg or png as image_url. Optional features include HTML tables, header and footer extraction, and base64 images.
Pricing and batch processing: OCR 3 costs $2 per 1,000 pages, or $3 per 1,000 pages with annotations. With the Batch API, the price of standard OCR falls to $1 per 1,000 pages for large-volume processing.


Michal Sutter is a data scientist with a master’s degree in Data Science from the University of Padova. With a strong foundation in statistics, machine learning, and data engineering, he excels at turning large datasets into actionable insights.
