Mistral OCR 3 is the latest version of Mistral OCR, Mistral AI's optical character recognition service that powers its Document AI stack. The model is named mistral-ocr-2512. It is priced at $2 per 1,000 pages, with a 50% discount when used through the Batch API.
What is Mistral OCR 3 optimized for?
Mistral OCR 3 targets typical enterprise document workloads. It is optimized for forms, scanned documents, complex tables, and handwriting. On internal benchmarks drawn from real-world business use cases, it achieved a 74% win rate over Mistral OCR 2.
When table formatting is enabled, the output is enriched with HTML-based table representations. This combination of content and structure is what downstream systems need for analytics, agent workflows, and retrieval pipelines.
Role in Mistral Document AI
OCR 3 is part of Mistral Document AI, the company's document processing capability that combines OCR, structured data extraction, and Document QnA.
It now powers the Document AI Playground in Mistral AI Studio. In this interface, users can upload PDFs or images and get back either clean transcribed text or structured JSON. The public API uses the same OCR pipeline as the interactive interface, so teams can move from exploration to production workloads without changing the core model.
Structure, Inputs and Outputs
The OCR processor accepts multiple document types through a single API. The document field can point to:
- document_url: documents such as PDF, pptx, and docx.
- image_url: images such as jpeg, avif, and png.
- Uploaded or base64-encoded PDFs or images through the same schema.
These input types are documented in the OCR processor section of Mistral's Document AI docs.
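The input types above can be sketched as a small request builder. This is a minimal illustration, not Mistral's reference client: the helper name build_ocr_request is hypothetical, the payload layout (a document object with a type discriminator) and the table_format option are assumptions based on the description in this article, and the exact field names should be checked against the current API reference before use.

```python
import json
from typing import Optional

ALLOWED_KINDS = ("document_url", "image_url")


def build_ocr_request(source_url: str,
                      kind: str = "document_url",
                      table_format: Optional[str] = None) -> dict:
    """Assemble a JSON body for an OCR request (assumed layout, see lead-in)."""
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"kind must be one of {ALLOWED_KINDS}, got {kind!r}")
    payload = {
        "model": "mistral-ocr-2512",
        # The discriminator value doubles as the key that holds the URL.
        "document": {"type": kind, kind: source_url},
    }
    if table_format is not None:
        # Assumed option name, taken from table_format="html" in the text.
        payload["table_format"] = table_format
    return payload


# Example: a PDF by URL with HTML table output requested.
request_body = build_ocr_request("https://example.com/report.pdf",
                                 table_format="html")
print(json.dumps(request_body, indent=2))
```

The body would then be sent to the OCR endpoint with an API key; that step is left out so the sketch stays runnable offline.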
The response is a JSON object with a pages array. Each page contains an index, a markdown string, a list of images, a list of tables when table_format="html" is used, detected hyperlinks, optional header and footer fields when header or footer extraction is enabled, and a dimensions object with the page size. The response also includes a document_annotation field for structured annotations and a usage_info block with accounting information.
When images and HTML tables are extracted, the markdown contains placeholders such as [tbl-3.html](tbl-3.html). These placeholders are mapped back to the actual content using the images and tables arrays in the response, which simplifies downstream reconstruction.
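The placeholder mapping can be sketched as follows. This is a minimal sketch under stated assumptions: each entry in a page's tables array is assumed to carry an id matching the placeholder name and a content field holding the HTML. Those two field names, and the helper name inline_tables, are illustrative and should be verified against a real response.

```python
def inline_tables(page: dict) -> str:
    """Replace table placeholders in a page's markdown with their HTML.

    Assumes each table entry looks like {"id": "tbl-3.html", "content": "<table>..."};
    these field names are an assumption, not confirmed by the article.
    """
    markdown = page.get("markdown", "")
    for table in page.get("tables") or []:
        table_id = table["id"]
        # Placeholders have the form [tbl-3.html](tbl-3.html).
        placeholder = f"[{table_id}]({table_id})"
        markdown = markdown.replace(placeholder, table["content"])
    return markdown


# Example with a hand-written page object.
sample_page = {
    "index": 0,
    "markdown": "Quarterly totals\n\n[tbl-3.html](tbl-3.html)\n",
    "tables": [
        {"id": "tbl-3.html", "content": "<table><tr><td>42</td></tr></table>"},
    ],
}
print(inline_tables(sample_page))
```

The same pattern would apply to image placeholders, using the images array instead of the tables array.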
Upgrades Over Mistral OCR 2
Mistral OCR 3 brings several tangible improvements over OCR 2. The public release notes highlight four key areas.
- Handwriting: Mistral OCR 3 more accurately interprets handwritten text, mixed-content annotations, and handwriting placed on printed templates.
- Forms: The model is better at detecting boxes, labels, and handwritten entries in documents with dense layouts, such as invoices and receipts.
- Scanned and complex documents: The model is more tolerant of compression artifacts, skew, and distortion, and more robust to low-DPI scans with background noise.
- Complex tables: The model reconstructs table structure, including headers and merged cells, and can return HTML tables with proper colspan and rowspan attributes so that the layout is preserved.
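Because merged cells come back as colspan and rowspan attributes, a consumer that needs a rectangular grid has to expand them. The sketch below shows one way to do that with the Python standard library only. The names TableCellCollector and html_table_to_grid are hypothetical, the sample HTML is hand-written rather than real OCR output, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableCellCollector(HTMLParser):
    """Collect cells row by row, keeping each cell's colspan and rowspan."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            attr = dict(attrs)
            self._cell = {
                "text": "",
                "colspan": int(attr.get("colspan") or 1),
                "rowspan": int(attr.get("rowspan") or 1),
            }

    def handle_data(self, data):
        if self._cell is not None:
            self._cell["text"] += data

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            if not self.rows:  # tolerate a cell that appears before any <tr>
                self.rows.append([])
            self.rows[-1].append(self._cell)
            self._cell = None


def html_table_to_grid(html: str) -> list:
    """Expand merged cells so every grid position holds the covering cell's text."""
    collector = TableCellCollector()
    collector.feed(html)
    grid = {}  # (row, col) -> text
    for r, row in enumerate(collector.rows):
        c = 0
        for cell in row:
            while (r, c) in grid:  # skip slots filled by a rowspan from above
                c += 1
            for dr in range(cell["rowspan"]):
                for dc in range(cell["colspan"]):
                    grid[(r + dr, c + dc)] = cell["text"].strip()
            c += cell["colspan"]
    if not grid:
        return []
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


sample_html = (
    "<table>"
    "<tr><th colspan='2'>Totals</th><th rowspan='2'>Notes</th></tr>"
    "<tr><td>Q1</td><td>Q2</td></tr>"
    "</table>"
)
for line in html_table_to_grid(sample_html):
    print(line)
```

Repeating the merged cell's text in every slot it covers is a design choice that suits analytics and retrieval; a renderer would more likely keep the span attributes as they are.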
Pricing, Batch Inference, and Annotations
The OCR 3 model card lists pricing at $2 per 1,000 pages for standard OCR and $3 per 1,000 pages for structured annotations.
OCR 3 is also exposed through Mistral's Batch Inference API, /v1/batch, which is described in the batching section of the platform documentation. Jobs that run through the batch pipeline receive a 50% discount, which halves the effective price of standard OCR to $1 per 1,000 pages.
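The pricing arithmetic can be worked through with a short helper. This is a sketch using only the rates quoted in this article. The function name estimate_ocr_cost is hypothetical, and applying the batch discount to annotated pages is an assumption, since the article only states the discounted price for standard OCR.

```python
STANDARD_USD_PER_1000 = 2.0   # standard OCR pages
ANNOTATED_USD_PER_1000 = 3.0  # pages with structured annotations
BATCH_DISCOUNT = 0.5          # 50% off through the Batch API


def estimate_ocr_cost(pages: int, annotated: bool = False, batch: bool = False) -> float:
    """Estimate the cost in USD for a number of pages at the quoted rates."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    rate = ANNOTATED_USD_PER_1000 if annotated else STANDARD_USD_PER_1000
    if batch:
        # Assumption: the discount applies equally to annotated pages.
        rate *= 1 - BATCH_DISCOUNT
    return pages / 1000 * rate


print(estimate_ocr_cost(250_000))                  # 500.0
print(estimate_ocr_cost(250_000, batch=True))      # 250.0
print(estimate_ocr_cost(250_000, annotated=True))  # 750.0
```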
The model integrates with two important features on the same endpoint: Annotations (Structured) and BBox Extraction. These let developers attach schema-driven labels to specific regions of a document and retrieve bounding boxes for text and other elements. This is useful when mapping content to downstream systems or UI overlays.
Key Takeaways
- Model and role: Mistral OCR 3, named mistral-ocr-2512, is the new OCR service that powers Mistral's Document AI stack for page-level document understanding.
- Accuracy gains: On internal benchmarks covering forms, scanned documents, complex tables, and handwriting, OCR 3 achieves a 74% win rate over Mistral OCR 2. Mistral positions it as state of the art against both traditional OCR and AI-based systems.
- Structured outputs for RAG: The service extracts interleaved text and images and returns markdown with reconstructed HTML tables, preserving each table's layout. The outputs can be fed into RAG, agent, and search pipelines with little extra parsing.
- API and document formats: Developers can access OCR 3 through the /v1/ocr endpoint or the SDK, passing PDFs as document_url and images such as jpeg or png as image_url. Options such as HTML table output, header or footer extraction, and base64 images can also be enabled.
- Pricing and batch processing: OCR 3 costs $2 per 1,000 pages and $3 per 1,000 annotated pages. With the Batch API, the price of standard OCR drops to $1 per 1,000 pages for large-volume processing.


