World of Software
Mistral AI Launches API for LLM-Based OCR of Multimodal Documents

News Room
Last updated: 2025/03/31 at 7:33 AM
News Room Published 31 March 2025

Now available on Mistral’s la Plateforme SaaS, Mistral OCR is an OCR solution for digitizing complex documents that interleave text with images, tables, mathematical expressions, and advanced layouts. According to the company, this makes it particularly suitable for digitizing scientific research, historical documents and artifacts, user manuals, and more.

Mistral OCR uses Mistral LLMs to understand the content extracted by OCR-ing a document. This helps it grasp the document's context and the relationships between its elements, which makes it suitable for RAG systems that take multimodal documents as input.

According to the company’s own benchmarking, Mistral OCR outperforms other leading OCR solutions, including Google Document AI, Azure OCR, Gemini 1.5 and 2.0, and GPT-4o.

According to Mistral AI, unlike other models, Mistral OCR comprehends each element of a document (media, text, tables, equations) with unprecedented accuracy and cognition. It takes images and PDFs as input and extracts their content as ordered, interleaved text and images.

Mistral AI maintains that its OCR API is the only one that extracts embedded images from documents along with text. The resulting text and images are exported to a Markdown file. Structured output formats such as JSON are also supported, so that OCR output can be chained into a more complex workflow, which can be useful for building agents.
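
As a rough illustration of chaining OCR output into a larger workflow, the sketch below flattens a page-oriented result into JSON records that a later step (an indexer or an agent tool, say) could consume. The input shape (a `pages` list whose entries carry `index`, `markdown`, and `images`) and the helper name `to_records` are assumptions made for this example, not a schema confirmed by the article.

```python
import json


def to_records(ocr_result: dict) -> list[dict]:
    """Flatten an OCR result into one record per page (assumed schema)."""
    records = []
    for page in ocr_result.get("pages", []):
        records.append({
            "page": page["index"],
            "text": page["markdown"],
            # Keep only the image identifiers; the image data stays in the source result
            "image_ids": [img["id"] for img in page.get("images", [])],
        })
    return records


# Hypothetical sample data standing in for a real OCR response
sample = {
    "pages": [
        {
            "index": 0,
            "markdown": "# Title\n\n![img-0.jpeg](img-0.jpeg)",
            "images": [{"id": "img-0.jpeg"}],
        },
        {"index": 1, "markdown": "Body text", "images": []},
    ]
}

records = to_records(sample)
print(json.dumps(records, indent=2))
```

Keeping one record per page preserves the reading order, which matters when the records are later split into chunks for retrieval.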

When it comes to multilingual support, Mistral AI emphasizes its solution can parse, understand, and transcribe thousands of scripts, fonts, and languages.

Mistral OCR already powers le Chat, Mistral’s LLM-powered chat solution, and will soon be available for on-premises deployments. According to the company, it can process up to 2,000 pages per minute on a single node.

To use the Mistral OCR API in Python, install the mistralai package, which handles authentication and exposes all capabilities of the Mistral API. To process a file, you first need to upload it, as shown in the following snippet:


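import json
import os
from pathlib import Path

from mistralai import DocumentURLChunk, Mistral

# Create the API client (reading the key from MISTRAL_API_KEY is an assumption)
client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# Hypothetical path to the PDF to process
pdf_file = Path("document.pdf")

# Upload PDF file to Mistral's OCR service
uploaded_file = client.files.upload(
    file={
        "file_name": pdf_file.stem,
        "content": pdf_file.read_bytes(),
    },
    purpose="ocr",
)

# Get a signed URL for the uploaded file
signed_url = client.files.get_signed_url(file_id=uploaded_file.id, expiry=1)

# Process PDF with OCR, including embedded images
pdf_response = client.ocr.process(
    document=DocumentURLChunk(document_url=signed_url.url),
    model="mistral-ocr-latest",
    include_image_base64=True,
)

# Convert response to a plain Python dict via its JSON representation
response_dict = json.loads(pdf_response.model_dump_json())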
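
Once the response is available as a dictionary, the per-page Markdown can be stitched into a single document with the extracted images inlined. The following is a minimal sketch that works on plain dictionaries; it assumes each page carries a `markdown` string and an `images` list with `id` and `image_base64` fields, and that images are referenced in the Markdown as `![id](id)`. The helper names and the sample data are hypothetical.

```python
def inline_images(markdown: str, images: list[dict]) -> str:
    """Replace ![id](id) placeholders with the image's base64 payload."""
    for img in images:
        placeholder = f"![{img['id']}]({img['id']})"
        markdown = markdown.replace(
            placeholder, f"![{img['id']}]({img['image_base64']})"
        )
    return markdown


def combined_markdown(response: dict) -> str:
    """Join all pages, in order, into one Markdown document."""
    pages = response.get("pages", [])
    return "\n\n".join(
        inline_images(page["markdown"], page.get("images", [])) for page in pages
    )


# Hypothetical sample shaped like the assumed response dictionary
sample_response = {
    "pages": [
        {
            "markdown": "Intro\n\n![img-0.jpeg](img-0.jpeg)",
            "images": [
                {"id": "img-0.jpeg", "image_base64": "data:image/jpeg;base64,AAAA"}
            ],
        },
        {"markdown": "More text", "images": []},
    ]
}

document = combined_markdown(sample_response)
print(document)
```

With a real response, `combined_markdown(response_dict)` would be called instead of passing the sample data.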

The API is currently limited to files of at most 50 MB in size and 1,000 pages in length. Pricing is set at 1 USD per 1,000 pages, or 1 USD per 2,000 pages when using batch OCR.
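
To make the limits and pricing concrete, here is a small sketch that checks a document against the stated limits and estimates the cost of a job. The constant and function names are invented for this example, and treating 50 MB as 50 × 1024 × 1024 bytes is an assumption; the figures themselves come from the paragraph above.

```python
MAX_FILE_BYTES = 50 * 1024 * 1024  # 50 MB, interpreted here as 50 MiB (assumption)
MAX_PAGES = 1_000                  # page limit per file
PAGES_PER_USD = 1_000              # standard pricing
PAGES_PER_USD_BATCH = 2_000        # batch OCR pricing


def within_limits(size_bytes: int, pages: int) -> bool:
    """Return True if a file fits both the size and the page limit."""
    return size_bytes <= MAX_FILE_BYTES and pages <= MAX_PAGES


def estimated_cost_usd(pages: int, batch: bool = False) -> float:
    """Estimate the cost in USD of OCR-ing the given number of pages."""
    rate = PAGES_PER_USD_BATCH if batch else PAGES_PER_USD
    return pages / rate


print(within_limits(10 * 1024 * 1024, 300))    # True
print(estimated_cost_usd(5_000))               # 5.0
print(estimated_cost_usd(5_000, batch=True))   # 2.5
```

A 5,000-page job would have to be split across several files to respect the page limit, but the cost estimate applies to the total page count either way.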
