GLM-OCR has accumulated over 3M downloads. We are releasing its technical report: https://t.co/KHFgnnDfYh We welcome your feedback!
GLM-OCR Hits 3M Downloads, Technical Report Released on arXiv
Zhipu AI · Updated
GLM-OCR, a 0.9B-parameter multimodal model for document understanding, has crossed 3 million downloads. Z.ai (Zhipu AI) has released a technical report that details the model's architecture and covers four tasks: document parsing, table recovery, formula transcription, and key information extraction.
Evaluations on public benchmarks and industrial scenarios show GLM-OCR achieving competitive or state-of-the-art performance on all four of these tasks. Its compact architecture targets both edge deployment and large-scale production systems.
To check whether the multi-token prediction (MTP) throughput gains hold for your workload, run GLM-OCR on your own document processing pipeline. The technical report covers full benchmark results and architecture specs.
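One way to run that kind of check is a small timing harness around whatever inference call your pipeline already uses. The sketch below is illustrative only: `measure_throughput` and `fake_ocr` are hypothetical names, and `fake_ocr` is a stand-in that sleeps instead of calling GLM-OCR, so swap in your own function before trusting any numbers.

```python
import time
from typing import Callable, Iterable


def measure_throughput(run_ocr: Callable[[str], str], pages: Iterable[str]) -> dict:
    """Time a caller-supplied OCR function over a batch of page paths."""
    latencies = []
    for page in pages:
        start = time.perf_counter()
        run_ocr(page)  # result is discarded; only timing is recorded
        latencies.append(time.perf_counter() - start)

    total = sum(latencies)
    return {
        "pages": len(latencies),
        "total_seconds": total,
        # Guard against an empty batch or a zero-duration run.
        "pages_per_second": len(latencies) / total if total > 0 else 0.0,
        "max_latency_seconds": max(latencies) if latencies else 0.0,
    }


def fake_ocr(path: str) -> str:
    # Hypothetical stand-in for a real model call (assumption, not GLM-OCR's API).
    time.sleep(0.001)
    return f"text from {path}"


stats = measure_throughput(fake_ocr, [f"page_{i}.png" for i in range(5)])
print(stats)
```

Running the same harness with two configurations on a representative sample of your own documents gives a like-for-like comparison that published benchmarks cannot.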

