MPDocBench-Parse: Benchmarking Practical Multi-page Document Parsing

arXiv cs.AI

arXiv:2605.22100v1

Abstract: Document parsing converts visually rich documents into machine-readable structured representations, forming a crucial foundation for information systems. Although many benchmarks have been proposed for document parsing, they remain inadequate for realistic scenarios. Existing benchmarks either focus on specific tasks or assess only single-page, text-centric settings, making them insufficient for practical multi-page parsing. Moreover, they lack fin

Published 23 May 2026

