Page image classification for content-specific data processing

📰 ArXiv cs.AI

arXiv:2507.21114v3 Announce Type: replace-cross Abstract: Digitization projects in the humanities often generate vast quantities of page images from historical documents, which makes manual sorting and analysis impractical. These archives contain diverse content, including various text types (handwritten, typed, printed), graphical elements (drawings, maps, photos), and layouts (plain text, tables, forms). Efficiently processing this heterogeneous data requires automated methods to cat

Published 5 May 2026
