arxiv:2603.23511

DISCO: Document Intelligence Suite for COmparative Evaluation

Published on Mar 4

Abstract

AI-generated summary: DISCO evaluates OCR pipelines and vision-language models for document intelligence tasks, revealing varying performance across document types and highlighting the importance of task-aware approach selection.

Document intelligence requires accurate text extraction and reliable reasoning over document content. We introduce DISCO, a Document Intelligence Suite for COmparative Evaluation that evaluates optical character recognition (OCR) pipelines and vision-language models (VLMs) separately on parsing and question answering across diverse document types, including handwritten text, multilingual scripts, medical forms, infographics, and multi-page documents. Our evaluation shows that performance varies substantially across tasks and document characteristics, underscoring the need for complexity-aware approach selection. OCR pipelines are generally more reliable for handwriting and for long or multi-page documents, where explicit text grounding supports text-heavy reasoning, while VLMs perform better on multilingual text and visually rich layouts. Task-aware prompting yields mixed effects, improving performance on some document types while degrading it on others. These findings provide empirical guidance for selecting document processing strategies based on document structure and reasoning demands.
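The qualitative guidance in the abstract can be read as a simple routing rule. The sketch below is only an illustration of that reading, not code from DISCO: the `DocumentTraits` fields, the `suggest_approach` function, and the `"evaluate_both"` fallback are all hypothetical names introduced here. The abstract does not say what to do when signals conflict (for example, handwritten multilingual text), so the sketch assumes both approaches should be evaluated in that case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentTraits:
    """Hypothetical description of a document; field names are illustrative."""
    handwritten: bool = False
    multi_page: bool = False      # long or multi-page documents
    multilingual: bool = False
    visually_rich: bool = False   # e.g. infographics, complex layouts


def suggest_approach(traits: DocumentTraits) -> str:
    """Return "ocr", "vlm", or "evaluate_both", following the abstract's trends."""
    # Trends reported in the abstract: OCR pipelines are generally more
    # reliable for handwriting and long or multi-page documents.
    favors_ocr = traits.handwritten or traits.multi_page
    # VLMs perform better on multilingual text and visually rich layouts.
    favors_vlm = traits.multilingual or traits.visually_rich

    if favors_ocr and not favors_vlm:
        return "ocr"
    if favors_vlm and not favors_ocr:
        return "vlm"
    # Assumption: with conflicting or absent signals the abstract gives
    # no rule, so this sketch defers to comparing both approaches.
    return "evaluate_both"
```

For instance, `suggest_approach(DocumentTraits(handwritten=True))` selects the OCR branch, while a handwritten and multilingual document falls through to `"evaluate_both"`. Since the abstract reports general tendencies rather than hard thresholds, a rule like this is at most a starting point to check against measurements on one's own documents.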
