MADP: A Multi-Agent Pipeline for Sustainable Document Processing with Human-in-the-Loop — AI agent app

Diego Gosmar, Giovanni Zenezini

Document processing automation remains a critical challenge in enterprise environments, where traditional manual approaches are labor-intensive and error-prone. We present MADP, a multi-agent architecture that combines deep learning-based classification and parsing with large language model (LLM) extraction, while maintaining accuracy through selective human validation. The system integrates five specialized agents (Classificator, Splitter, Parser, Extraction, and Validator) with a Human-in-the-Loop (HITL) mechanism and a novel Prompt Fine Tuning with Feedback Inheritance (PFTFI) approach. An operational analysis of a production use case of 100,000 invoices per year indicates a potential reduction in Full-Time Equivalent (FTE) requirements of approximately 70%. In production deployment, 955 real-world documents processed through January 2026 achieved a 97.0% full-pipeline automation rate, with only 3% requiring non-AI fallback. An ablation evaluation on a stratified 100-document subset (5 documents from each of 20 supplier/document-type categories) shows that the full MADP configuration with HITL supervision attains 98.5% document-level accuracy. Additionally, we present a comprehensive sustainability analysis showing that the hybrid AI+HITL approach reduces CO2 emissions by 69%, energy consumption by 69%, and water usage by 63% compared to traditional manual processing. Benchmark comparisons of multiple LLM backends (Granite-Docling, Mistral-Small, DeepSeek-OCR) provide practical insights for production deployment.
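To make the agent sequence in the abstract concrete, here is a minimal Python sketch of a five-stage pipeline with a confidence-based human-review flag. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not describe the agents' interfaces, so the `Document` fields, the function bodies, and the `0.8` threshold are all hypothetical placeholders. In the real system these stages are backed by deep learning models and LLMs rather than string matching.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical container passed from agent to agent."""
    text: str
    doc_type: str = ""
    pages: List[str] = field(default_factory=list)
    parsed: List[str] = field(default_factory=list)
    fields: Dict[str, str] = field(default_factory=dict)
    confidence: float = 0.0
    needs_human: bool = False


def classificator(doc: Document) -> Document:
    # Placeholder rule; stands in for a learned classifier.
    doc.doc_type = "invoice" if "invoice" in doc.text.lower() else "other"
    return doc


def splitter(doc: Document) -> Document:
    # Assumption: pages are separated by form-feed characters.
    doc.pages = [p for p in doc.text.split("\f") if p.strip()]
    return doc


def parser(doc: Document) -> Document:
    # Normalize whitespace on each page.
    doc.parsed = [" ".join(p.split()) for p in doc.pages]
    return doc


def extraction(doc: Document) -> Document:
    # Placeholder for LLM extraction: take the token after "Total:".
    for page in doc.parsed:
        tokens = page.split()
        for i, tok in enumerate(tokens[:-1]):
            if tok.lower() == "total:":
                doc.fields["total"] = tokens[i + 1]
    # Made-up confidence values, for illustration only.
    doc.confidence = 0.95 if "total" in doc.fields else 0.40
    return doc


def validator(doc: Document, threshold: float = 0.8) -> Document:
    # Selective HITL: flag only low-confidence documents for review.
    doc.needs_human = doc.confidence < threshold
    return doc


PIPELINE: List[Callable[[Document], Document]] = [
    classificator, splitter, parser, extraction, validator,
]


def run_pipeline(text: str) -> Document:
    doc = Document(text=text)
    for agent in PIPELINE:
        doc = agent(doc)
    return doc


auto = run_pipeline("Invoice 001\nTotal: 120.50")
review = run_pipeline("Invoice 002\nAmount missing")
print(auto.fields, auto.needs_human)
print(review.fields, review.needs_human)
```

In this sketch the first document passes through without review, while the second is flagged because no total could be extracted. Routing only flagged documents to a person is one simple way to read "selective human validation"; the paper's actual routing criteria may differ.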


Stars: 0
Forks: 0
HF Downloads (30d): not listed
Last commit: not listed
Refreshed: 9d ago
Project health: Unknown (no activity data)
Production readiness: Research / Early. Best for exploration and prototyping.
Risk notes: Unknown license. Verify license before production use.

AgentHub Score

48 / 100

Composite score from 6 signals.

Active project


Growth: 40 (C)
Activity: 30 (C)
Documentation: 70 (C+)
Maturity: 45 (C)
Community: 42 (C)
Production: 58 (C)

GitHub stars · 90 days: 0 (+0.0%)

[Chart: commit activity over 52 weeks]

Practical assessment

Should you use it?

✓ Best for

Research and experimentation
Prototype development
Learning agentic patterns

◎ Strengths

Active community
Open source
Well-documented API

✕ Not ideal for

Large-scale deployments without prior validation
Teams without AI/ML expertise

⚠ Watch-outs

Review changelog before updating
Verify license for commercial use

Technical details

What's inside

Language: not listed
License: not listed
Source: arXiv
Open source: No
Commercial use: not listed
Docs: not listed
Demo: not listed
Paper: arXiv

AgentHub Score: 48/100 (below average)

Alternatives

crewai (26.1k · Multi-Agent)
autogen (42.7k · Multi-Agent)
smolagents (11.2k · Coding)
openai-agents-python (9.4k · Multi-Agent)

Recent activity

Latest commit: not listed
Indexed by AgentHub crawler: 9d ago
Monitor for new releases: ongoing