Towards Human-Like Interactive Speech Recognition With Agentic Correction and Semantic Evaluation

Zixuan Jiang, Yanqiao Zhu, Peng Wang, Qinyuan Chen, Xinjian Zhao, Xipeng Qiu, Wupeng Wang, Zhifu Gao, Xiangang Li, Kai Yu, Xie Chen

Automatic speech recognition (ASR) is a core component of human-computer interaction and an increasingly important front-end for LLM-based assistants and agents. However, most current ASR systems still follow a single-pass paradigm, which is poorly aligned with human communication, where misunderstandings are resolved through iterative clarification and refinement. This mismatch makes it difficult to correct meaning-critical errors once they occur. Meanwhile, token-level metrics such as WER and CER do not adequately capture such meaning-critical errors. To address these limitations, we formulate Interactive ASR as a multi-turn refinement task and propose Agentic ASR, a closed-loop framework that combines a single-pass ASR front-end with semantic correction, intent routing, and reasoning-based editing. We further introduce the Sentence-level Semantic Error Rate ($S^2ER$), an LLM-based semantic evaluation metric, together with an Interactive Simulation System for scalable and reproducible benchmarking. Experiments on multilingual, named-entity-intensive, and code-switching benchmarks show that iterative interaction consistently reduces semantic errors, with much larger gains in $S^2ER$ than in conventional token-level metrics. Human-AI alignment and ablation studies further validate the reliability of the semantic judge and the robustness of the proposed framework. The code is available at https://interactiveasr.github.io/ and a live demo is available at https://i-asr.sjtuxlance.com/.
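
The abstract does not spell out how $S^2ER$ is computed or how the refinement loop is implemented, so the following is only a minimal sketch under stated assumptions: the metric is taken to be the fraction of sentences that a judge marks as not semantically equivalent to the reference, and the interaction is modeled as repeated correction until the judge accepts or a turn budget runs out. The names `sentence_semantic_error_rate`, `interactive_refine`, `toy_judge`, and `toy_correct` are hypothetical; the toy judge and corrector are simple string-based stand-ins for the LLM-based judge and the simulated user described in the abstract, not the authors' components.

```python
from typing import Callable, List, Tuple

# Assumed interfaces (hypothetical, not taken from the paper):
#   Judge(reference, hypothesis)     -> True if the meaning is preserved
#   Corrector(hypothesis, reference) -> revised hypothesis after one turn of feedback
Judge = Callable[[str, str], bool]
Corrector = Callable[[str, str], str]


def sentence_semantic_error_rate(pairs: List[Tuple[str, str]], judge: Judge) -> float:
    """Assumed definition: share of (reference, hypothesis) pairs the judge rejects."""
    if not pairs:
        raise ValueError("need at least one sentence pair")
    errors = sum(1 for ref, hyp in pairs if not judge(ref, hyp))
    return errors / len(pairs)


def toy_judge(reference: str, hypothesis: str) -> bool:
    """Stand-in for an LLM judge: match after dropping case and punctuation."""
    def norm(s: str) -> List[str]:
        kept = "".join(ch for ch in s.lower() if ch.isalnum() or ch.isspace())
        return kept.split()
    return norm(reference) == norm(hypothesis)


def toy_correct(hypothesis: str, reference: str) -> str:
    """Stand-in for one turn of user feedback: fix the first mismatching word."""
    hyp, ref = hypothesis.split(), reference.split()
    if len(hyp) != len(ref):
        return reference  # fall back to a full restatement
    for i, (h, r) in enumerate(zip(hyp, ref)):
        if h != r:
            hyp[i] = r
            break
    return " ".join(hyp)


def interactive_refine(hypothesis: str, reference: str, correct: Corrector,
                       judge: Judge, max_turns: int = 3) -> Tuple[str, int]:
    """Refine a single-pass hypothesis until the judge accepts or turns run out."""
    turns = 0
    while turns < max_turns and not judge(reference, hypothesis):
        hypothesis = correct(hypothesis, reference)
        turns += 1
    return hypothesis, turns


if __name__ == "__main__":
    refs = ["call Xie Chen now", "Play jazz."]
    hyps = ["call She Chen how", "play jazz"]
    before = sentence_semantic_error_rate(list(zip(refs, hyps)), toy_judge)
    refined = [interactive_refine(h, r, toy_correct, toy_judge)[0]
               for r, h in zip(refs, hyps)]
    after = sentence_semantic_error_rate(list(zip(refs, refined)), toy_judge)
    print(before, after)
```

In this toy setup the second pair differs only in case and punctuation, so a token-level comparison would flag it while the sentence-level judge does not; that gap between surface mismatches and meaning-level errors is the kind of difference the abstract attributes to semantic evaluation.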

Stars: 0
Forks: 0
Project health: Unknown. No activity data.
Production readiness: Research / Early. Best for exploration and prototyping.
Risk notes: Unknown license. Verify license before production use.
AgentHub Score: 48 / 100
Composite score from 6 signals.
Active project
Growth: 40 (C)
Activity: 30 (C)
Documentation: 70 (C+)
Maturity: 45 (C)
Community: 42 (C)
Production: 58 (C)
GitHub stars · 90 days: 0 (+0.0%)
Commit activity · 52 weeks: chart omitted (time axis Jun, Sep, Dec, Mar, Now; scale Low to High)
Practical assessment
Should you use it?

✓ Best for

  • Research and experimentation
  • Prototype development
  • Learning agentic patterns

◎ Strengths

  • Active community
  • Open source
  • Well-documented API

✕ Not ideal for

  • Deployment at scale without prior validation
  • Teams without AI/ML expertise

⚠ Watch-outs

  • Review changelog before updating
  • Verify license for commercial use
Technical details
What's inside
Source: arXiv
Open source: No

Alternatives

  • crewai (26.1k · Multi-Agent): 87
  • autogen (42.7k · Multi-Agent): 71
  • smolagents (11.2k · Coding): 84
  • openai-agents-python (9.4k · Multi-Agent): 81

Recent activity

Latest commit: —
Indexed by AgentHub crawler: 20h ago
Monitor for new releases: ongoing