SCRIBE: Structured Mid-Level Supervision for Tool-Using Language Models

📰 ArXiv cs.AI

arXiv:2601.03555v2

Abstract: Training reliable tool-augmented agents remains a significant challenge, largely due to the difficulty of credit assignment in multi-step reasoning. While process-level reward models offer a promising direction, existing LLM-based judges often produce noisy and inconsistent signals because they lack fine-grained, task-specific rubrics to distinguish high-level planning from low-level execution. In this work, we introduce SCRIBE (Skill-Condition

Published 28 Apr 2026