Diagnosing CFG Interpretation in LLMs

📰 ArXiv cs.AI

arXiv:2604.20811v1. Abstract: As LLMs are increasingly integrated into agentic systems, they must adhere to dynamically defined, machine-interpretable interfaces. We evaluate LLMs as in-context interpreters: given a novel context-free grammar, can LLMs generate syntactically valid, behaviorally functional, and semantically faithful outputs? We introduce RoboGrid, a framework that disentangles syntax, behavior, and semantics through controlled stress-tests of recursion depth, ex

Published 23 Apr 2026