Chain-of-Thought as a Lens: Evaluating Structured Reasoning Alignment between Human Preferences and Large Language Models

📰 ArXiv cs.AI

arXiv:2511.06168v3 Announce Type: replace Abstract: This paper demonstrates a method to quantitatively assess the alignment between multi-step, structured reasoning in large language models and human preferences. We introduce the Alignment Score, a semantic-level metric that compares model-produced chain-of-thought traces with a human-preferred reference by constructing semantic-entropy-based matrices over intermediate steps and measuring their divergence. Our analysis shows that the Alignment Score…
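
The abstract only sketches the metric, so the following is a minimal, illustrative sketch of an Alignment-Score-style comparison, not the paper's actual construction. It assumes each trace's intermediate steps are already embedded as vectors; the cosine-similarity matrices, the softmax temperature, and the use of Jensen-Shannon divergence are all assumptions standing in for the paper's semantic-entropy-based matrices.

```python
# Illustrative sketch: compare a model's chain-of-thought steps against a
# human-preferred reference trace via soft-alignment matrices over step
# embeddings. The specific choices below are assumptions, not the paper's method.
import numpy as np

def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarities between rows of a (m x d) and b (n x d)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def row_distributions(sim: np.ndarray, temperature: float = 0.1) -> np.ndarray:
    """Softmax each row so every step becomes a distribution over the other trace's steps."""
    z = sim / temperature
    z = z - z.max(axis=1, keepdims=True)
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)

def jensen_shannon(p: np.ndarray, q: np.ndarray) -> float:
    """Jensen-Shannon divergence with base-2 logs, so the value lies in [0, 1]."""
    m = 0.5 * (p + q)
    def kl(x, y):
        mask = x > 0
        return float(np.sum(x[mask] * np.log2(x[mask] / y[mask])))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def alignment_score(model_steps: np.ndarray, reference_steps: np.ndarray,
                    temperature: float = 0.1) -> float:
    """Score in [0, 1]: 1 minus the divergence between two soft-alignment matrices,
    one from the model trace's perspective and one from the reference's."""
    sim = cosine_similarity_matrix(model_steps, reference_steps)   # (m, n)
    p = row_distributions(sim, temperature)                        # model step -> reference steps
    q = row_distributions(sim.T, temperature).T                    # reference step -> model steps
    p_joint = (p / p.sum()).ravel()                                # normalize to joint distributions
    q_joint = (q / q.sum()).ravel()
    return 1.0 - jensen_shannon(p_joint, q_joint)

# Hypothetical usage: two traces of intermediate steps embedded by any
# sentence-embedding model (random vectors stand in for real embeddings here).
rng = np.random.default_rng(0)
model_trace = rng.normal(size=(4, 384))
reference_trace = rng.normal(size=(5, 384))
print(f"Alignment Score: {alignment_score(model_trace, reference_trace):.3f}")
```

The two-perspective construction mirrors the abstract's idea of building step-level matrices for both the model trace and the human-preferred reference before measuring their divergence; a higher value means the two traces' intermediate steps align more sharply with one another.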

Published 22 Apr 2026