SLALOM: Simulation Lifecycle Analysis via Longitudinal Observation Metrics for Social Simulation
📰 ArXiv cs.AI
arXiv:2604.11466v1 Announce Type: cross Abstract: Large Language Model (LLM) agents offer a potentially transformative path forward for generative social science but face a critical crisis of validity. Current simulation evaluation methodologies suffer from the "stopped clock" problem: they confirm that a simulation reached the correct final outcome while ignoring whether the trajectory leading to it was sociologically plausible. Because the internal reasoning of LLMs is opaque, verifying the "b