Can LLMs Reason About Attention? Towards Zero-Shot Analysis of Multimodal Classroom Behavior

📰 arXiv cs.AI

arXiv:2604.03401v1 (cross-listed)

Abstract: Understanding student engagement usually requires time-consuming manual observation or invasive recording that raises privacy concerns. We present a privacy-preserving pipeline that analyzes classroom videos to extract insights about student attention, without storing any identifiable footage. Our system runs on a single GPU, using OpenPose for skeletal extraction and Gaze-LLE for visual attention estimation. Original video frames are deleted imm

Published 7 Apr 2026
