SciEval: A Benchmark for Automatic Evaluation of K-12 Science Instructional Materials
📰 ArXiv cs.AI
arXiv:2604.25472v1
Abstract: Evaluating instructional materials for K-12 science education has become increasingly important as more educators use generative AI to create them. However, reviewing these materials is time-consuming, expertise-intensive, and difficult to scale, which motivates interest in automated evaluation approaches. While large language models (LLMs) have shown strong performance on general evaluation tasks, their perform