FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios

📰 ArXiv cs.AI

arXiv:2604.07413v2 Announce Type: replace-cross Abstract: The manufacturing sector is increasingly adopting Multimodal Large Language Models (MLLMs) to transition from simple perception to autonomous execution, yet current evaluations fail to reflect the rigorous demands of real-world manufacturing environments. Progress is hindered by data scarcity and a lack of fine-grained domain semantics in existing datasets. To bridge this gap, we introduce FORGE. We first construct a high-quality multimoda

Published 14 Apr 2026
