DataMaster: Towards Autonomous Data Engineering for Machine Learning
📰 ArXiv cs.AI
arXiv:2605.10906v1 Announce Type: cross
Abstract: As model families, training recipes, and compute budgets become increasingly standardized, further gains in machine learning systems depend increasingly on data. Yet data engineering remains largely manual and ad hoc: practitioners repeatedly search for external datasets, adapt them to existing pipelines, validate candidate data through downstream training, and carry forward lessons from prior attempts. We study task-conditioned autonomous data e