BankerToolBench: Evaluating AI Agents in End-to-End Investment Banking Workflows
📰 ArXiv cs.AI
arXiv:2604.11304v1 Announce Type: new Abstract: Existing AI benchmarks lack the fidelity to assess economically meaningful progress on professional workflows. To evaluate frontier AI agents in a high-value, labor-intensive profession, we introduce BankerToolBench (BTB): an open-source benchmark of end-to-end analytical workflows routinely performed by junior investment bankers. To develop an ecologically valid benchmark grounded in representative work environments, we collaborated with 502 inves
DeepCamp AI