IB-bench

Can Large Language Models Replace Investment Banking Analysts?

33 public tasks just launched!
Here's an early look at Opus 4.5 and ChatGPT 5.2

Leaderboard

2 models evaluated · 33 total tasks

Scoring: Overall score is weighted 20% Easy, 35% Medium, 45% Hard.
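
The weighting above can be sketched as a small calculation. This is a minimal illustration, not IB-bench's actual scoring code: only the three weights come from the scoring note. The function name `overall_score`, the assumption that each difficulty tier already has a single aggregate score on a common scale (for example 0 to 100), and the sample numbers are all hypothetical.

```python
# Weights taken from the scoring note; everything else is an illustrative assumption.
WEIGHTS = {"easy": 0.20, "medium": 0.35, "hard": 0.45}


def overall_score(tier_scores: dict[str, float]) -> float:
    """Return the weighted sum of per-tier scores.

    Assumes each tier score is already aggregated (e.g. a mean over that
    tier's tasks) and that all tiers share the same scale.
    """
    missing = set(WEIGHTS) - set(tier_scores)
    if missing:
        raise ValueError(f"missing tier scores: {sorted(missing)}")
    return sum(WEIGHTS[tier] * tier_scores[tier] for tier in WEIGHTS)


# Hypothetical per-tier scores for one model (not real benchmark results).
example = {"easy": 80.0, "medium": 60.0, "hard": 40.0}
print(overall_score(example))
```

Because Hard tasks carry the largest weight, a model that does well only on Easy tasks is capped at a low overall score under this scheme.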

Difficulty levels: Easy (<1 hour), Medium (a few hours), Hard (>1 day), based on the time a human analyst would need to complete the task.

© 2026 IB-bench. All rights reserved.