About IB-Bench
Measuring what matters for AI in investment banking.
Mission
Generic benchmarks test general reasoning. Finance benchmarks test textbook knowledge. Neither tells you whether a model can actually do the job.
IB-Bench evaluates LLMs on the work junior analysts actually do: parsing complex filings, building and debugging financial models, and extracting critical data from documents. If you're building AI for finance, these are the tasks that matter.
Why IB-Bench?
- Real tasks: Built from actual analyst workflows, not synthetic problems
- Difficulty-weighted: Hard tasks count more, because that's where the value lies
- Quality data: Materials sourced from industry professionals
- Open source: Full methodology, prompts, and rubrics on GitHub
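Difficulty weighting can be sketched roughly as follows; the tier names and weight values here are illustrative assumptions, not IB-Bench's published rubric (see the GitHub repo for the actual methodology).

```python
# Minimal sketch of difficulty-weighted scoring. The tiers and weights
# below are hypothetical, chosen only to illustrate the idea.

def weighted_score(results):
    """Aggregate (difficulty, passed) pairs, counting harder tasks more.

    `results` is a list of (difficulty, passed) tuples, where passed is 0 or 1.
    """
    # Hypothetical weights: a hard task contributes 4x an easy one.
    weights = {"easy": 1.0, "medium": 2.0, "hard": 4.0}
    total = sum(weights[difficulty] for difficulty, _ in results)
    earned = sum(weights[difficulty] * passed for difficulty, passed in results)
    return earned / total

# Two hard passes outweigh two easy failures: score is 8/10 = 0.8,
# versus 0.5 under an unweighted average.
print(weighted_score([("easy", 0), ("easy", 0), ("hard", 1), ("hard", 1)]))
```

The effect is that a model that only solves easy tasks scores poorly, while one that handles the hard tasks is rewarded accordingly.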
Contact
Questions, feedback, or want to contribute tasks? Reach out on GitHub or X.