About IB-Bench

Measuring what matters for AI in investment banking.

Mission

Generic benchmarks test general reasoning. Finance benchmarks test textbook knowledge. Neither tells you whether a model can actually do the job.

IB-Bench evaluates LLMs on the work junior analysts actually do: parsing complex filings, building and debugging financial models, and extracting critical data from documents. If you're building AI for finance, these are the tasks that matter.

Why IB-Bench?

  • Real tasks: Built from actual analyst workflows, not synthetic problems
  • Difficulty-weighted: Hard tasks count more, because that's where the value lies
  • Quality data: Materials sourced from industry professionals
  • Open source: Full methodology, prompts, and rubrics on GitHub
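To make the difficulty weighting concrete, here is a minimal sketch of how a difficulty-weighted aggregate score could be computed. The weight values and the `weighted_score` helper are illustrative assumptions, not the benchmark's actual scheme; the real methodology and rubrics are in the GitHub repository.

```python
# Hypothetical difficulty weights -- illustrative only, not IB-Bench's
# actual scheme (see the repository for the real methodology).
WEIGHTS = {"easy": 1.0, "medium": 2.0, "hard": 4.0}

def weighted_score(results, weights=WEIGHTS):
    """Average per-task scores, counting harder tasks more heavily.

    results: list of (difficulty, score) pairs, with score in [0, 1].
    Returns the weighted mean score in [0, 1].
    """
    total = sum(weights[difficulty] * score for difficulty, score in results)
    norm = sum(weights[difficulty] for difficulty, _ in results)
    return total / norm

# Example: a perfect easy task and a failed hard task average to 0.2,
# not 0.5 -- the hard failure dominates.
print(weighted_score([("easy", 1.0), ("hard", 0.0)]))
```

The point of the weighting is exactly what the example shows: a model that only clears easy tasks scores far below one that handles the hard ones.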

Contact

Have questions, feedback, or tasks to contribute? Reach out on GitHub or X.

© 2026 IB-Bench. All rights reserved.