Originally published on X

How Do LLMs Work With Excel Files?

· Danial A.

TLDR

  • LLMs default to Python’s openpyxl library for Excel files
  • Problem: openpyxl has no calculation engine - LLMs can change values but can’t see linked cells update
  • Solution: prompt LLMs to use headless LibreOffice for full recalculation

How LLMs Actually Handle Excel Files

By default, LLMs reach for openpyxl. The workflow:

  1. Open workbook with openpyxl
  2. Print out values and formulas for each cell
  3. Reason from this output to understand the model
  4. Make changes using openpyxl
  5. Save and close

Figure: What an LLM actually sees when you give it your spreadsheet

At no point does the LLM check if the output is correct after making changes. Because it can’t.
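
Steps 1 and 2 of that workflow can be sketched roughly as follows. This is a minimal illustration, not what any particular model emits: `dump_cells` is a hypothetical helper, and the tiny `model.xlsx` workbook is a stand-in built on the spot so the snippet runs on its own.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook


def dump_cells(path):
    """Return one line per non-empty cell: 'Sheet!A1 | formula or literal | cached value'."""
    formulas = load_workbook(path)                # default data_only=False: formula text
    values = load_workbook(path, data_only=True)  # results cached at the last save, if any
    lines = []
    for ws in formulas.worksheets:
        cached = values[ws.title]
        for row in ws.iter_rows():
            for cell in row:
                if cell.value is None:
                    continue  # skip empty cells
                lines.append(
                    f"{ws.title}!{cell.coordinate} | {cell.value!r} | "
                    f"{cached[cell.coordinate].value!r}"
                )
    return lines


# Stand-in workbook (hypothetical contents) so the sketch is self-contained.
wb = Workbook()
ws = wb.active
ws.title = "Model"
ws["A1"] = 100            # input
ws["A2"] = 0.1            # growth rate
ws["A3"] = "=A1*(1+A2)"   # output formula
path = os.path.join(tempfile.mkdtemp(), "model.xlsx")
wb.save(path)

for line in dump_cells(path):
    print(line)
```

The printed text is all the LLM has to reason from: formula strings and whatever values happened to be cached in the file.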

The Problem with openpyxl

openpyxl is a great library - credit to the maintainers - but it’s designed to read and write Excel files. It has no calculation engine.
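
A small sketch of what "no calculation engine" means in practice. The file name and cell layout are made up for illustration. The point is that a formula cell stays a string in memory, and a file saved by openpyxl carries no cached result for it.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

path = os.path.join(tempfile.mkdtemp(), "toy.xlsx")  # hypothetical file name

wb = Workbook()
ws = wb.active
ws["A1"] = 10         # input
ws["B1"] = "=A1*2"    # output linked to the input
ws["A1"] = 25         # change the input
wb.save(path)

# In memory, B1 is still just the formula text; nothing was evaluated.
in_memory = ws["B1"].value

# On disk there is no cached result either, so data_only=True yields None.
cached = load_workbook(path, data_only=True).active["B1"].value

print(in_memory, cached)
```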

Here’s a toy example: you ask an LLM to change an input cell and report the output.

The LLM will:

  1. Print cell values by loading just the cached data: wb = openpyxl.load_workbook('model.xlsx', data_only=True)
  2. Do the same with formulas: wb = openpyxl.load_workbook('model.xlsx') (data_only=False is the default)
  3. Reason through the data to understand how the spreadsheet flows
  4. Manually calculate what it believes the output should be - instead of just asking the spreadsheet

Figure: After making changes, Claude calculates the answer directly instead of reading from the workbook

This seems fine until you realise:

  • Context expensive - the LLM writes Python to calculate values instead of letting Excel’s engine do its thing
  • Non-reactive - spreadsheets are reactive. Change one cell, everything linked updates. A Python script doesn’t replicate this
  • Misses nuance - real Excel files have quirks: plugged values, special formatting, conditional logic, inconsistent data flow. An LLM misses all of this when calculating manually

Other libraries aren’t great either. The formulas library has a calculation engine but doesn’t support data tables - so you’d need to strip those out first. xlwings could work, but it drives a real Excel installation, and the LLM’s environment usually doesn’t have one.

The Fix: Headless LibreOffice

For most use cases, headless LibreOffice is the answer.

Quick explainer: LibreOffice is an open-source office suite, and its spreadsheet app (Calc) is the closest thing to open-source Excel. Headless means no UI - you interact with it programmatically.

Prompt the LLM to use a headless LibreOffice instance when dealing with spreadsheets. Specifically, tell it to recalculate values after making changes. LLMs can be stubborn about this - they’ll want to just calculate things in Python.

This gives you:

  • Full recalculation engine
  • Data table support
  • No GUI overhead
  • Works on Linux servers (where LLMs live)

The new workflow:

  1. LLM loads the workbook using openpyxl twice (data only, then formulas) to understand structure
  2. Makes changes using openpyxl
  3. Triggers recalculation via headless LibreOffice
  4. Checks results with openpyxl, iterates

The tradeoff is speed and occasional compatibility quirks. The LLM triggers recalculation by converting xlsx to xlsx via LibreOffice (not a typo - converting forces a recalc).
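
The xlsx-to-xlsx conversion might look like the sketch below. Assumptions: the LibreOffice binary is on PATH as `soffice` (some Linux installs expose it as `libreoffice`), the helper names are hypothetical, and the output goes to a separate directory so the source file isn't overwritten. Whether a conversion recalculates every formula can depend on LibreOffice's recalculation-on-load settings, so treat this as a starting point and spot-check the results.

```python
import shutil
import subprocess
from pathlib import Path


def build_recalc_command(xlsx_path, out_dir, binary="soffice"):
    """Build the headless LibreOffice command that re-saves an xlsx as xlsx."""
    return [
        binary,
        "--headless",
        "--convert-to", "xlsx",
        "--outdir", str(out_dir),
        str(xlsx_path),
    ]


def recalculate(xlsx_path, out_dir, binary="soffice", timeout=120):
    """Run the conversion and return the path of the re-saved workbook."""
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary} not found on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        build_recalc_command(xlsx_path, out_dir, binary),
        check=True,           # raise if LibreOffice exits with an error
        capture_output=True,  # keep its console output out of the LLM's context
        timeout=timeout,
    )
    result = out_dir / Path(xlsx_path).name
    if not result.exists():
        raise RuntimeError("LibreOffice did not produce an output file")
    return result
```

After something like `recalculate("model.xlsx", "recalc")`, loading `recalc/model.xlsx` with `data_only=True` should return computed values rather than `None`, which is what step 4 of the workflow reads back.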

For most financial models, it just works.

What’s Next: Computer Use

Screenshot-and-mouse interaction is going to be big in 2026. Today it’s slow and janky. But LLMs are getting scary good at generating synthetic UI for training - by year-end, we’ll probably have models that can drive Excel like a human.

Superhuman Excel modellers are coming. They’ll need a mix of computer use and programmatic access - openpyxl, LibreOffice, or something better.

Until then? Finance bros, our jobs are safe.

© 2026 IB-bench. All rights reserved.