64-bit Excel

The MS Excel team just posted on their blog about a 64-bit version of Excel in the upcoming 2010 version. Nominally, this means, as they put it, “…a version of Excel that runs natively in that [64-bit] environment and takes advantage of 64-bit strengths. One of the more obvious benefits is the lifting of the addressable memory constraint we were dealing with in 32-bit Windows.”

OK, OK, so that’s true. Out-of-memory conditions are one of the problems that have traditionally plagued Excel. I’m really stoked to see this being addressed. But one thing I’m really not stoked about – and a few people have already commented about this – is the unfortunately wording of the explanation for need for 64-bit Excel: “In short: enormous capacity; humungous workbooks.” In short, no, no, NOOOOOO!!!

Let me put my objection into context. I see Excel as filling a number of roles in the BI space. It’s got a calculation engine and pivot tables which make it a great tool for analysis. It’s got a charting engine and formatting which makes it an excellent presentation layer. It’s got VBA (for the time being anyway), which allows it to be used as a serviceable small-scale rapid application development platform. However, one thing it’s not and should never be used as is a database. The occasions when it is used in that way almost inevitably end in tears, and quite often in mission-critical systems which just quit working as they were intended – models which no longer calculate, or “humungous workbooks” which no longer open.

I was approached recently by someone in the team implementing the rollout of Office 2007 at work, asking about a number of very large workbooks (built in Excel 2003) which appeared not to open in Excel 2007. “How large is large?” I asked, thinking it’d be in the 50-100 MB range. Turns out these monsters approached 500 MB – several sheets containing 30K rows with 100+ columns filled with VLOOKUPs, OFFSETs, SUMIFs and array functions. They do open (and re-calculate) in Excel 2003, although you do have to wait a few minutes in either case. In fact, they do eventually open in Excel 2007, but only after a 5-hour wait.

Luckily, these behemoths are not in current use, which is the scenario which concerned the project team. If they had been part of a business-critical system, we would have been in serious trouble. The bottom line is, this is a great example of really poor spreadsheet design, which should have as a fundamental guiding principle the idea of restricting data to the minimal possible to achieve the desired result. Really large datasets do NOT belong in spreadsheets, and I’ll take a lot of convincing to believe that the enhancements offered in a 64-bit version of Excel can change that.