Pivot Table Compatibility Issues

Like a lot of users, I’ve done a lot with pivot tables over the years. A typical application to produce reports would be something like what I did today:

  1. SAS query against a database, summarised and output to csv
  2. open the csv file in Excel and create a pivot table off the summarised data
  3. format the pivot table, selecting the fields to aggregate by, and generate a pivot chart for a nice visual

This week I joined the pilot for our rollout of Office 2007, so I did all the above in Excel 2007. Having followed the improvements to pivot table functionality in this version on the Excel team blog (some posts archived here), and having tried out some fairly trivial pivot tables in the trial version I downloaded at home, I was keen to take a look at what I could do with them in this case. I wasn’t disappointed, there are a lot of really nice enhancements to both form and function, which produced a good-looking report.

The wrinkle occurred when I sent the document to a co-worker – when saving to 97/2003 format I was advised that some functionality would not be available. Turns out it’s more a case of all functionality is not available – the pivot table cannot be refreshed, filters cannot be changed…basically it turns into a static report, which of course is not at all what I wanted. I’d been bitten by the fact that pivot tables are versioned, just like Excel. The pivot table I’d created was version 12; the highest version my co-worker can use in Excel 2003 is 10.

So ok, I should have expected that in a new version of Excel, correct? After all, if there’s new functionality available in the new version, it seems pretty reasonable to expect that functionality not to be available in an old version, no? Well, yes…but I had naively expected that saving the file in 97/2003 format would have ‘downgraded’ the pivot table version, possibly disabling functionality that isn’t supported in 2003. Turns out that cannot be done after the pivot table has been created – the Excel team blog posted an explanation of why this is the case a couple of years ago, which obviously I had either missed or forgotten.

In any case, the way around the issue appears to be to ensure the file is saved in 97/2003 format before creating the pivot table. This is a fairly straightforward fix, but unfortunately it’s going to create some issues for us. We use a lot of code for our regular reporting (served to Excel 97/2003 users) that creates pivot tables, and I’m pretty sure most of it creates the pivot table before saving the file. Revising all of that is going to be a whole lot of nastiness…

Related post from a while back on funnybroad’s Dear Microsoft Office 2007 team


Building Models with Pivot Tables

Pivot Table Love @ Smurf on Spreadsheets
This is a really interesting post, mainly because of the discussion it’s generated – there’s definitely a diversity of opinion out there about the use of pivot tables. Personally, I have used them more often than not when building models, for the simple reason that once you’ve gotten over the initial learning curve on the way to using them they’re a very easy way to make sense out of large amounts of data.

My use of pivot tables has evolved over time, in tandem with my growth in understanding of best practice in spreadsheet modelling. When I first started using them, all I would do was put a truckload of data on one worksheet and use the wizard to build a table and present some summary stats on another. Which can be fine, but I think what I failed to appreciate at that point is the fact that there are lots of ways to extract summary stats, many of which are far less expensive in terms of file size. For instance, array functions, SUMIF/COUNTIF, or one of the many variations on the use of SUMPRODUCT (if you’re not using too many of them), can be very convenient in that they will save on the expense of creating the pivot cache. The database functions (DCOUNT, DSUM, DAVERAGE) are also very useful, and often enable the extraction of stats using far more complex criteria than would otherwise be available, directly from the dataset.

What I find myself doing more now is using pivot tables, in combination with GETPIVOTDATA, as an intermediate step on the way to building the inputs and calculations that do the work in the model. The question of file size often comes up, but unless the dataset is under, say, 10K rows, I’ll keep it outside the model. The SAS add-in for Office is a great option in that regard, with the option of opening a SAS dataset directly into a pivot table, or running stored processes or even adhoc queries on the server from inside the worksheet.

The really attractive feature of pivot tables, though, is that once they’re built they’re very flexible, and reasonably intuitive for the end user to change as needed. Of course, whether I use them in that way or not is going to depend on the objective of the model and the needs of the end user.