Conversations with Excel, part 3

In the first and second posts in this series I looked at some ways to read from and write to Excel files using SAS. The motivation for doing this is replacing code which uses the DDE protocol to communicate with the Excel application – which works fine using PC SAS on the desktop, but when SAS is on the server, there’s no application to talk to. SAS uses the SAS/ACCESS interface to ‘speak’ directly with Excel files.

A similar facility is available for working with Excel files in Python, by making use of the packages available from http://www.python-excel.org/. The conversation is enabled through three packages:

xlrd allows you to read data and formatting information from Excel files.

xlwt allows you to create and write to Excel files.

xlutils contains a bunch of utilities for using both xlrd and xlwt, and is required to copy or modify existing Excel files.

There’s comprehensive documentation for all three packages at the site, and a tutorial with some examples of how they can be used. You can also visit the active Google group for support.

This means that you have the ability to work with Excel files, using Python on a platform where Excel is not available. Usage is fairly straightforward – for instance, to read the data contained on the sheet ‘dataSht’ in the workbook ‘Template_test.xls’:

>>> from xlrd import open_workbook
>>> wb = open_workbook('Template_test.xls')
>>> sht = wb.sheet_by_name('dataSht')
>>> for row in range(sht.nrows):
...   values = [sht.cell(row, col).value for col in range(sht.ncols)]
...   print values
... 
[u'Template_test.xls', '', '', '', '', '', '', '']
['', '', '', '', '', '', '', 96.0]
[u'Date', u'Data_1', u'Data_2', u'Data_3', u'Data_4', u'Data_5', '', '']
[40043.0, 0.71426979592069983, 0.29721317486837506, 0.14380949875339866, 0.70981460809707642, 0.19360692566260695, '', '']
[40044.0, 0.30376328527927399, 0.75381017150357366, 0.26589830825105309, 0.30413518520072103, 0.41826989687979221, '', '']
[40045.0, 0.99682421330362558, 0.0027025360614061356, 0.64853132842108607, 0.27574777463451028, 0.99392103916034102, '', '']
[40046.0, 0.14693491021171212, 0.93810823513194919, 0.32732625165954232, 0.77697453368455172, 0.35358203155919909, '', '']
[40047.0, 0.43824125546962023, 0.20211741980165243, 0.6220957413315773, 0.28986502904444933, 0.85634097876027226, '', '']
[40048.0, 0.3646774091757834, 0.33247592020779848, 0.84804946463555098, 0.36496656434610486, 0.0059830849058926105, '', '']
[40049.0, 0.6037151631899178, 0.079236360732465982, 0.30319626023992896, 0.74752466194331646, 0.7890509688295424, '', '']
[40050.0, 0.49680318590253592, 0.051287947688251734, 0.54286114033311605, 0.76270149415358901, 0.35542313288897276, '', '']
[40051.0, 0.96113103721290827, 0.75952570792287588, 0.35812566895037889, 0.60966236609965563, 0.03527348255738616, '', '']
[40052.0, 0.35204670811071992, 0.75659727631136775, 0.97338171768933535, 0.67937295977026224, 0.53357180999591947, '', '']
[40053.0, 0.32696374924853444, 0.11761421523988247, 0.73568923026323318, 0.94905949058011174, 0.4074792442843318, '', '']
[40054.0, 0.59203020902350545, 0.31373690022155643, 0.73995516449213028, 0.44007967365905643, 0.67870346456766129, '', '']
[40055.0, 0.74593824986368418, 0.043794836848974228, 0.75793982530012727, 0.049134510103613138, 0.79131949925795197, '', '']
[40056.0, 0.54699079459533095, 0.54593769600614905, 0.84260744694620371, 0.089851934928447008, 0.30863919015973806, '', '']
[40057.0, 0.19803057983517647, 0.050982972607016563, 0.068164898082613945, 0.55615624878555536, 0.66064533870667219, '', '']
[40058.0, 0.1034383806400001, 0.90820295689627528, 0.41724261501803994, 0.076820098329335451, 0.58757591666653752, '', '']

There’s also support for reading named ranges in much the same way as with the SAS Excel libname engine, although it also has the same problem of being unable to evaluate dynamic ranges. It is possible to extract the formula from the ‘RefersTo’ string, so I think it may be possible to parse it to get an address to read from – I’ll follow up on this later.

Enjoy!