Conversations with Excel, part 2


In the first post of this series I talked about accessing data from Excel workbooks using the SAS Excel libname engine, as a replacement for data access through DDE. However, it’s also possible to use SQL pass-through to perform the same task – again, this makes the code relatively simple and familiar in comparison with the machinery required for DDE. For example, the read_file macro can be rewritten like so:

%macro read_file(fname);
*** Libnames ;
* sas dataset directory;
libname sasdata "&path.\&adhocref.\sasdata" ;

* strip off the file extension;
%let namelen=%eval(%length(&fname)-4);
%let fshortname=%substr(&fname, 1, &namelen);

proc sql;
  connect to excel(path="&inputdir.\&fname");
  create table sasdata.&fshortname as
  select * from connection to excel
  (select * from 'rngData');
  disconnect from excel;
quit;

%mend read_file;

So why prefer this to the libname? Well I think it’s probably more a question of style than anything – I haven’t compared performance but really with the amount of data in the spreadsheets I’m accessing it’s quick either way. I’ve gotten a lot more used to using sql pass-through over the last year or so, and now I tend to find it a more natural way to think about a data access problem than using the data step.

For an example, given a spreadsheet with a few thousand rows of employee hours worked in the range EmployeeTimeData, and a list of employees with their department and email address in the range emp_list, the following pass-through query allows me to calculate totals for each employee and return them to a sas dataset:

*** Subtotal hours worked for each employee;
proc sql;
  connect to excel(path="&inputdir.\&fname");
  create table sasdata.emp_hrs_email as
  select * from connection to excel
   (select
           a.Name_of_Employee as emp_name,
           b.DEPARTMENT as emp_department,
           b.EMAIL_ADDRESS as emp_email,
           sum(a.Time_in_Hours) as total_time
    from
      'EmployeeTimeData' a inner join
      'emp_list' b on
           a.Name_of_Employee = b.NAME_FOR_DOCUMENTATION
    group by
            a.Name_of_Employee,
            b.DEPARTMENT,
            b.EMAIL_ADDRESS
    );
quit;

It’s also possible to make the conversation go the other way, passing non-query statements with the execute statement:

*** Drop and re-create table emp_list;
proc sql;
  connect to excel as xldb(path="&inputdir.\&fname");
  execute(drop table 'emp_list') by xldb;
  execute(create table 'emp_list'
               (NAME_FOR_DOCUMENTATION VARCHAR(20) not null,
                DEPARTMENT VARCHAR(10),
                EMAIL_ADDRESS VARCHAR(20))
             ) by xldb;
quit;

Documentation for using the pass-through facility to access Excel/Access files in SAS 9.1.3 is here.

Next time I’ll take a look at some other ways to start conversations with Excel, this time using Python.

Advertisements

One thought on “Conversations with Excel, part 2

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s