In the first post of this series I talked about accessing data from Excel workbooks using the SAS Excel libname engine, as a replacement for data access through DDE. However, it’s also possible to use SQL pass-through to perform the same task – again, this makes the code relatively simple and familiar in comparison with the machinery required for DDE. For example, the read_file macro can be rewritten like so:
%macro read_file(fname); *** Libnames ; * sas dataset directory; libname sasdata "&path.\&adhocref.\sasdata" ; * strip off the file extension; %let namelen=%eval(%length(&fname)-4); %let fshortname=%substr(&fname, 1, &namelen); proc sql; connect to excel(path="&inputdir.\&fname"); create table sasdata.&fshortname as select * from connection to excel (select * from 'rngData'); disconnect from excel; quit; %mend read_file;
So why prefer this to the libname? Well I think it’s probably more a question of style than anything – I haven’t compared performance but really with the amount of data in the spreadsheets I’m accessing it’s quick either way. I’ve gotten a lot more used to using sql pass-through over the last year or so, and now I tend to find it a more natural way to think about a data access problem than using the data step.
For an example, given a spreadsheet with a few thousand rows of employee hours worked in the range EmployeeTimeData, and a list of employees with their department and email address in the range emp_list, the following pass-through query allows me to calculate totals for each employee and return them to a sas dataset:
*** Subtotal hours worked for each employee; proc sql; connect to excel(path="&inputdir.\&fname"); create table sasdata.emp_hrs_email as select * from connection to excel (select a.Name_of_Employee as emp_name, b.DEPARTMENT as emp_department, b.EMAIL_ADDRESS as emp_email, sum(a.Time_in_Hours) as total_time from 'EmployeeTimeData' a inner join 'emp_list' b on a.Name_of_Employee = b.NAME_FOR_DOCUMENTATION group by a.Name_of_Employee, b.DEPARTMENT, b.EMAIL_ADDRESS ); quit;
It’s also possible to make the conversation go the other way, passing non-query statements with the execute statement:
*** Drop and re-create table emp_list; proc sql; connect to excel as xldb(path="&inputdir.\&fname"); execute(drop table 'emp_list') by xldb; execute(create table 'emp_list' (NAME_FOR_DOCUMENTATION VARCHAR(20) not null, DEPARTMENT VARCHAR(10), EMAIL_ADDRESS VARCHAR(20)) ) by xldb; quit;
Documentation for using the pass-through facility to access Excel/Access files in SAS 9.1.3 is here.
Next time I’ll take a look at some other ways to start conversations with Excel, this time using Python.