Maintaining SAS formats in Excel

For one of the subject areas I deal with day to day, categorical data values are often stored in the database in narrow character fields, so we need to apply SAS formats to get readable values out in reports. To maintain these formats I store them in an Excel workbook, looking roughly like this:

This is a relatively painless way to keep SAS formats up to date, and it makes them easily accessible to people who need to inspect them. Each format is stored in a named range, with the name of the range matching the name of the format. The column headers are chosen to match the variable names that the SAS Format procedure expects in an input control data set. The above range, for instance, would be named ‘excel_user’, and would be used to input the required data for a character format (Type=’C’) also named ‘excel_user’ (Fmtname).

So how do I get them into SAS from Excel? Once again, the SAS/ACCESS Excel libname statement is helpful here. First, we assign a libref ‘wb’ to the workbook holding the format ranges, and another libref ‘sasprogs’ which will be used to store the format catalog:

libname wb Excel "&path.\formats.xls" ver=2002 ;
libname sasprogs "&path" ;

Next we need to get a list of all the ranges that we’re interested in here. In this workbook the only named ranges are the ones containing the formats. Any other names that the libname engine detects will be the names of worksheets, which will contain a ‘$’ character. So, we simply exclude these from the query against the dictionary.tables table:

proc sql;
  create table tmp as
  select memname
  from dictionary.tables
  where libname='WB' 
    and index(memname, '$') = 0 ;
quit;

The table ‘tmp’ now holds a list of all the format range names in the workbook, and we can use it to guide SAS in building those formats. We need to submit a proc format step for every one of those names, so we wrap the step in a macro:

%macro load_fmt(name) ;
  /* Build one format from the named range of the same name */
  proc format library=sasprogs.sasfmts cntlin=wb.&name ;
  run;
%mend load_fmt;

Now we simply need to loop through the set of names to load each of the formats into the sasprogs.sasfmts catalog:

data _null_;
  set tmp;
  call execute('%load_fmt('||memname||')') ;
run;

All that’s left to do now is a little tidying up:

/*Close the file*/
libname wb clear ;
/*Inspect format contents*/
proc format library=sasprogs.sasfmts fmtlib cntlout=sasprogs.sasfmt_data; run;
/*Add them into the search path*/
options fmtsearch=(sasprogs.sasfmts);

And we’re done – all the formats are now loaded and available in the SAS session. Now if there are any changes, or a new format needs to be created, just add it into the file, save it and re-run this script. Sorted!

Why Corporate Excel Training Sucks For Everyone

Today’s post comes from Sean Duffy, CEO at Excel Everest.

http://www.ExcelEverest.com | Corporations need to rethink the way they train their employees in Microsoft Excel.

Corporate Excel trainings, if they are offered at all, are usually three-hour classes in which 20 people gather around a projector and more or less watch an instructor go through various features of Excel. Occasionally, students are asked to follow along and type in a formula or two, but on the whole, it’s usually a teaching exercise, not a learning exercise.

At Excel Everest, we’ve heard this numerous times from numerous people, and we built our whole business around two premises: first, that people with a strong grounding in Excel tend to be more successful inside organizations, and second, that the current method of training employees is antiquated.

After having gone through a number of Excel trainings ourselves, we don’t think people are *actually* learning from the corporate Excel classes. Countless users we’ve talked to say the same, and a recent bit of feedback from an Excel Everest customer lays this out pretty clearly:

I have attended numerous one and two day programs “teaching” excel. I found that once the program is over, I am no better than I was before. For me most programs show us all the things Excel can do but they don’t really teach you Excel. Your program is the first that I feel that I am learning it such that I can use it. I also like that I can repeat and review any particular lesson any time. If you don’t do/use it you lose it. – Kimber

Kimber, like many others, has been there… she’s attended classes that have no relevance, no real world scenarios, and no accountability. Plus, our guess is that it’d be difficult to describe any of them as “fun.”

Not only is this a losing situation for people like Kimber, but from the perspective of the company that’s paying for trainings, it’s also a risky proposition. There’s no way to know that employees are actually learning from the trainings, given that there’s no accountability. Return on investment simply cannot be shown.

We’ve thought deeply about how to fix this problem of ROI as well as the problem of employees, well, not learning Excel, and we built Excel Everest, a complete training course in Excel that’s built entirely into an Excel file. Imagine a huge workbook that teaches employees Excel, challenges them with exercises and grades them automatically in a comprehensive scoreboard. We’ve built Excel Everest to be a self-service and fun learning tool. An employee can complete the tutorial on her own time.

From the employer’s standpoint, this makes perfect sense as well. If an employee completes Excel Everest, she can simply send the completed document over to her manager, and her manager can quickly see that she has learned the material. It’s as easy as that. The employer knows that the money they spent on Excel Everest is paying off.

If you’re interested in using Excel Everest inside your organization, say hello at http://www.ExcelEverest.com/Contact.aspx

// Excel Everest (http://www.ExcelEverest.com) is a complete Excel training course built inside an Excel file. The goal of Excel Everest is to provide an immersive, interactive learning experience for employees, all while demonstrating a return on investment to employers. Excel Everest is currently being used in organizations such as Google, Hymans Robertson, and PlayCore.

Some new regular expression functions

There are a couple of SAS functions I have found to be quite useful, and that I miss when I’m working in Excel. Probably the more useful of the two is the SCAN function. You give SCAN a text string, an integer n and optionally some delimiters, and SCAN will give you back the nth ‘word’ in a string, where the ‘words’ are delimited either by the characters you supplied or by default delimiter characters. For instance,

data _null_;
word5 = SCAN("The/number(of+potential.interactions|among^a)large*group<of-people,is$proportional.to!the square/of!their&number.", 5);
put word5=;
run;

will print

word5=interactions

to the log. If n is negative, it will count backwards (from right to left). So

data _null_;
word7 = SCAN("The/number(of+potential.interactions|among^a)large*group<of-people,is$proportional.to!the square/of!their&number.",-7);
put word7=;
run;

will print

word7=proportional

There’s also the function COUNTW, which takes a text string and, optionally, some delimiters as arguments, and returns the number of words in the string.

data _null_;
count = COUNTW("Consider, for example, the question of tax policy.");
put count=;
run;

produces

count=8

Once again regular expressions turn out to be ideally suited to translating this functionality to Excel worksheet functions. I’ve included them in Regex.xla as RXSCAN and RXCOUNTW:

Public Function RXSCAN(ByVal stringval As String, ByVal n As Integer, _
                        Optional ByVal dlm As String = "") As Variant
' RXSCAN - Returns the nth word from stringval, where
' words are delimited by any non-word character by default,
' or a specific character set by optional string dlm.

Dim objRegex As VBScript_RegExp_55.RegExp
Dim colMatch As VBScript_RegExp_55.MatchCollection
Dim charlist As String
Dim char As String
Dim i As Integer

' Build pattern
If dlm = "" Then ' Just looking for A-Za-z0-9_
  charlist = "\w+"
Else ' Negate character class specified
  charlist = "[^"
  For i = 1 To Len(dlm)
    char = Mid$(dlm, i, 1)
    If InStr("[]\^-", char) > 0 Then char = "\" & char ' escape class metacharacters
    charlist = charlist & char
  Next i
  charlist = charlist & "]+"
End If

' Initialise Regexp object, get match collection
Set objRegex = New VBScript_RegExp_55.RegExp
With objRegex
  .Pattern = charlist
  .Global = True
  Set colMatch = .Execute(stringval)
End With

If colMatch.Count = 0 Then ' No words detected
    RXSCAN = CVErr(xlErrNA)
Else
  If Abs(n) > colMatch.Count Or n = 0 Then
    RXSCAN = CVErr(xlErrNum)
  Else
    If n > 0 Then ' Read from left to right
      RXSCAN = colMatch(n - 1)
    Else ' Right to left
      RXSCAN = colMatch(colMatch.Count + n)
    End If
  End If
End If

End Function

Public Function RXCOUNTW(ByVal stringval As String, _
                        Optional ByVal dlm As String = "") As Variant
' RXCOUNTW - Returns the count of words from stringval, where
' words are delimited by any non-word character by default,
' or a specific character set by optional string dlm.
Dim objRegex As VBScript_RegExp_55.RegExp
Dim colMatch As VBScript_RegExp_55.MatchCollection
Dim charlist As String
Dim char As String
Dim i As Integer

' Build pattern
If dlm = "" Then ' Just looking for A-Za-z0-9_
  charlist = "\w+"
Else ' Negate character class specified
  charlist = "[^"
  For i = 1 To Len(dlm)
    char = Mid$(dlm, i, 1)
    If InStr("[]\^-", char) > 0 Then char = "\" & char ' escape class metacharacters
    charlist = charlist & char
  Next i
  charlist = charlist & "]+"
End If

' Initialise Regexp object, get match collection
Set objRegex = New VBScript_RegExp_55.RegExp
With objRegex
  .Pattern = charlist
  .Global = True
  Set colMatch = .Execute(stringval)
End With

If colMatch.Count = 0 Then ' No words detected
  RXCOUNTW = CVErr(xlErrNA)
Else
  RXCOUNTW = colMatch.Count
End If

End Function
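
If you want to sanity-check the pattern-building logic outside Excel, here is a rough Python sketch of the same idea. It is not part of Regex.xla: the names build_pattern, rxscan and rxcountw are purely illustrative, Python's re module stands in for the VBScript RegExp object, and None stands in for the Excel error values that the VBA functions return.

```python
import re


def build_pattern(dlm=""):
    """Mirror the VBA logic: \\w+ by default, otherwise a negated character class."""
    if dlm == "":
        return r"\w+"
    # re.escape backslash-escapes the characters that are special inside [...]
    return "[^" + re.escape(dlm) + "]+"


def rxscan(text, n, dlm=""):
    """Return the nth word (1-based; negative n counts from the right)."""
    words = re.findall(build_pattern(dlm), text)
    if n == 0 or abs(n) > len(words):
        return None  # the VBA version returns an Excel error value here
    return words[n - 1] if n > 0 else words[n]


def rxcountw(text, dlm=""):
    """Return the number of words found in text."""
    return len(re.findall(build_pattern(dlm), text))


sentence = "Consider, for example, the question of tax policy."
print(rxcountw(sentence))          # 8
print(rxscan(sentence, 2))         # for
print(rxscan(sentence, -1))        # policy
print(rxscan("a-b]c", 2, "-]"))    # b
```

One difference from the VBA version: this sketch returns a count of 0 for a string with no words, whereas RXCOUNTW above returns #N/A.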

I’ve also moved the add-in file to a Dropbox folder. You can now find Regex.xla at http://dl.dropbox.com/u/29927305/Regex.xla