I was planning to continue the series with a post on accessing Excel data from Python – this is a brief diversion related to a problem I’ve encountered recently with using SAS to read Excel files.
I have a bunch of Excel files which I need to read every month. The relevant data is contained in a dynamic named range – that is, a name where the RefersTo string is typically a formula of the form
=OFFSET(fixedCell,0,0,COUNTA(columnInRange),rangeWidth)
This is useful when the amount of data varies from file to file – in this case, some files will have 5 rows, and some will have 5,000. The use of the COUNTA() function means that the number of rows in the range is evaluated by Excel, hence the height of the range adjusts automatically. Which is great, when you’re reading those ranges in Excel, but which can cause problems when you’re trying to get at them without the benefit of the Excel calculation engine. Case in point – the read_file macro from a previous post which uses the SAS Excel libname engine, fails to read dynamic ranges. I’m not 100% sure that it’s impossible to do so (the wonderful folks at SAS tech support are looking into it for me), but it has always failed so far.
So, I need a workaround. I came up with the following – it’s a bit of a clunker, but it works. The basic idea: run an Excel macro over the files which copies the data from the dynamic range into a static range in a new workbook.
Private Sub ReadFile(ByVal sBookName As String)
' Reads sitting time files in INPATH, copies all time entries
' to a new workbook in OUTPATH
Dim wbkFile As Excel.Workbook
Dim wbkOut As Excel.Workbook
Dim rngData As Excel.Range
Dim rngOut As Excel.Range
Dim sBranchName As String
Dim sMonth As String
' Specify required values for output range
Set wbkFile = Application.Workbooks.Open(INPATH & sBookName)
Set rngData = wbkFile.Names("rngTimeData").RefersToRange
sBranchName = wbkFile.Names("BranchName").RefersToRange.Value
sMonth = Format$(Evaluate(wbkFile.Names("Month").RefersTo), _
"mmyy")
' Add new workbook and specifies range to dump data into
Set wbkOut = Application.Workbooks.Add()
Set rngOut = wbkOut.Worksheets(1).Cells(1, 1)
Set rngOut = rngOut.Resize(rngData.Rows.Count, _
rngData.Columns.Count)
' transfer values across, name the range, save and close the new
' workbook
With rngOut
.Value = rngData.Value
End With
wbkOut.Names.Add "TimeData", rngOut
wbkFile.Close False
sBranchName = Replace$(sBranchName, " ", "_")
sBranchName = Left$(sBranchName, InStr(sBranchName, "_DC") - 1)
wbkOut.SaveAs OUTPATH & sBranchName & "_" & sMonth, _
xlWorkbookNormal
wbkOut.Close
End Sub
Public Sub GetFiles()
' Read all files in INPATH, save copies into OUTPATH and check
' which ones are missing
Dim fsObj As Scripting.FileSystemObject
Dim inFolder As Scripting.Folder
Dim inFile As Scripting.File
Application.ScreenUpdating = False
Set fsObj = CreateObject("Scripting.FileSystemObject")
Set inFolder = fsObj.GetFolder(INPATH)
For Each inFile In inFolder.Files
ReadFile inFile.Name
Next inFile
OutputDirectoryContents
Application.ScreenUpdating = True
End Sub
OutputDirectoryContents is a macro which writes the names of all the files in the directory OUTPATH to the main worksheet, then runs an advanced filter over the table of files that should be in there, returning those that aren’t (the Criteria range is set to File Found =FALSE):
Private Sub OutputDirectoryContents()
' Lists contents of OUTPATH on the front sheet and returns any
' missing filenames
Dim fsObj As Scripting.FileSystemObject
Dim outFolder As Scripting.Folder
Dim outFile As Scripting.File
Dim wksht As Excel.Worksheet
Dim rngFiles As Excel.Range
Dim startCell As Excel.Range
Dim rngSource As Excel.Range
Dim rngCriteria As Excel.Range
Dim rngExtract As Excel.Range
' set required variables...
Set fsObj = CreateObject("Scripting.FileSystemObject")
Set outFolder = fsObj.GetFolder(OUTPATH)
Set wksht = ThisWorkbook.Worksheets(1)
Set rngFiles = ThisWorkbook.Names("Files_Found").RefersToRange
Set startCell = rngFiles(1)
Set rngSource = ThisWorkbook.Names("FileWishlist").RefersToRange
Set rngCriteria = ThisWorkbook.Names("rngCriteria").RefersToRange
Set rngExtract = ThisWorkbook.Names("rngExtract").RefersToRange
' Clear, then repopulate, range of files found in the specified folder
rngFiles.ClearContents
startCell.Value = Format$(DateAdd("m", -1, Date), "mmm-yyyy")
For Each outFile In outFolder.Files
Set startCell = startCell.Offset(1)
startCell.Value = outFile.Name
Next outFile
' Refresh filter to see which ones are missing...
RefreshAdvancedFilter rngCriteria, rngExtract, rngSource, , True
End Sub
Public Sub RefreshAdvancedFilter(ByRef rngCriteria As Excel.Range, _
ByRef rngExtract As Excel.Range, _
ByRef rngSource As Excel.Range, _
Optional ByVal IsCopy As Boolean = True, _
Optional ByVal IsUnique As Boolean = False)
' Refreshes advanced filter of the source range
Dim lCopy As Long
lCopy = IIf(IsCopy, xlFilterCopy, xlFilterInPlace)
rngSource.AdvancedFilter Action:=lCopy, criteriarange:=rngCriteria, _
copytorange:=rngExtract, unique:=IsUnique
End Sub
Once I’m sure that all the files are there that need to be, I can run the SAS macro get_all_files over the contents of OUTPATH to create the datasets I want. Slightly convoluted (and a little wasteful in that the data is duplicated), but it gets the job done, so I’m happy.