I was planning to continue the series with a post on accessing Excel data from Python – this is a brief diversion related to a problem I’ve encountered recently with using SAS to read Excel files.
I have a bunch of Excel files which I need to read every month. The relevant data is contained in a dynamic named range – that is, a name where the RefersTo string is typically a formula of the form
This is useful when the amount of data varies from file to file – in this case, some files will have 5 rows, and some will have 5,000. The use of the COUNTA() function means that the number of rows in the range is evaluated by Excel, hence the height of the range adjusts automatically. Which is great, when you’re reading those ranges in Excel, but which can cause problems when you’re trying to get at them without the benefit of the Excel calculation engine. Case in point – the read_file macro from a previous post which uses the SAS Excel libname engine, fails to read dynamic ranges. I’m not 100% sure that it’s impossible to do so (the wonderful folks at SAS tech support are looking into it for me), but it has always failed so far.
So, I need a workaround. I came up with the following – it’s a bit of a clunker, but it works. The basic idea: run an Excel macro over the files which copies the data from the dynamic range into a static range in a new workbook.
Private Sub ReadFile(ByVal sBookName As String) ' Reads sitting time files in INPATH, copies all time entries ' to a new workbook in OUTPATH Dim wbkFile As Excel.Workbook Dim wbkOut As Excel.Workbook Dim rngData As Excel.Range Dim rngOut As Excel.Range Dim sBranchName As String Dim sMonth As String ' Specify required values for output range Set wbkFile = Application.Workbooks.Open(INPATH & sBookName) Set rngData = wbkFile.Names("rngTimeData").RefersToRange sBranchName = wbkFile.Names("BranchName").RefersToRange.Value sMonth = Format$(Evaluate(wbkFile.Names("Month").RefersTo), _ "mmyy") ' Add new workbook and specifies range to dump data into Set wbkOut = Application.Workbooks.Add() Set rngOut = wbkOut.Worksheets(1).Cells(1, 1) Set rngOut = rngOut.Resize(rngData.Rows.Count, _ rngData.Columns.Count) ' transfer values across, name the range, save and close the new ' workbook With rngOut .Value = rngData.Value End With wbkOut.Names.Add "TimeData", rngOut wbkFile.Close False sBranchName = Replace$(sBranchName, " ", "_") sBranchName = Left$(sBranchName, InStr(sBranchName, "_DC") - 1) wbkOut.SaveAs OUTPATH & sBranchName & "_" & sMonth, _ xlWorkbookNormal wbkOut.Close End Sub Public Sub GetFiles() ' Read all files in INPATH, save copies into OUTPATH and check ' which ones are missing Dim fsObj As Scripting.FileSystemObject Dim inFolder As Scripting.Folder Dim inFile As Scripting.File Application.ScreenUpdating = False Set fsObj = CreateObject("Scripting.FileSystemObject") Set inFolder = fsObj.GetFolder(INPATH) For Each inFile In inFolder.Files ReadFile inFile.Name Next inFile OutputDirectoryContents Application.ScreenUpdating = True End Sub
OutputDirectoryContents is a macro which writes the names of all the files in the directory OUTPATH to the main worksheet, then runs an advanced filter over the table of files that should be in there, returning those that aren’t (the Criteria range is set to File Found =FALSE):
Private Sub OutputDirectoryContents() ' Lists contents of OUTPATH on the front sheet and returns any ' missing filenames Dim fsObj As Scripting.FileSystemObject Dim outFolder As Scripting.Folder Dim outFile As Scripting.File Dim wksht As Excel.Worksheet Dim rngFiles As Excel.Range Dim startCell As Excel.Range Dim rngSource As Excel.Range Dim rngCriteria As Excel.Range Dim rngExtract As Excel.Range ' set required variables... Set fsObj = CreateObject("Scripting.FileSystemObject") Set outFolder = fsObj.GetFolder(OUTPATH) Set wksht = ThisWorkbook.Worksheets(1) Set rngFiles = ThisWorkbook.Names("Files_Found").RefersToRange Set startCell = rngFiles(1) Set rngSource = ThisWorkbook.Names("FileWishlist").RefersToRange Set rngCriteria = ThisWorkbook.Names("rngCriteria").RefersToRange Set rngExtract = ThisWorkbook.Names("rngExtract").RefersToRange ' Clear, then repopulate, range of files found in the specified folder rngFiles.ClearContents startCell.Value = Format$(DateAdd("m", -1, Date), "mmm-yyyy") For Each outFile In outFolder.Files Set startCell = startCell.Offset(1) startCell.Value = outFile.Name Next outFile ' Refresh filter to see which ones are missing... RefreshAdvancedFilter rngCriteria, rngExtract, rngSource, , True End Sub Public Sub RefreshAdvancedFilter(ByRef rngCriteria As Excel.Range, _ ByRef rngExtract As Excel.Range, _ ByRef rngSource As Excel.Range, _ Optional ByVal IsCopy As Boolean = True, _ Optional ByVal IsUnique As Boolean = False) ' Refreshes advanced filter of the source range Dim lCopy As Long lCopy = IIf(IsCopy, xlFilterCopy, xlFilterInPlace) rngSource.AdvancedFilter Action:=lCopy, criteriarange:=rngCriteria, _ copytorange:=rngExtract, unique:=IsUnique End Sub
Once I’m sure that all the files are there that need to be, I can run the SAS macro get_all_files over the contents of OUTPATH to create the datasets I want. Slightly convoluted (and a little wasteful in that the data is duplicated), but it gets the job done, so I’m happy.