Part 2 of 4
Continuing with my exploration of regular expressions in Excel, from Part 1 – so far I’ve got to the stage of writing a couple of functions, and planning a few more. As promised, I’ll detail here what the add-in I’m constructing should deliver to the user.
First, a suite of worksheet functions:
- RXFIND(find_pattern, within_text, [start_num], [case_sensitive])
– This is probably the easiest one to get right – a simple emulation of the functionality of the Excel built-in function FIND. As with FIND, the inputs are the pattern to test for, followed by the text to look within. Optionally, the start position can be specified (Default=0) as can case-sensitivity (Default=TRUE). Although Excel implements a separate SEARCH function which is case-insensitive, I’d rather save myself repeating the same code and added a parameter in to allow the user to specify whether case matters or not. Parenthetically, my belief, well-founded or not, is that case always matters.
Here’s the code I’ve come up with so far:
Public Function RXFIND(ByRef find_pattern As Variant, _ ByRef within_text As Variant, _ Optional ByVal start_num As Variant, _ Optional ByVal case_sensitive As Boolean) As Long Dim objRegex As VBScript_RegExp_55.RegExp Dim colMatch As VBScript_RegExp_55.MatchCollection Dim vbsMatch As VBScript_RegExp_55.Match Dim sMatchString As String Set objRegex = New VBScript_RegExp_55.RegExp ' Initialise Regex object With objRegex .Global = False ' Default is case sensitive If IsMissing(case_sensitive) Then .IgnoreCase = False Else: .IgnoreCase = Not case_sensitive End If .Pattern = find_pattern End With If IsMissing(start_num) Then start_num = 0 sMatchString = Right$(within_text, Len(within_text) - start_num) ' Create Match collection Set colMatch = objRegex.Execute(sMatchString) If colMatch.Count = 0 Then ' No match RXFIND = 0 Else Set vbsMatch = colMatch(0) RXFIND = vbsMatch.FirstIndex + start_num + 1 End If End Function
- RXSUB(text, old_text, new_text, [instance_num], [case_sensitive], [is_global])
– This one is similar to the Excel built-in SUBSTITUTE, although as with RXFIND, it allows a couple of extra parameters. It’s also a little trickier to code, as the optional is_global parameter (Default=FALSE) set to TRUE renders instance_num (Default=0) meaningless. I still haven’t got this in a fit state to post the code yet, but it differs only a little from RXFIND, in that instead of using the FirstIndex property and Execute method of the RegExp object, it only needs to access the Replace method, and of course it returns a string rather than a long. Work in progress.
- RXGET(find_pattern, within_text, [start_num], [case_sensitive])
I still haven’t completely figured out what to have this one do yet, but it’s going to rely on the Value property of the Match object, and will possibly allow for returning Submatched groups. Another work in progress.
- RXMATCH(find_pattern, within_array, [case_sensitive])
Returns the position of the first exact match with the find_pattern from within_array. Yet to begin.
As well as worksheet functions, I’ve been tossing around the idea of a Find/Replace dialog with regex bells and whistles. I have got no further than mimicking the Excel built-in dialog so far (i.e. no functionality as yet), but I am hopeful I’ll have something that actually does something soon. For reasons I’ll explain a little further in the next post, I am undecided as to whether I should stick with the standard user form. Anyway, that’ll do for now…next time I’ll talk a little about some of the key decisions that need to be made – late vs early binding, .NET vs the VBScript library, VB6 add-in vs .xla.