Regex Functions in Excel


Part 2 of 4

Continuing with my exploration of regular expressions in Excel, from Part 1 – so far I’ve got to the stage of writing a couple of functions, and planning a few more. As promised, I’ll detail here what the add-in I’m constructing should deliver to the user.

First, a suite of worksheet functions:

  • RXFIND(find_pattern, within_text, [start_num], [case_sensitive])
    – This is probably the easiest one to get right – a simple emulation of the functionality of the Excel built-in function FIND. As with FIND, the inputs are the pattern to test for, followed by the text to look within. Optionally, the start position can be specified (Default=0) as can case-sensitivity (Default=TRUE). Although Excel implements a separate SEARCH function which is case-insensitive, I’d rather save myself repeating the same code and added a parameter in to allow the user to specify whether case matters or not. Parenthetically, my belief, well-founded or not, is that case always matters.
    Here’s the code I’ve come up with so far: 

    Public Function RXFIND(ByRef find_pattern As Variant, _
                        ByRef within_text As Variant, _
                        Optional ByVal start_num As Variant, _
                        Optional ByVal case_sensitive As Boolean) As Long     
    
    Dim objRegex As VBScript_RegExp_55.RegExp
    Dim colMatch As VBScript_RegExp_55.MatchCollection
    Dim vbsMatch As VBScript_RegExp_55.Match
    Dim sMatchString As String
    
    Set objRegex = New VBScript_RegExp_55.RegExp
    
    ' Initialise Regex object
    With objRegex
        .Global = False
        ' Default is case sensitive
        If IsMissing(case_sensitive) Then
            .IgnoreCase = False
        Else: .IgnoreCase = Not case_sensitive
        End If
        .Pattern = find_pattern
    End With
    
    If IsMissing(start_num) Then start_num = 0
    sMatchString = Right$(within_text, Len(within_text) - start_num)
    
    ' Create Match collection
    Set colMatch = objRegex.Execute(sMatchString)
    If colMatch.Count = 0 Then ' No match
        RXFIND = 0
    Else
        Set vbsMatch = colMatch(0)
        RXFIND = vbsMatch.FirstIndex + start_num + 1
    End If
    End Function
  • RXSUB(text,  old_text,  new_text, [instance_num], [case_sensitive], [is_global])
    – This one is similar to the Excel built-in SUBSTITUTE, although as with RXFIND, it allows a couple of extra parameters. It’s also a little trickier to code, as the optional is_global parameter (Default=FALSE) set to TRUE renders instance_num (Default=0) meaningless. I still haven’t got this in a fit state to post the code yet, but it differs only a little from RXFIND, in that instead of using the FirstIndex property and Execute method of the RegExp object, it only needs to access the Replace method, and of course it returns a string rather than a long. Work in progress.
  • RXGET(find_pattern, within_text, [start_num], [case_sensitive])
    I still haven’t completely figured out what to have this one do yet, but it’s going to rely on the Value property of the Match object, and will possibly allow for returning Submatched groups. Another work in progress.
  • RXMATCH(find_pattern, within_array, [case_sensitive])
    Returns the position of the first exact match with the find_pattern from within_array. Yet to begin.

As well as worksheet functions, I’ve been tossing around the idea of a Find/Replace dialog with regex bells and whistles. I have got no further than mimicking the Excel built-in dialog so far (i.e. no functionality as yet), but I am hopeful I’ll have something that actually does something soon. For reasons I’ll explain a little further in the next post, I am undecided as to whether I should stick with the standard user form. Anyway, that’ll do for now…next time I’ll talk a little about some of the key decisions that need to be made – late vs early binding, .NET vs the VBScript library, VB6 add-in vs .xla.

Advertisements

One thought on “Regex Functions in Excel

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s