Regex Functions in Excel, part 3


Part 3 of 4

I’ve left this project alone for quite a while, until finally this weekend I had a little time to put into working on it. In part 2 I was still a little undecided as to how I was going to implement some of the worksheet functions, but now I am more or less satisfied with them. I’ve bundled the functions into Regex.xla, which is now available for download. The project is open for viewing, so code may be inspected or modified as you see fit – please let me know if you discover any errors or anything that could have been done better/smarter.

This is the first general purpose add-in I’ve created (I have written several application-specific add-ins), so I’d be interested to know if anyone has any feedback on this one. Some things I have not included which I’d like to:

  • Help files – I have started writing some html to provide users with an explanation of the functions and examples of their use, but I’m finding this very time-intensive. Next release I’ll include them.
  • A Regex Find/Replace dialog – still a work in progress, I’ve decided to leave this until I make a decision about whether to release the add-in as a compiled .dll file or just leave it as a .xla. At this point I have yet to test whether or not there would be any performance benefit in compiling the add-in in VB6 (or VB.NET?), which would be the key benefit I see in doing so.
  • A basic regular expression syntax included – while there are plenty of good sites to look at for regex syntax, it’d be great to include a quick overview of how they work and maybe a short list of FAQ-type answers to the usual ‘how do I do x?’ questions. Next release.

So, what’s included so far? Rather than post the entire source code for all five functions, I thought I’d put in all the header comments here, as they provide a reasonable explanation of their purpose (Roy MacLean wrote an excellent post on comments in code which motivated me putting these in – ordinarily until now my commenting in the header has been pretty sparse). So here they are:

Public Function RXFIND(ByRef find_pattern As Variant, _
                        ByRef within_text As Variant, _
                        Optional ByVal start_num As Long, _
                        Optional ByVal case_sensitive As Boolean) As Long
' RXFIND - Returns the starting position of text matching the regex pattern
' find_pattern in the string within_text, if a match is found.
' Optional long start_num specifies the number of the character in within_text
' to start matching. Default=0.
' Optional boolean case_sensitive makes the pattern case sensitive if true,
' insensitive otherwise. Default=true.

Public Function ISRXMATCH(ByRef find_pattern As Variant, _
                        ByRef within_text As Variant, _
                        Optional ByVal case_sensitive As Boolean) As Boolean
' ISRXMATCH - Returns true if the regex pattern find_pattern is matched
' in the string within_text, false otherwise.
' Optional boolean case_sensitive makes the pattern case sensitive if true,
' insensitive otherwise. Default=true.

Public Function RXGET(ByRef find_pattern As Variant, _
                        ByRef within_text As Variant, _
                        Optional ByVal submatch As Long, _
                        Optional ByVal start_num As Long, _
                        Optional ByVal case_sensitive As Boolean) As String
' RXGET - Looks for a match for regular expression pattern find_pattern
' in the string within_text and returns it if found, error otherwise.
' Optional long submatch may be used to return the corresponding submatch
' if specified - otherwise the entire match is returned.
' Optional long start_num specifies the number of the character to start
' searching for in within_text. Default=0.
' Optional boolean case_sensitive makes the regex pattern case sensitive
' if true, insensitive otherwise. Default=true.

Public Function RXMATCH(ByVal find_pattern As Variant, _
                    ByVal within_range As Variant, _
                    Optional ByVal case_sensitive As Boolean) As Long
' RXMATCH - Searches for the first cell matching regular expression pattern
' find_pattern in the range argument within_range.
' Optional boolean case_sensitive makes the regular expression case
' sensitive if true, insensitive otherwise. Default=true.
' within_range should be a single row or column range. If the range has
' multiple rows and columns, only the first row is searched.

Public Function RXSUB(ByRef within_text As Variant, _
                    ByRef old_text As Variant, _
                    ByRef new_text As Variant, _
                    Optional ByVal start_num As Long, _
                    Optional ByVal case_sensitive As Boolean, _
                    Optional ByVal is_global As Boolean) As String
' RXSUB - Function to substitute text matching regex pattern old_text with
' string argument new_text in the string within_text.
' Optional long start_num specifies which character to start replacement from,
' default=0.
' Optional boolean case_sensitive will make the pattern case sensitive if true
' and insensitive otherwise. Default=true.
' Optional boolean is_global will substitute new_text for *all* occurrences
' matching old_text if true, only the first otherwise. Default=false.

In my final post on this project, I’ll include some examples of these functions in use, and hopefully have an updated add-in with help files and a find/replace dialog. Maybe also a discussion of why I did or didn’t go with a compiled .dll. Maybe also a regex syntax. Maybe I’ll need to make this series into 5 posts…

[EDIT: I’ve changed the download link to a fileden url, this can be accessed directly now.]

Advertisements

4 thoughts on “Regex Functions in Excel, part 3

  1. Tim Mayes says:

    Geoff,

    For whatever reason, the add-in isn’t being recognized. Excel (2003 and 2007) says that it isn’t a valid add-in.

    I changed the extension to .xls and opened it in Excel 2003. That leads to an error message that says it is missing a styles.css file (never seen that for an add-in before). Then it opens what appears to be a RapidShare web page. I’m guessing that there is something wrong with the link.

    • geoffness says:

      Hi Tim, unfortunately this is one of the perils of relying on free hosting. What I thought was a direct link to the file at Rapidshare was actually a link to the web page at Rapidshare which *does* have the direct link to the file – so right click and save as just downloads that web page. I’m sorry, I should have tested it out first.
      When I get a chance tonight I’ll put it up on fileden or similar…in the meantime, thanks for downloading!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s