Regular Expressions – new version


Just a quick post tonight to let you all know, I’ve added a new page for downloads, which contains a new version of the regular expressions add-in, compatible with Excel 2007 and later. I’ve added in a couple of utility functions for convenience (mine more than yours but you might find them useful), and a form to evaluate regular expressions against test strings. And there’s even documentation!

The documentation gives a (very) brief intro to the uses and abuses of regular expressions, a run-down of the worksheet functions in the add-in and some examples of their use. Here are a couple of those, I hope you find them useful.

Matching cells which contain variations on a word

There are some words in the English language which Americans, god bless them, spell in their own special way. However, given input on the spreadsheet from users who spell both ways (correctly and incorrectly), you may wish to match both variations of words like ‘realise’ (‘realize’) and ‘colour’ (‘color’).
The pattern to match realise/realize is simple: \breali(s|z)e\b
The word boundary markers ensure we are looking at a complete word, and the alternation of (s|z) means that we match both versions.
Applying the ISRXMATCH formula demonstrates this is successful:

Validating Email Addresses

Given a list of email addresses in a column on a spreadsheet, we wish to ensure that these stick to a form which at least obeys some of the rules governing the format of email addresses. As these are going to be used by a script to send emails, we wish to minimise the number of undeliverable responses due to invalid addresses. The basic rules we specify for these addresses are as follows:
The username part of the address contains one or more alphanumeric characters, and possibly some additional special characters. This is followed by a single @ sign, followed by the domain name, which consists of one or more alphanumeric and special characters, ending with a dot followed by the top-level domain. This must contain only alphanumeric characters, and there must be between 2 and 6 of these. The address should be the entire content of the cell, so the beginning and ending anchors are used at the start and end of the pattern. Case is unimportant, so the case_sensitive flag is set to false.
The pattern is as follows: ^[a-z0-9_%.+-]+@[a-z0-9-.]+\.[a-z]{2,6}$
This is then used in ISRXMATCH – a valid email address according to our rules above will return true:

The second address in the list fails due to the whitespace in the username, whereas the fourth fails because the domain name does not include a top-level domain part of a dot followed by 2-6 letters.
I borrowed this regex from http://www.regular-expressions.info/email.html. As well as a couple of alternative regexes to cover some edge cases which the above doesn’t catch, this page also discusses why email addresses can be tricky, and why you shouldn’t go overboard trying to cover every exception.

Advertisements

7 thoughts on “Regular Expressions – new version

  1. hpc says:

    On loading the add-in, I get an error message.

    Environement : Excel2010 (french) / windowsXP sp3 (french)
    Error message : error 1004
    Module : MUtility
    Sub : RegisterFunctions
    Line : Set rngFunctions = ThisWorkbook.Names(“FunctionTable”).RefersToRange

    Concerning the names :
    – FunctionTable is : defined : =DECALER(Functions!$B$3;0;0;NBVAL(Functions!$B$2:$B$32)-1;3)
    – firstrow is undefined : =#REF!#REF!
    – Matches is undefined : =DECALER(#REF!#REF!;0;0;NBVAL(#REF!#REF!);3)

    • geoffness says:

      Thanks for reporting the issue, and for the complete information! I’m at a loss to explain why that line is causing an error, as it appears the FunctionTable name is resolving ok for you. Still, it wouldn’t hurt to check – if you can, are you able to tell me the return value when you type ‘?ThisWorkbook.Names(“FunctionTable”).RefersToRange.Address’ into the Immediate window? If this causes the same error then at least we know that’s the problem. Anyway, feel free to email me direct (address at ‘About’) and I’ll try to get to the bottom of it.

      • smoky says:

        As a read in the Excel-Help concerning “RefersToRange” this method does not work with constants or functions in ranges (Excel 2010/german version). When you reference “FunctionTable” in the “normal” way, you can eliminate the issue. (I had the same problems with an older Excel-verion too and could get rid of the error in the same way.)

  2. Frank N. Thomsen says:

    Great work, ISRXMATCH was just what I needed.
    I too got the error on loading the add-in. Changed the range reference to “B3:D11”, the it loaded.
    I’m on an English 32bit Vista with Danish locale, Excel version is 14.0.6106.5005
    – Frank

  3. geoffness says:

    Thanks guys for your help in identifying the problem. I’ve updated the add-in now so FunctionTable is now a static range reference rather than the OFFSET formula it was previously. Hopefully this version will now load correctly in your locales, please let me know if not.

  4. mikeg_de says:

    Hi geoffness,

    I’m trying to utilize your addin but it seems that the reference to VBScript won’t work properly. It’s already activated by clicking Alt+F11 > Choosing the project > Extras (german version) > Verweise

    In the opened worksheed it’s also activated but all the time I get an error for the 1st expression like \w. What am I doing wrong? Hope you might have a hint.

    Win XP (VM)
    Excel 2007 (Student edition)
    VBScript 5.5

    Many thanks
    Mike

    • geoffness says:

      Hi Mike, sorry I took so long to reply. I’m not sure from what you wrote exactly what the error is – perhaps you could give me an example of the function you’re trying to enter and the arguments you’re passing to it, and what you would expect it to return.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s