To recap where we left off in the first part, we’d managed to import xml data from the rss feed for NZD rates at themoneyconverter.com into a table (which I’ve named *Rates_Table*). This gave us a list of rates which we can use – however the format is not quite right. In the “description” member of “item”, the rate is expressed in a structured text format, like so: “1 New Zealand Dollar = 2.96886 Argentine Peso”. Now of course that would be useful as it is, if I were prepared to manually type out the 2.96886 part myself to convert from NZ dollars to pesos. Which of course I’m not, lazy data analyst that I am.

What I’m going to do instead is to extract the two relevant parts of that text, the two bits that I need to know: the rate (the numeric bit, in this case 2.96886) and the foreign currency description (the character string that comes after the numeric bit, in this case “Argentine Peso”). This is something that can be done with text functions like MID and FIND in Excel, of course, and this is a perfectly legitimate way to go about it. However, unless you can find a really clever way to do it, you’ll end up with a really long, convoluted and awkward formula. Which is a great reason to use regular expression functions instead. RXGET for instance, is designed to do exactly this sort of extraction. Here’s how I went about it:

First, we need to determine the regular expression pattern that fits the text. The pattern I chose is

`=\s(\d*\.\d+)\s(.+)`

or to break it down a little,

- an equals sign followed by a space: =\s;
- followed by a group, which in turn contains zero or more digits, followed by a decimal point, followed by one or more digits: (\d*\.\d+)
*This will be used to capture the rate part of the expression. Note that the decimal point needs to be escaped here, as it signifies an arbitrary character in a regular expression*; - followed by another space: \s;
- followed by a second group, which contains one or more arbitrary characters: (.+)
*This will capture everything after the space, which in this case will be the currency description. Here the decimal point should not be escaped as we need it to match any character.*

Once I know that this pattern works, I add a new sheet titled ‘Reference’ and enter the pattern into a cell that I name *Rate_Pattern*. Now I can use it to extract the rate and the currency using RXGET. I add a column with the header *rate* at the right-hand end of the *Rates_Table* and enter in it the RXGET formula, with the following arguments:

This corresponds to the formula:

`=RXGET(Rate_Pattern,Rates_Table[[#This Row],[description]],1)`

.

Note that the Submatch argument has the value 1, as we’re extracting the 1st matched group from the expression. To get the currency description, in another new column with the header *currency* I enter the same formula with the Submatch argument changed to 2 to extract the second group. So here’s how the table looks right now (click on the image to take a closer look):

In the final part of this series I’ll complete the converter by adding in a selector to pick out a rate from the table by specifying the currency.

Good Afternoon and Happy Easter Geoff,

I am attempting to follow your steps in building the excel currency converter, and so far successfully completed part 1.

Which I thank you for your clear and concise steps.

I have very limited knowledge in programming, unfortunately I am having difficulties setting up the RXGET formula, to break up the data. I’ve downloaded your macro.. however I am unable to proceed with using the function.

Hi Johnny, thanks for reading. After downloading the Regex add-in, you’ll need to enable it in order to use the functions. The steps for doing that differ according to which version of Excel you’re using – for Excel 2007 here’s a link that talks you through it: http://office.microsoft.com/en-us/excel-help/enable-or-disable-add-ins-in-office-programs-HA010034127.aspx

Once you’ve done that, you should be able to see the functions by clicking Formulas->Insert Function->select a category. The item Regular Expression Functions should now be in the list. Feel free to email me direct (address at https://excelicious.wordpress.com/about/) if you are still having trouble.