Names in Excel

First off, I should apologise for the complete lack of posts over the last few weeks. It’s been a rather busy time for a month or so now, and while there hasn’t been a lack of things to write about, there has certainly been a lack of energy to actually write about them.

Anyway, I posted a few weeks back about the modelling life cycle, where I talked about the desired outcomes of implementing a standard ‘best practice’ modelling process. One of these was effective documentation of the model – effective in that everyone who needs to build, use, test and maintain a model has access to the information they need to understand its intent. The documentation should primarily be external to the model itself, but in my opinion, a best practice model design should also apply an approach adhering to the principles of self-documenting code. That is, the model should be built in a way that is easy to understand, use and extend without the need for extensive reference to external documentation.

Names in Excel are a key part of that approach. While it is entirely possible to build a model which lives up to its specification without using names at all, I can almost guarantee that appropriate and judicious use of names would shorten development time, produce formulas that are easier for the tester and/or user to understand, and produce something that would be easier to extend or modify. There are many references out there to how names can be used in Excel, so without trying to repeat or summarise all of those, I’ll just highlight what I see as the main benefits of each technique in this context.

The first and most obvious way to use names is in defining named ranges. A cell named inpDiscountRate is meaningful to both a developer and a user – The prefix inp signals that the range is an input or assumption, and DiscountRate tells them what it denotes in the context of the model. This means that a formula referencing the discount rate in the model is far more comprehensible at first glance than something which has a reference to, say, $B$6. It’s also helpful to be able to denote lookup tables and arrays used in the model. for instance in a table where an ID field is located in column A, there are a couple of ways to retrieve the ID value from elsewhere in the table. One is to name the whole column (excluding the header), maybe using a dynamic range name, defining ptrID as

=OFFSET($A$1,1,0,COUNTA($A:$A)-1,0)

Referring to ptrID (The ptr prefix denotes a pointer) in a formula in another column will then pick up the value in the ID column by implicit intersection. Another way to accomplish the same feat is to define a relative range name, again ptrID, as $A6, where the active cell is, say, B6. The reference is row-relative, so that a reference to ptrID picks up the cell in the same row as that the formula is entered in, but always looks in column A. I prefer using relative references as they’re less work to code, but either way we get the same benefit, of a formula which has more meaning than one which relies on cell addresses.

Names can also contain constants, thereby eliminating the need for using ‘magic numbers’ in formulae. Using

=ptrCellToLeft*inpDiscountRate^(1/cNumPeriods)

where cNumPeriods is defined to have the value 4, means that not only is the logic in the formula more transparent than it would be otherwise, but it is also more flexible. A change in granularity from quarterly to monthly requires no more than changing the value of cNumPeriods to 12.

Finally, a name can also encapsulate a formula. The benefit I see in a named formula is principally that it can serve to simplify otherwise unwieldy calculations. However, I prefer not to make too much use of them in this way, as in my opinion they can also serve to obscure the logic involved. In most cases, if a formula is too long to understand easily at first reading, a better approach to simplifying it is to split it into two or more cells. Assuming those cells are themselves assigned names, the end result will look pretty similar, but it is easier for the reader to access the logic. Also, this approach could conceivably improve performance – a really long formula can have a lot of dependencies, so by splitting it up we can reduce the amount of recalculation required when any one of those changes.

I also should make mention of the excellent Name Manager add-in, available from JKP Application Development Services. Working with the Insert Names dialog can be very time-consuming, but this add-in greatly simplifies the creation, modifying and documenting of names. The addition of the built-in Name Manager in Excel 2007 is a great improvement, but as was pointed out in a comment on an earlier post, it still doesn’t equal the functionality of this very handy add-in.

Advertisements