December 2009 – Number Cruncher

I’m hoping to get back to work on the regular expressions filter soon – should have some time over the Christmas/New Year break – but in the meantime, I took a look at the Euler problems related to number triangles. Both problems ask for the maximal path through adjacent points in the triangle – from problem 18:

By starting at the top of the triangle below and moving to adjacent numbers on the row below, the maximum total from top to bottom is 23.

3
7 4
2 4 6
8 5 9 3

That is, 3 + 7 + 4 + 9 = 23.

Find the maximum total from top to bottom of the triangle below:

75
95 64
17 47 82
18 35 87 10
20 04 82 47 65
19 01 23 75 03 34
88 02 77 73 07 63 67
99 65 04 28 06 16 70 92
41 41 26 56 83 40 80 70 33
41 48 72 33 47 32 37 16 94 29
53 71 44 65 25 43 91 52 97 51 14
70 11 33 28 77 73 17 78 39 68 17 57
91 71 52 38 17 14 91 43 58 50 27 29 48
63 66 04 68 89 53 67 30 73 16 69 87 40 31
04 62 98 27 23 09 70 98 73 93 38 53 60 04 23

It is of course possible to work out the sum of every single path, by starting at the top and trying every combination of left and right choices to get to the bottom, but for a triangle with n rows this works out at 2^n-1 sums to calculate. This makes the brute-force approach impractical for the larger triangle in problem 67 (100 rows); it also involves a whole lot of unnecessary calculation.

The key to seeing this is understanding that (1) every point in the triangle above the bottom row is itself the apex of a sub-triangle, that has its own maximal path, and (2) where any sub-triangles intersect, this intersection contains another sub-triangle – so working out all the paths through the two intersecting triangles involves working out every path through the intersection twice. For example, in the triangle

3
7 4
2 4 6
8 5 9 3

the two sub-triangles with apexes at 7 and 4 both contain the sub-triangle

4
5 9

with two possible paths. So calculating every possible path through the larger triangle involves traversing both these paths twice, once for each sub-triangle containing them. It’s easy to see that this gets worse quickly as the triangle gets larger.

A smarter approach is what might be termed a ‘bottom-up’ method. Working from the base upwards, calculate the maximal path for every point in each row. Once you have the maxima for two adjacent points in a row, the maximum for the point above both these points is simply the sum of this point and the maximum of these two values. The advantage of this method is that the maximum for any point in the triangle is calculated once and only once. Here’s how this looks in Python.

First, a helper function get_triangle to read the string in and get a list of rows of integers making up the triangle:

def get_triangle(triangle_string):
    string_list=triangle_string.strip().split('\n')
    #reverse to work from the base up
    string_list.reverse()
    return [[int(x) for x in row.split()] \
        for row in string_list]

Then, the main routine max_path works through each row, progressively replacing the list of maxima at each point in the previous row with the same for the current row:

def max_path(triangle_string):
    triangle = get_triangle(triangle_string)
    #every point in the base is its own maximum
    max_list = triangle[0]
    for row in range(1, len(triangle)):
        #add the greater of the two adjacent maxima to
        #each point in the row
        max_list = \
            [triangle[row][x] + max(max_list[x], max_list[x+1]) \
             for x in range(len(triangle[row]))]
    return max_list[0]

That’s all for now – time permitting I will be posting the next part of the regular expression filter sometime in the next couple of days. Until then, hoping you all had a restful and happy Christmas and are taking some time off to be with family. Take care…

	Alex on More Regex Functions, part…
	geoffness on Regular Expressions – ne…
	mikeg_de on Regular Expressions – ne…
	geoffness on Type or Class?
	vbanewbie on Type or Class?