Project Euler #18 and #67

I’m hoping to get back to work on the regular expressions filter soon – should have some time over the Christmas/New Year break – but in the meantime, I took a look at the Euler problems related to number triangles. Both problems ask for the maximal path through adjacent points in the triangle – from problem 18:

By starting at the top of the triangle below and moving to adjacent numbers on the row below, the maximum total from top to bottom is 23.

7 4
2 4 6
8 5 9 3

That is, 3 + 7 + 4 + 9 = 23.

Find the maximum total from top to bottom of the triangle below:

95 64
17 47 82
18 35 87 10
20 04 82 47 65
19 01 23 75 03 34
88 02 77 73 07 63 67
99 65 04 28 06 16 70 92
41 41 26 56 83 40 80 70 33
41 48 72 33 47 32 37 16 94 29
53 71 44 65 25 43 91 52 97 51 14
70 11 33 28 77 73 17 78 39 68 17 57
91 71 52 38 17 14 91 43 58 50 27 29 48
63 66 04 68 89 53 67 30 73 16 69 87 40 31
04 62 98 27 23 09 70 98 73 93 38 53 60 04 23

It is of course possible to work out the sum of every single path, by starting at the top and trying every combination of left and right choices to get to the bottom, but for a triangle with n rows this works out at 2n-1 sums to calculate. This makes the brute-force approach impractical for the larger triangle in problem 67 (100 rows); it also involves a whole lot of unnecessary calculation.

The key to seeing this is understanding that (1) every point in the triangle above the bottom row is itself the apex of a sub-triangle, that has its own maximal path, and (2) where any sub-triangles intersect, this intersection contains another sub-triangle – so working out all the paths through the two intersecting triangles involves working out every path through the intersection twice. For example, in the triangle

7 4
2 4 6
8 5 9 3

the two sub-triangles with apexes at 7 and 4 both contain the sub-triangle

5 9

with two possible paths. So calculating every possible path through the larger triangle involves traversing both these paths twice, once for each sub-triangle containing them. It’s easy to see that this gets worse quickly as the triangle gets larger.

A smarter approach is what might be termed a ‘bottom-up’ method. Working from the base upwards, calculate the maximal path for every point in each row. Once you have the maxima for two adjacent points in a row, the maximum for the point above both these points is simply the sum of this point and the maximum of these two values. The advantage of this method is that the maximum for any point in the triangle is calculated once and only once. Here’s how this looks in Python.

First, a helper function get_triangle to read the string in and get a list of rows of integers making up the triangle:

def get_triangle(triangle_string):
    #reverse to work from the base up
    return [[int(x) for x in row.split()] \
        for row in string_list]

Then, the main routine max_path works through each row, progressively replacing the list of maxima at each point in the previous row with the same for the current row:

def max_path(triangle_string):
    triangle = get_triangle(triangle_string)
    #every point in the base is its own maximum
    max_list = triangle[0]
    for row in range(1, len(triangle)):
        #add the greater of the two adjacent maxima to
        #each point in the row
        max_list = \
            [triangle[row][x] + max(max_list[x], max_list[x+1]) \
             for x in range(len(triangle[row]))]
    return max_list[0]

That’s all for now – time permitting I will be posting the next part of the regular expression filter sometime in the next couple of days. Until then, hoping you all had a restful and happy Christmas and are taking some time off to be with family. Take care…


Project Euler #26

Problem 26 at Project Euler asks

A unit fraction contains 1 in the numerator. The decimal representation of the unit fractions with denominators 2 to 10 are given:

^(1)/_(2) = 0.5

^(1)/_(3) = 0.(3)

^(1)/_(4) = 0.25

^(1)/_(5) = 0.2

^(1)/_(6) = 0.1(6)

^(1)/_(7) = 0.(142857)

^(1)/_(8) = 0.125

^(1)/_(9) = 0.(1)

^(1)/_(10) = 0.1

Where 0.1(6) means 0.166666…, and has a 1-digit recurring cycle. It can be seen that ^(1)/_(7) has a 6-digit recurring cycle.

Find the value of d < 1000 for which ^(1)/_(d) contains the longest recurring cycle in its decimal fraction part.

This is a problem about repeated division by the same number – how many divisions can you do before you start repeating yourself? The division algorithm can be expressed as follows:

def div_by(rem, div):
    y = min([x for x in range(5) if rem*(10**x) > div])
    return rem*(10**y) % div

So then I need to use this to repeatedly divide by the test value d – len_pattern calls div_by until it detects a repeated remainder:

def len_pattern(init, rems, d):
    rem=div_by(init, d)
    if rem in rems:
        return len(rems) - rems.index(rem)
        return len_pattern(rem, rems, d)

I then have to use this to find out which d < 1000 has the longest cycle. However, I don't need to call it on all the numbers under 1000 – the only ones I really need to test are the primes. The upper bound on the length of the cycle is n-1 for any number n (for example the length of the cycle generated by 7 is 6), and primes are the only numbers which generate cycles of this length. So I need to use the list generated by the primes_under() function, then call len_pattern on each of these, returning the maximum length.

def get_longest():
    from EulerFunc import primes_under
    divs = dict([(div, len_pattern(1.0, [], div)) \
        for div in primes_under(1000) if div > 5])
    longest=max(divs,  key=lambda x: divs[x])
    print "Longest cycle is for %d with length %d" % \
        (longest, divs[longest])

if __name__ == '__main__':
    from timeit import Timer
    t = Timer("get_longest()",  "from __main__ import get_longest")
    print t.timeit(number=100)

>>> Longest cycle is for 983 with length 884
Longest cycle is for 983 with length 884
Longest cycle is for 983 with length 884
Longest cycle is for 983 with length 884

So it takes on average around 1.5 seconds to run. Not bad, I see some people had Python code which ran in under a second, so I’m sure this could be optimised…

Finding primes with Python

One of the Euler problems requires finding all the primes below a certain number x.  Without thinking about this very much, it’s possible to take a very easy approach. Let’s say, look at all the numbers below x and check whether each of them is prime. In pseudocode, this would go as follows:
for each y less than x
if y is prime, add it to the list
return the list

The only requirement for this to work is a test to determine whether y is prime. Well, that’s easy enough. This would do the trick:

import math

def is_prime(n):
  pre: n +ve integer
  returns: true if n prime, else false
  for i in range(2, int(math.sqrt(n)) + 1):
    if n % i == 0:
      return False
  return True

So, putting that into code, we’d get:

def primes_under(n):
  pre: n +ve integer
  returns: list of all primes under n
  result = [2]
  # only need to test odd numbers
  for x in xrange(3, n, 2):
    if is_prime(x):
  return result

Now the only problem with this approach is that, as the numbers get larger, the prime test takes longer. For n = 10,007 (which is prime) the test requires 100 division operations. Even removing even numbers from the list we’re checking, this mounts up to hundreds of thousands of operations getting through the full list. If we wanted to test up to, say, 1,000,000, the number of operations required at that point is up in the hundreds of millions. Clearly, this will get prohibitively expensive after this point – so it’s time to start thinking about a smarter way to do this.

One way which does not require nearly as many operations is to use a prime sieve. This was an idea that occurred to Eratosthenes a couple of thousand years ago – it still impresses me to consider its effectiveness and its simplicity. Here’s a quick rundown of the algorithm from its Wikipedia entry:

  1. Create a contiguous list of numbers from two to some highest number n.
  2. Strike out from the list all multiples of two (4, 6, 8 etc.).
  3. The list’s next number that has not been struck out is a prime number.
  4. Strike out from the list all multiples of the number you identified in the previous step.
  5. Repeat steps 3 and 4 until you reach a number that is greater than the square root of n (the highest number in the list).
  6. All the remaining numbers in the list are prime.

There is a trade-off here: this approach is far faster than testing all numbers (and the bigger the number, the more noticeable this gain is), but it requires the space in memory to hold the array. Quite often this is the expense of improvement: reduce one cost and another increases.
This is how the algorithm looks in Python:

import math

def primes_under(n):
  pre: n +ve integer
  returns: list of all primes under n
  sieve = [True]*n
  for x in xrange(4, n, 2):
    sieve[x] = False
  # any multiple of num above sqrt(n) is also a multiple of a
  # lower prime, hence has already been eliminated
  for num in xrange(3, int(math.sqrt(n)) + 1, 2):
    if sieve[num]:
      for y in xrange(num**2, n, 2*num):
        sieve[y] = False
  result = [x for x in range(2, n) if sieve[x]]
  return result

Project Euler #2

Ok, I don’t intend to post all of them, but here’s the second problem:

# Problem 2
# Each new term in the Fibonacci sequence is generated by adding the previous
# two terms. By starting with 1 and 2, the first 10 terms will be:

# 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …

# Find the sum of all the even-valued terms in the sequence which do not
# exceed four million.
def gen_Fib_even(a, b, below):
“””pre: a, b, below +ve integers
returns: generator with an iterator to obtain the Fibonacci sequence
while (b < below): if b % 2 == 0: yield b c = b b = a + b a = c print sum(gen_Fib_even(1, 1, 4000000))[/sourcecode]

Project Euler #1

Tonight I decided I’d give the problems at Project Euler a go using Python…here is my first shot at problem 1:

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.

Find the sum of all the multiples of 3 or 5 below 1000.


def sum_multiples(num1, num2, upper):
pre: num1, num2, upper +ve integers
num1, num2 < upper returns: sum of all multiples of num1 and num2 below upper """ result = 0 for x1 in xrange(0, upper, num1): result += x1 for x2 in xrange(0, upper, num2): if x2 % num1 != 0: result += x2 return result[/sourcecode] Looking through some of the solutions on the forum, there were some really insightful and elegant ways proposed to retrieve the result, often reducing the problem to a single calculation. I'm really looking forward to seeing what comes out of the later solutions.