Project Euler #26

Problem 26 at Project Euler asks

A unit fraction contains 1 in the numerator. The decimal representation of the unit fractions with denominators 2 to 10 are given:

^(1)/_(2) = 0.5

^(1)/_(3) = 0.(3)

^(1)/_(4) = 0.25

^(1)/_(5) = 0.2

^(1)/_(6) = 0.1(6)

^(1)/_(7) = 0.(142857)

^(1)/_(8) = 0.125

^(1)/_(9) = 0.(1)

^(1)/_(10) = 0.1

Where 0.1(6) means 0.166666…, and has a 1-digit recurring cycle. It can be seen that ^(1)/_(7) has a 6-digit recurring cycle.

Find the value of d < 1000 for which ^(1)/_(d) contains the longest recurring cycle in its decimal fraction part.

This is a problem about repeated division by the same number – how many divisions can you do before you start repeating yourself? The division algorithm can be expressed as follows:

def div_by(rem, div):
    y = min([x for x in range(5) if rem*(10**x) > div])
    return rem*(10**y) % div

So then I need to use this to repeatedly divide by the test value d – len_pattern calls div_by until it detects a repeated remainder:

def len_pattern(init, rems, d):
    rem=div_by(init, d)
    if rem in rems:
        return len(rems) - rems.index(rem)
    else:
        rems.append(rem)
        return len_pattern(rem, rems, d)

I then have to use this to find out which d < 1000 has the longest cycle. However, I don't need to call it on all the numbers under 1000 – the only ones I really need to test are the primes. The upper bound on the length of the cycle is n-1 for any number n (for example the length of the cycle generated by 7 is 6), and primes are the only numbers which generate cycles of this length. So I need to use the list generated by the primes_under() function, then call len_pattern on each of these, returning the maximum length.

def get_longest():
    from EulerFunc import primes_under
    divs = dict([(div, len_pattern(1.0, [], div)) \
        for div in primes_under(1000) if div > 5])
    longest=max(divs,  key=lambda x: divs[x])
    print "Longest cycle is for %d with length %d" % \
        (longest, divs[longest])

if __name__ == '__main__':
    from timeit import Timer
    t = Timer("get_longest()",  "from __main__ import get_longest")
    print t.timeit(number=100)

>>> Longest cycle is for 983 with length 884
Longest cycle is for 983 with length 884
Longest cycle is for 983 with length 884
.
.
.
Longest cycle is for 983 with length 884
151.166881084

So it takes on average around 1.5 seconds to run. Not bad, I see some people had Python code which ran in under a second, so I’m sure this could be optimised…

Finding primes with Python

One of the Euler problems requires finding all the primes below a certain number x.  Without thinking about this very much, it’s possible to take a very easy approach. Let’s say, look at all the numbers below x and check whether each of them is prime. In pseudocode, this would go as follows:
for each y less than x
if y is prime, add it to the list
return the list

The only requirement for this to work is a test to determine whether y is prime. Well, that’s easy enough. This would do the trick:

import math

def is_prime(n):
  """
  pre: n +ve integer
  returns: true if n prime, else false
  """
  for i in range(2, int(math.sqrt(n)) + 1):
    if n % i == 0:
      return False
  return True

So, putting that into code, we’d get:

def primes_under(n):
  """
  pre: n +ve integer
  returns: list of all primes under n
  """
  result = [2]
  # only need to test odd numbers
  for x in xrange(3, n, 2):
    if is_prime(x):
      result.append(x)
  return result

Now the only problem with this approach is that, as the numbers get larger, the prime test takes longer. For n = 10,007 (which is prime) the test requires 100 division operations. Even removing even numbers from the list we’re checking, this mounts up to hundreds of thousands of operations getting through the full list. If we wanted to test up to, say, 1,000,000, the number of operations required at that point is up in the hundreds of millions. Clearly, this will get prohibitively expensive after this point – so it’s time to start thinking about a smarter way to do this.

One way which does not require nearly as many operations is to use a prime sieve. This was an idea that occurred to Eratosthenes a couple of thousand years ago – it still impresses me to consider its effectiveness and its simplicity. Here’s a quick rundown of the algorithm from its Wikipedia entry:

  1. Create a contiguous list of numbers from two to some highest number n.
  2. Strike out from the list all multiples of two (4, 6, 8 etc.).
  3. The list’s next number that has not been struck out is a prime number.
  4. Strike out from the list all multiples of the number you identified in the previous step.
  5. Repeat steps 3 and 4 until you reach a number that is greater than the square root of n (the highest number in the list).
  6. All the remaining numbers in the list are prime.

There is a trade-off here: this approach is far faster than testing all numbers (and the bigger the number, the more noticeable this gain is), but it requires the space in memory to hold the array. Quite often this is the expense of improvement: reduce one cost and another increases.
This is how the algorithm looks in Python:

import math

def primes_under(n):
  """
  pre: n +ve integer
  returns: list of all primes under n
  """
  sieve = [True]*n
  for x in xrange(4, n, 2):
    sieve[x] = False
  # any multiple of num above sqrt(n) is also a multiple of a
  # lower prime, hence has already been eliminated
  for num in xrange(3, int(math.sqrt(n)) + 1, 2):
    if sieve[num]:
      for y in xrange(num**2, n, 2*num):
        sieve[y] = False
  result = [x for x in range(2, n) if sieve[x]]
  return result