Beal's Conjecture: A Search for Counterexamples

Beal's Conjecture is this:
There are no positive integers x,m,y,n,z,r satisfying the equation
xm + yn = zr
where m,n,r > 2 and x,y,z are co-prime (that is, gcd(x,y) = gcd(y,z) = gcd(x,z) = 1).

There is a $100,000 prize for the first proof or disproof of the conjecture. The conjecture is obviously related to Fermat's Last Theorem, which was proved true by Andrew Wiles in 1994. A wide array of sophisticated mathematical techniques could be used in the attempt to prove the conjecture true (and the majority of mathematicians competent to judge seem to believe that it likely is true).

Less sophisticated mathematics and computer programming can be used to try to prove the conjecture false by finding a counterexample. This page documents my progress in this direction.

There are two things that make this non-trivial. First, we quickly get beyond the range of 32 or 64 bit integers, so any program will need a way of dealing with arbitrary precision integers. Second, we need to search a very large space: there are six variables, and the only constraint on them is that the sum must add up.

Suppose we wanted to search, all bases (x,y,z) up to 100,000 and all powers (m,n,r) up to 10. Then the first problem is that we will be manipulating integers up to 100,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 (that is, 1050). And we will be dealing with 1,000,000,000,000,000,000 (that is, 1018) potential combinations of integers.

The "large integer" problem requires either a language that supports such integers directly, such as Lisp or Python, or a package that adds the functionality, such as the NTL package for C++.

The "many combinations" problems requires a clever search algorithm. As a start, we can cut the space in half (without loss of solutions) by only considering combinations with x less than or equal to y. This helps a little, getting us down to 0.5 * 1018 combinations. The next step is to eliminate the need to consider combinations of z, r. The idea is that we can precompute all zr combinations and store them in a hash table. Then when we get a xm + yn sum, we just look it up in the table, rather than enumerating all z, r pairs. This gets us down to 0.5 * 1012 combinations, at the cost of having to store 106 large integers. Both storage and projected computation times are now within range of what I can expect to handle on my home PC.

Now its time to implement the program. I chose Python as the implementation language because it happened to already be installed on the machine I had (I also had Lisp installed, but it was an evaluation copy that was limited in the amount of memory allowed, so I couldn't use it for this problem). The fact that Python is an interpreted language shouldn't matter much, since almost all of the execution time is going into the large integer arithmetic and in fetching and inserting entries in tables. So if you want to make the program faster, don't automatically think of C as the answer; instead look for the best implementation of large integers and tables.

The Python program is very straightforward: enumerate all x,y,m,n combinations, and for each one check if the sum of the exponents is in the table. Initial testing on small search spaces was rewarding: it took only 3.3 minutes to verify that there are no solutions with x,y,m,n less than 100. I was then ready to scale up, but first I made some very minor improvements to the algorithm: instead of looking up the sum in the table with table.get(x**m + y**n), I moved each part of the calculation out of as many loops as possible, and pre-computed as much as possible. In other words, I changed from the original:

def beal(max_base, max_power):
    bases, powers, table = initial_data(max_base, max_power)
    for x in bases: 
        for y in bases:
            if y > x or gcd(x,y) > 1: continue
            for m in powers:
                for n in powers:
                    sum = x**m + y**n
                    r = table.get(sum)
                    if r: report(x, m, y, n, nth_root(sum, r), r)
to the optimized version running about 2.8 times faster:

def beal(max_base, max_power):
    bases, powers, table, pow = initial_data(max_base, max_power)
    for x in bases: 
        powx = pow[x]
        for y in bases:
            if y > x or gcd(x,y) > 1: continue
            powy = pow[y]
            for m in powers:
                xm = powx[m]
                for n in powers:
                    sum = xm + powy[n]
                    r = table.get(sum)
                    if r: report(x, m, y, n, nth_root(sum, r), r)

Results

Alas, I have found no counterexamples yet. But I can tell you some places where you shouldn't look for them (unless you think there's an error in my program). Running my program on a 400 MHz PC using Python 1.5 I found that:
beal(   100, 100)  took    0.6 hours (   9K elements in table)
beal(   100,1000)  took   19.3 hours (  92K elements in table)
beal(  1000, 100)  took    6.2 hours (  95K elements in table)
beal( 10000,  30)  took   52   hours ( 278K elements in table)
beal( 10000, 100)  took  933   hours ( 974K elements in table)
beal( 50000,  10)  took  109   hours ( 399K elements in table) 
beal(100000,  10)  took  445   hours ( 798K elements in table)
beal(250000,   7)  took 1323   hours (1249K elements in table)
To put it another way, in the following table a red cell indicates there is no solution involving a bp for any value of b and p less than or equal to the given number. A yellow cell with a "?" means we don't know yet. Run times are given within some cells in minutes (m), hours (h) or days (d).

 p=7p=10p=30p=100p=1000
b=100---3m19h
b=1,000---6h?
b=10,000--2d39d?
b=100,000-18d???
b=250,00055d????

How do we know the program is correct? We don't for sure, but removing "or gcd(x,y) > 1" prints some non-co-prime solutions that can be verified by hand, suggesting that we're doing something right.

Complete Program

Here is the complete Python program. You have my permission to use this program for anything you want. I give no guarantees of correctness, or of suitability for any purpose.

"""Print counterexamples to Beal's conjecture.
That is, find positive integers x,m,y,n,z,r such that:
x^m + y^n = z^r and m,n,r > 2 and x,y,z co-prime (pairwise no common factor).

ALGORITHM: Initialize the variables table, pow, bases, powers such that:
        pow[z][r] = z**r
        table.get(sum) = r if there is a z such that z**r = sum.  
        bases = [1, 2, ... max_base]
        powers = [3, 4, ... max_power]
Then enumerate x,y,m,n,	and do table.get(pow[x][m]+pow[y][n]).  If we get 
something back, report it as a result.  We consider all values of x,y,z
in bases, and all values of m,n,r in powers.
"""

def beal(max_base, max_power):
    bases, powers, table, pow = initial_data(max_base, max_power)
    for x in bases: 
        powx = pow[x]
        if x % 1000 == 0: print 'x =', x
        for y in bases:
            if y > x or gcd(x,y) > 1: continue
            powy = pow[y]
            for m in powers:
                xm = powx[m]
                for n in powers:
                    sum = xm + powy[n]
                    r = table.get(sum)
                    if r: report(x, m, y, n, nth_root(sum, r), r)
                
def initial_data(max_base, max_power):
    bases = range(1, max_base+1)
    powers = range(3, max_power+1)
    table = {}
    pow = [None] * (max_base+1)
    for z in bases:
        pow[z] = [None] * (max_power+1)
        for r in powers:
            zr = long(z) ** r
            pow[z][r] = zr
            table[zr] = r
    print 'initialized %d table elements' % len(table)
    return bases, powers, table, pow  

def report(x, m, y, n, z, r):
    x, y, z = map(long, (x, y, z))
    assert min(x, y, z) > 0 and min(m, n, r) > 2 
    assert x ** m + y ** n == z ** r
    print '%d ^ %d + %d ^ %d = %d ^ %d = %s' % \
          ( x,   m,   y,   n,   z,   r,  z**r)
    if gcd(x,y) == gcd(x,z) == gcd(y, z) == 1: 
        raise 'a ruckus: SOLUTION!!'

def gcd(x, y):
    while x:
        x, y = y % x, x
    return y

def nth_root(base, n): 
    return long(round(base ** (1.0/n)))

import time

def timer(b, p):
    start = time.clock()
    beal(b, p)
    secs = time.clock() - start
    return {'secs': secs, 'mins': secs/60, 'hrs': secs/60/60}

Next Steps

This program is nearly at the end of its usefulness. I could run it for slightly larger numbers, but I'm approaching the limits of both my patience with the run times and the memory limits of my computer. (I suppose I could always go buy more memory).

I think the obvious next step is to re-implement the algorithm in C for speed (probably with a special-purpose hash table implementation tuned for the problem), and with arithmetic done modulo 264 (or maybe do the arithmetic modulo several large primes, and keep all the results). When we get a solution modulo 264, it may or may not be a real solution, but we could print it to a file and afterwards run a simple Python or Lisp program to determine if it is a real solution. This approach will probably run about 100 times faster, and require about 1/5th to 1/10th the storage, so it can take us out to maybe 1,000,00020.

Beyond that, we need a new approach. My first idea is to only do a slice of the zr values at a time. This would require switching from an approach that limits the bases and powers to one that limits the minimum and maximum sum searched for. That is, we would call something like beal2(10 ** 20, 10 ** 50) to search for all solutions with sum between 1020 and 1050. The program would build a table of all zr values in that range, and then carefully enumerate x,m,y,n to stay within that range. One could then take this program and add a protocol like SETI@home where interested people could download the code, be assigned min and max values to search on, and a large network of people could together search for solutions.

Ultimately, I think that if there are any counterexamples at all, they will probably be found through generating functions based on elliptical curves, or some other such approach. But that's too much math for me; I'm just trying to apply the minimal amount of computer programming to look for the "obvious" counterexamples.

History of my Efforts

It all started when my wife pointed out an article about the one million dollar millennium prize offered by the Clay Mathematics Institute for seven open mathematical problems. My six-year old daughter Isabella said "I'm good at math. Can I do that and win the prize? Then I can buy that toy I wanted." I recognized the equation in the article as the Riemann Hypothesis, which I knew is likely to require complex analysis to solve, not just the adding and take-aways that Isabella's kindergarden education had equipped her for. But, I said, I did recall another contest related to Fermat's last Theorem, which would also be difficult to prove true, but which might be shown to be false with some adding and multiplying (of numbers that are a bit beyond the typical kindergarden range). I was able to find the Beal Conjecture Page, but so far I have not been able to win the $75,000 prize. Maybe you can. If you use my code to do it, save me at least enough money to buy Isabella's toy.


Peter Norvig