In 1637, Pierre de Fermat wrote in the margin of a book that he had a proof of his famous "Last Theorem":

If $A^n + B^n = C^n$,

where $A, B, C, n$ are positive integers

then $n \le 2$.

Centuries passed before Andrew Beal, a businessman and amateur mathematician, made his conjecture in 1993:

If $A^x + B^y = C^z$,

where $A, B, C, x, y, z$ are positive integers and $x, y, z$ are all greater than $2$,

then $A, B$ and $C$ must have a common prime factor.

Andrew Wiles proved Fermat's theorem in 1995, but Beal's conjecture remains unproved, and Beal has offered \$1,000,000 for a proof or disproof. I don't have the mathematical skills of Wiles, so I could never find a proof, but I can write a program to search for counterexamples. I first wrote that program in 2000, and my name got associated with Beal's Conjecture, which means I get a lot of emails with purported proofs or counterexamples (many asking how they can collect their prize money). So far, all the emails have been wrong. This page catalogs some of the more common errors and updates my 2000 program.

- A proof must show that there are no examples that satisfy the conditions. A common error is to show how a certain pattern generates an infinite collection of numbers that satisfy $A^x + B^y = C^z$ and then show that in all of these, $A, B, C$ have a common factor. But that's not good enough, unless you can also prove that no other pattern exists.

- It is valid to use proof by contradiction: assume the conjecture is false, and show that that leads to a contradiction. It is not valid to use circular reasoning: assume the conjecture is true, put in some irrelevant steps, and show that it follows that the conjecture is true.

- A valid counterexample needs to satisfy all four conditions—don't leave one out:

$A, B, C, x, y, z$ are positive integers

$x, y, z > 2$

$A^x + B^y = C^z$

$A, B, C$ have no common prime factor.

(If you think you might have a valid counterexample, before you share it with Andrew Beal or anyone else, you can check it with my Online Beal Counterexample Checker.)

- One correspondent claimed that $27^4 + 162 ^ 3 = 9 ^ 7$ was a solution, because the first three conditions hold, and the common factor is 9, which isn't a prime. But of course, if $A, B, C$ have 9 as a common factor, then they also have 3, and 3 is prime. The phrase "no common prime factor" means the same thing as "no common factor greater than 1."

- Another claimed that $2^3+2^3=2^4$ was a counterexample, because all the bases are 2, which is prime, and prime numbers have no prime factors. But that's not true; a prime number has itself as a factor.

- A creative person offered $1359072^4 - 940896^4 = 137998080^3$, which fails both because $3^3 2^5 11^2$ is a common factor, and because it has a subtraction rather than an addition (although, as Julius Jacobsen pointed out, that can be rectified by adding $940896^4$ to both sides).

- Mustafa Pehlivan came up with an example involving 76-million-digit numbers, which took some work to prove wrong (by using modular arithmetic).

- Another Beal fan started by saying "Let $C = 43$ and $z = 3$. Since $43 = 21 + 22$, we have $43^3 = (21^3 + 22^3).$" But of course $(a + b)^3 \ne (a^3 + b^3)$. This fallacy is called the freshman's dream (although I remember having different dreams as a freshman).

- Multiple people proposed answers similar to this one:

In [1]:

```
from math import gcd #### In Python versions < 3.5, use "from fractions import gcd"
```

In [2]:

```
A, B, C = 60000000000000000000, 70000000000000000000, 82376613842809255677
x = y = z = 3.
A ** x + B ** y == C ** z and gcd(gcd(A, B), C) == 1
```

Out[2]:

**WOW! The result is True!** Is this a real counterexample to Beal? And also a disproof of Fermat?

Alas, it is not. Notice the decimal point in "`3.`", indicating a floating point number, with inexact, limited precision. Change the inexact "`3.`" to an exact "`3`" and the result changes to "`False`". Below we see that the two sides of the equation are the same for the first 18 digits, but differ starting with the 19th:

In [3]:

```
(A ** 3 + B ** 3,
C ** 3)
```

Out[3]:

They say "close" only counts in horseshoes and hand grenades, and if you threw two horseshoes at a stake on the planet Kapteyn-b (a possibly habitable and thus possibly horseshoe-playing exoplanet 12.8 light years from Earth) and the two paths differed in the 19th digit, the horseshoes would end up less than an inch apart. That's really, really close, but close doesn't count in number theory.
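Exact integer arithmetic sidesteps this pitfall. As a minimal sketch (the function name `is_beal_counterexample` is my own illustration here, and I am not claiming it is how the online checker works), the four conditions listed earlier can be tested like this, rejecting any non-integer input such as `3.` outright:

```python
from math import gcd

def is_beal_counterexample(A, B, C, x, y, z):
    """True only if all four conditions hold, using exact integer arithmetic.
    (Illustrative helper; the name and structure are assumptions.)"""
    values = (A, B, C, x, y, z)
    # Condition 1: positive integers only. Floats such as 3. are rejected,
    # and so is bool (a subclass of int in Python).
    if not all(isinstance(v, int) and not isinstance(v, bool) and v > 0
               for v in values):
        return False
    # Condition 2: all exponents greater than 2.
    if min(x, y, z) <= 2:
        return False
    # Condition 3: the equation holds exactly.
    if A ** x + B ** y != C ** z:
        return False
    # Condition 4: no common prime factor, i.e. the gcd of all three is 1.
    return gcd(gcd(A, B), C) == 1

# The claims discussed above all fail at least one condition.
print(is_beal_counterexample(27, 162, 9, 4, 3, 7))   # common factor 9 (so also 3)
print(is_beal_counterexample(2, 2, 2, 3, 3, 4))      # common factor 2
print(is_beal_counterexample(60000000000000000000, 70000000000000000000,
                             82376613842809255677, 3., 3., 3.))  # float exponents
```

No input is known for which this returns `True`; if one were found, it would settle the conjecture.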

In two different episodes of *The Simpsons*, close counterexamples to Fermat's Last Theorem are shown:
$1782^{12} + 1841^{12} = 1922^{12}$ and $3987^{12} + 4365^{12} = 4472^{12}$. These were designed by *Simpsons* writer David X. Cohen to be correct up to the precision found in most handheld calculators. Cohen found the equations with a program that must have been something like this:

In [4]:

```
from itertools import combinations

def simpsons(bases, powers):
    """Find the integers (A, B, C, n) that come closest to solving
    Fermat's equation, A ** n + B ** n == C ** n.
    Let A, B range over all pairs of bases and n over all powers."""
    equations = ((A, B, iroot(A ** n + B ** n, n), n)
                 for A, B in combinations(bases, 2)
                 for n in powers)
    return min(equations, key=relative_error)

def iroot(i, n):
    "The integer closest to the nth root of i."
    return int(round(i ** (1./n)))

def relative_error(equation):
    "Error between LHS and RHS of equation, relative to RHS."
    (A, B, C, n) = equation
    LHS = A ** n + B ** n
    RHS = C ** n
    return abs(LHS - RHS) / RHS
```

In [5]:

```
simpsons(range(1000, 2000), [11, 12, 13])
```

Out[5]:

In [6]:

```
simpsons(range(3000, 5000), [12])
```

Out[6]:
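As a small aside, here is a sketch of how one might confirm exactly that the two *Simpsons* equations are near misses rather than true equalities. The helper name `fermat_gap` is my own, and the parity and divisibility arguments in the comments are elementary checks I am adding for illustration, not something from Cohen's program:

```python
from fractions import Fraction

def fermat_gap(A, B, C, n):
    """Exact relative error |A**n + B**n - C**n| / C**n, as a Fraction.
    (Illustrative helper; the name is an assumption.)"""
    lhs, rhs = A ** n + B ** n, C ** n
    return Fraction(abs(lhs - rhs), rhs)

for (A, B, C, n) in [(1782, 1841, 1922, 12), (3987, 4365, 4472, 12)]:
    gap = fermat_gap(A, B, C, n)
    print(A, B, C, n, 'relative error about {:.1e}'.format(float(gap)))

# No big arithmetic is needed to see that both must fail:
# 1782**12 is even and 1841**12 is odd, so their sum is odd, but 1922**12 is even.
print((1782 ** 12 + 1841 ** 12) % 2, (1922 ** 12) % 2)
# 3987 and 4365 are both divisible by 3, but 4472 is not.
print((3987 ** 12 + 4365 ** 12) % 3, (4472 ** 12) % 3 != 0)
```

The gaps are tiny but nonzero, which is exactly the property that makes these equations convincing on a calculator with limited precision.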

In October 2015 I looked back at my original program from 2000. I ported it from Python 1.5 to 3.5 (by putting parens around the argument to `print` and adding `long = int`). The program runs 250 times faster today than it did in 2000, a tribute to both computer hardware engineers and the developers of the Python interpreter.

I found that I was a bit confused about the definition of the problem in 2000. I thought then that, *by definition*, $A$ and $B$ could not have a common factor, but actually, the definition of the conjecture only rules out examples where all three of $A, B, C$ share a common factor. Mark Tiefenbruck (and later Edward P. Berlin and Shen Lixing) wrote to point out that my thought was actually correct, not by definition, but by derivation: if $A$ and $B$ have a common prime factor $p$, then the sum $A^x + B^y$ must also have that factor $p$, and since $A^x + B^y = C^z$, then $C^z$ and hence $C$ must have the factor $p$. So I was wrong twice, and in this case two wrongs did make a right.

Mark Tiefenbruck also suggested an optimization: only consider exponents that are odd primes, or 4. The idea is that a number like 512 can be expressed as either $2^9$ or $8^3$, and my program doesn't need to consider both. In general, any time we have a composite exponent, such as $b^{qp}$, where $p$ is prime, we should ignore $b^{(qp)}$, and instead consider only $(b^q)^p$. There's one complication to this scheme: 2 is a prime, but 2 is not a valid exponent for a Beal counterexample. So we will allow 4 as an exponent, as well as all odd primes up to `max_x`.
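To make the exponent reduction concrete, here is a minimal sketch. The helper `canonical_power` is hypothetical (it is not part of the program below); it rewrites any power $b^e$ with $e > 2$ as a base raised to an odd prime or to 4, relying on the fact that every such $e$ either has an odd prime factor or is a power of 2 that is at least 4:

```python
def canonical_power(b, e):
    """Rewrite b ** e as (base, p) with base ** p == b ** e, where p is an
    odd prime or 4. (Hypothetical helper, for illustration only.)"""
    if e < 3:
        raise ValueError("Beal exponents must be greater than 2")
    # The smallest odd divisor >= 3 of e, if there is one, is necessarily prime.
    p = next((d for d in range(3, e + 1, 2) if e % d == 0), None)
    if p is None:
        p = 4  # e has no odd divisor, so it is a power of 2 that is at least 4
    return b ** (e // p), p

print(canonical_power(2, 9))   # 512 expressed with a prime exponent
print(canonical_power(2, 8))   # a power-of-2 exponent falls back to 4
print(canonical_power(5, 7))   # a prime exponent is left alone
```

Note that the reduced base can be larger than the original one (for example, 8 instead of 2), which matters for the search bounds discussed later.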

Here is the complete, updated, refactored, optimized program:

In [7]:

```
from math import gcd, log
from itertools import combinations, product

def beal(max_A, max_x):
    """See if any A ** x + B ** y equals some C ** z, with gcd(A, B) == 1.
    Consider any 1 <= A,B <= max_A and x,y <= max_x, with x,y prime or 4."""
    Apowers = make_Apowers(max_A, max_x)
    Czroots = make_Czroots(Apowers)
    for (A, B) in combinations(Apowers, 2):
        if gcd(A, B) == 1:
            for (Ax, By) in product(Apowers[A], Apowers[B]):
                Cz = Ax + By
                if Cz in Czroots:
                    C = Czroots[Cz]
                    x, y, z = exponent(Ax, A), exponent(By, B), exponent(Cz, C)
                    print('{} ** {} + {} ** {} == {} ** {} == {}'
                          .format(A, x, B, y, C, z, C ** z))

def make_Apowers(max_A, max_x):
    "A dict of {A: [A**3, A**4, ...], ...}."
    exponents = exponents_upto(max_x)
    return {A: [A ** x for x in (exponents if (A != 1) else [3])]
            for A in range(1, max_A+1)}

def make_Czroots(Apowers): return {Cz: C for C in Apowers for Cz in Apowers[C]}

def exponents_upto(max_x):
    "Return all odd primes up to max_x, as well as 4."
    exponents = [3, 4] if max_x >= 4 else [3] if max_x == 3 else []
    # Use max_x + 1 so that an odd prime max_x is itself included.
    for x in range(5, max_x + 1, 2):
        if not any(x % p == 0 for p in exponents):
            exponents.append(x)
    return exponents

def exponent(Cz, C):
    """Recover z such that C ** z == Cz (or equivalently z = log Cz base C).
    For exponent(1, 1), arbitrarily choose to return 3."""
    return 3 if (Cz == C == 1) else int(round(log(Cz, C)))

In [8]:

```
%time beal(100, 100)
```

The run time grows roughly with the square of `max_A`, so the following should take about 100 times longer:

In [9]:

```
%time beal(1000, 100)
```

How `beal` Works

The function `beal` first does some precomputation, creating two data structures:

- `Apowers`: a dict of the form `{A: [A**3, A**4, ...]}` giving the nonredundant powers (prime and 4th powers) of each base, `A`, from 3 to `max_x`.

- `Czroots`: a dict of `{C**z : C}` pairs, giving the zth root of each power in `Apowers`.

Here is a very small example Apowers table:

In [10]:

```
Apowers = make_Apowers(6, 10)
Apowers
```

Out[10]:

Then we enumerate all combinations of two bases, `A` and `B`, from `Apowers`. Consider the combination where `A` is `3` and `B` is `6`. Of course `gcd(3, 6) == 3`, so the program would not consider them further, but imagine if they did not share a common factor. Then we would look at all possible `Ax + By` sums, for `Ax` in `[27, 81, 243, 2187]` and `By` in `[216, 1296, 7776, 279936]`. One of these would be `27 + 216`, which sums to `243`. We look up `243` in `Czroots`:

In [11]:

```
Czroots = make_Czroots(Apowers)
Czroots
```

Out[11]:

In [12]:

```
Czroots[243]
```

Out[12]:

We see that `243` is in `Czroots`, with value `3`, so this would be a counterexample (except for the common factor). The program uses the `exponent` function to recover the values of `x, y, z`, and prints the results.

Can we gain confidence in the program? It is difficult to test `beal`, because the expected output is nothing, for all known inputs. One thing we can do is verify that `beal` finds cases like `3 ** 3 + 6 ** 3 == 3 ** 5 == 243` that would be a counterexample except for the common factor `3`. We can test this by temporarily replacing the `gcd` function with a mock function that always reports no common factors:

In [13]:

```
def gcd(a, b): return 1
beal(100, 100)
```

Let's make sure all those expressions are true:

In [14]:

```
{3 ** 3 + 6 ** 3 == 3 ** 5 == 243,
7 ** 7 + 49 ** 3 == 98 ** 3 == 941192,
8 ** 4 + 16 ** 3 == 2 ** 13 == 8192,
8 ** 5 + 32 ** 3 == 16 ** 4 == 65536,
9 ** 3 + 18 ** 3 == 9 ** 4 == 6561,
16 ** 5 + 32 ** 4 == 8 ** 7 == 2097152,
17 ** 4 + 34 ** 4 == 17 ** 5 == 1419857,
19 ** 4 + 38 ** 3 == 57 ** 3 == 185193,
27 ** 3 + 54 ** 3 == 3 ** 11 == 177147,
28 ** 3 + 84 ** 3 == 28 ** 4 == 614656,
34 ** 5 + 51 ** 4 == 85 ** 4 == 52200625}
```

Out[14]:

I get nervous having an incorrect version of `gcd` around; let's change it back, quick!

In [15]:

```
from math import gcd
beal(100, 100)
```

We can also provide some test cases for the subfunctions of `beal`:

In [16]:

```
def tests():
    assert make_Apowers(6, 10) == {
        1: [1],
        2: [8, 16, 32, 128],
        3: [27, 81, 243, 2187],
        4: [64, 256, 1024, 16384],
        5: [125, 625, 3125, 78125],
        6: [216, 1296, 7776, 279936]}

    assert make_Czroots(make_Apowers(5, 8)) == {
        1: 1, 8: 2, 16: 2, 27: 3, 32: 2, 64: 4, 81: 3,
        125: 5, 128: 2, 243: 3, 256: 4, 625: 5, 1024: 4,
        2187: 3, 3125: 5, 16384: 4, 78125: 5}

    Czroots = make_Czroots(make_Apowers(100, 100))
    assert 3 ** 3 + 6 ** 3 in Czroots
    assert 99 ** 97 in Czroots
    assert 101 ** 100 not in Czroots
    assert Czroots[99 ** 97] == 99

    assert exponent(10 ** 5, 10) == 5
    assert exponent(7 ** 3, 7) == 3
    assert exponent(1234 ** 999, 1234) == 999
    assert exponent(12345 ** 6789, 12345) == 6789
    assert exponent(3 ** 10000, 3) == 10000
    assert exponent(1, 1) == 3

    assert exponents_upto(2) == []
    assert exponents_upto(3) == [3]
    assert exponents_upto(4) == [3, 4]
    assert exponents_upto(40) == [3, 4, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
    assert exponents_upto(100) == [
        3, 4, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61,
        67, 71, 73, 79, 83, 89, 97]

    assert gcd(3, 6) == 3
    assert gcd(3, 7) == 1
    assert gcd(861591083269373931, 94815872265407) == 97
    assert gcd(2*3*5*(7**10)*(11**12), 3*(7**5)*(11**13)*17) == 3*(7**5)*(11**12)

    return 'tests pass'

tests()
```

Out[16]:

The program is mostly straightforward, but relies on the correctness of these arguments:

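- Are we justified in taking `combinations` without replacements from the `Apowers` table? In other words, are we sure there are no solutions of the form $A^x + A^y = C^z$, with the same base used twice? Yes, we can be sure, because every prime factor of $A$ divides the left-hand side, so it must also divide $C^z$ and hence $C$. (The base $A = 1$ has no prime factors, but then the sum is $1 + 1 = 2$, which is not a perfect power.)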

- Are we justified in having a single value for each key in the `Czroots` table? Consider that $81 = 3^4 = 9^2$. We put `{81: 3}` in the table and discard `{81: 9}`, because any number that has 9 as a factor will always have 3 as a factor as well, so 3 is all we need to know. But what if a number could be formed with two bases where neither was a multiple of the other? For example, what if $2^7 = 5^3 = s$; then wouldn't we have to have both 2 and 5 as values for $s$ in the table? Fortunately, that can never happen, because of the fundamental theorem of arithmetic.

- Could there be a rounding error involving the `exponent` function that was not caught by the tests? Possibly; but `exponent` is not used to find counterexamples, only to print them, so any such error wouldn't cause us to miss a counterexample.

- Are we justified in only considering exponents that are odd primes, or the number 4? In one sense, yes, because when we consider the two terms $A^{(qp)}$ and $(A^q)^p$, we find they are always equal, and always have the same prime factors (the factors of $A$), so for the purposes of the Beal problem, they are equivalent, and we only need consider one of them. In another sense, there is a difference. With this optimization, when we run `beal(6, 10)`, we are no longer testing $512$ as a value of $A^x$ or $B^y$, even though $512 = 2^9$ and both $2$ and $9$ are within range, because the program chooses to express $512$ as $8^3$, and $8$ is not in the specified range. So the program is still correctly searching for counterexamples, but the space that it searches for given `max_A` and `max_x` is different with this optimization.

- Are we really sure that when $A$ and $B$ have a common factor greater than 1, then $C$ also shares a common factor with them? Yes, because any common factor greater than 1 has a prime factor $p$; since $p$ is a factor of both $A$ and $B$, it is a factor of $A^x + B^y$, and since we know this is equal to $C^z$, $p$ must also be a factor of $C^z$, and thus (because $p$ is prime) a factor of $C$.
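The last argument in this list can be spot-checked by brute force. The sketch below is only an illustration over a small range (the function names and the bounds are my own assumptions, and a finite check is of course not a proof): for every pair of bases with a shared prime factor $p$, it confirms that $p$ divides $A^x + B^y$.

```python
from math import gcd

def prime_factors(n):
    "The set of prime factors of n, by trial division (fine for small n)."
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors

def shared_factor_carries_over(max_base=30, exponents=(3, 4, 5)):
    """Check that every prime p dividing both A and B also divides A**x + B**y.
    (Illustrative brute-force check over a small, arbitrary range.)"""
    for A in range(2, max_base + 1):
        for B in range(2, max_base + 1):
            for p in prime_factors(gcd(A, B)):
                for x in exponents:
                    for y in exponents:
                        if (A ** x + B ** y) % p != 0:
                            return False
    return True

print(shared_factor_carries_over())
```

The step from "$p$ divides $C^z$" to "$p$ divides $C$" is where primality matters: 4 divides $2^3 \cdot 2 = 16 = 2^4$, but 4 does not divide 2.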

Arithmetic is slow with integers that have thousands of digits. If we want to explore much further, we'll have to make the program more efficient. An obvious improvement would be to do all the arithmetic modulo some number $m$. Then we know:

$$\mbox{if} ~~ A^x + B^y = C^z ~~ \mbox{then} ~~ (A^x \;(\mbox{mod} ~ m) + B^y \;(\mbox{mod} ~ m)) \;(\mbox{mod} ~ m) = C^z \;(\mbox{mod} ~ m)$$

So we can do efficient tests modulo $m$, and then do the full arithmetic only for combinations that work modulo $m$. Unfortunately there will be collisions (two numbers that are distinct, but are equal mod $m$), so the tables will have to have lists of values. Here is a simple, unoptimized implementation:

In [17]:

```
from math import gcd
from itertools import combinations, product
from collections import defaultdict

def beal_modm(max_A, max_x, m=2**31-1):
    """See if any A ** x + B ** y equals some C ** z (mod m), with gcd(A, B) == 1.
    If so, verify that the equation works without the (mod m).
    Consider any 1 <= A,B <= max_A and x,y <= max_x, with x,y prime or 4."""
    assert m >= max_A
    Apowers = make_Apowers_modm(max_A, max_x, m)
    Czroots = make_Czroots_modm(Apowers)
    for (A, B) in combinations(Apowers, 2):
        if gcd(A, B) == 1:
            for (Axm, x), (Bym, y) in product(Apowers[A], Apowers[B]):
                Czm = (Axm + Bym) % m
                if Czm in Czroots:
                    # A match mod m may be a collision; confirm with full arithmetic.
                    lhs = A ** x + B ** y
                    for (C, z) in Czroots[Czm]:
                        if lhs == C ** z:
                            print('{} ** {} + {} ** {} == {} ** {} == {}'
                                  .format(A, x, B, y, C, z, C ** z))

def make_Apowers_modm(max_A, max_x, m):
    "A dict of {A: [(A**3 (mod m), 3), (A**4 (mod m), 4), ...]}."
    exponents = exponents_upto(max_x)
    return {A: [(pow(A, x, m), x) for x in (exponents if (A != 1) else [3])]
            for A in range(1, max_A+1)}

def make_Czroots_modm(Apowers):
    "A dict of {C**z (mod m): [(C, z),...]}"
    Czroots = defaultdict(list)
    for A in Apowers:
        for (Axm, x) in Apowers[A]:
            Czroots[Axm].append((A, x))
    return Czroots
```
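A quick illustration of why the full-precision check is still needed after a modular match: agreeing mod $m$ is necessary for a true equation, but not sufficient. The helper `passes_mod_filter` below is a hypothetical name used only for this sketch, and the tiny moduli are chosen to make a collision easy to see:

```python
def passes_mod_filter(A, x, B, y, C, z, m):
    "Does A**x + B**y agree with C**z modulo m? (Illustrative helper.)"
    return (pow(A, x, m) + pow(B, y, m)) % m == pow(C, z, m)

# A true equation passes for every modulus.
assert 3 ** 3 + 6 ** 3 == 3 ** 5
print(passes_mod_filter(3, 3, 6, 3, 3, 5, m=1000))

# A false positive: 2**3 + 3**3 = 35 and 5**3 = 125 both end in the digit 5,
# so they collide mod 10 even though the equation is false.
print(passes_mod_filter(2, 3, 3, 3, 5, 3, m=10), 2 ** 3 + 3 ** 3 == 5 ** 3)

# A larger modulus (or a second, different one) rejects this candidate.
print(passes_mod_filter(2, 3, 3, 3, 5, 3, m=2**31 - 1))
```

With a modulus as large as $2^{31} - 1$, collisions are much rarer than with 10, but they can still occur, which is why `beal_modm` recomputes `A ** x + B ** y` exactly before printing anything.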

Each entry in the `Apowers` table is now a list of `(A**x (mod m), x)` pairs. For example, $6^7 = 279,936$, so in our (mod 1000) table we have the pair `(936, 7)` under `6`.

In [18]:

```
Apowers = make_Apowers_modm(6, 10, 1000)
Apowers
```

Out[18]:

The `Czroots` table is of the form `{C**z (mod m): [(C, z), ...]}`. For example, `936: [(6, 7)]`.

In [19]:

```
make_Czroots_modm(Apowers)
```

Out[19]:

Let's run the program:

In [20]:

```
%time beal_modm(1000, 100)
```

We don't see a speedup here, but the idea is that as we start dealing with much larger integers, this version should be faster. I could improve this version by caching certain computations, managing the memory layout better, moving some computations out of loops, considering using multiple different numbers as the modulus (as in a Bloom filter), finding a way to parallelize the program, and re-coding in a faster compiled language (such as C++ or Go or Julia). Then I could invest thousands (or millions) of CPU hours searching for counterexamples.

But Witold Jarnicki and David Konerding already did that: they wrote a C++ program that, in parallel across thousands of machines, searched for $A, B$ up to 200,000 and $x, y$ up to 5,000, but found no counterexamples. So I don't think it is worthwhile to continue on that path.

This was fun, but I can't recommend anyone spend a serious amount of computer time looking for counterexamples to the Beal Conjecture—the money you would have to spend in computer time would be more than the expected value of your prize winnings. I suggest you work on a proof rather than a counterexample, or work on some other interesting problem instead!
