## Thursday, May 03, 2012 ... /////

### When experimental mathematics fails

An interesting web technicality: Google has localized this blog – motls.blogspot.cz and its cousins – for many more smaller countries, including Czechia, Slovakia, Denmark, and others.
Many of you have surely shared the following attitude of mine. Since I was a kid, I have had a pragmatic and "physical" attitude to the truth in mathematics, even though I may have been closer to mathematics than to physics at many points. Is such heuristic reasoning justifiable?

When I was a sophomore in Prague, many of us discussed a paper by Arthur Jaffe and Frank Quinn,
"Theoretical mathematics": Toward a cultural synthesis of mathematics and theoretical physics
A year later, Michael Atiyah and 14 other prominent characters from the mathematical physics and physical mathematics waters wrote a response. I had probably read the response more carefully than the original article. Today, the original article seems rather balanced to me on the question of whether mathematics and theoretical physics should mix.

In the early 1990s, I didn't know that the first author of the original paper would become my colleague at Harvard 20 years later – and I got to know Arthur very well. In fact, five years had to pass after Arthur ceased to be my colleague before I realized that he had co-authored the paper I had known so well since the early 1990s. ;-)

Nevertheless, the people who wrote the "response" were probably even more open to a free market of ideas in between mathematics and theoretical physics than the original authors and considered many of the cautionary tales against the cooperation of the fields to be demagogic because they were about bad research, not about the cooperation per se.

But let me get to the topic I want to discuss.

One of the manifestations of the pragmatic attitude I was referring to was the idea that certain claims – especially numerical claims involving real numbers – may be adopted as the truth if they're supported by a large enough body of evidence, typically a large number of digits in an identity, even if we don't know any full-fledged proofs of this identity.

In this text, I want to discuss remarkable situations in which this methodology may fail. Even though I wrote about similar issues a few times in the past, the topic was brought to my attention once again when I read John Baez's text about insanely large integers which may arise from easily formulated problems (and about "dark integers" which so far seem to be occultist nonsense to me).

At the beginning of his blog entry, he refers to some interesting non-truths summarized in an article by David Bailey and Jonathan Borwein. And I will get to that paper, especially its last section, momentarily.

Experimental proofs of mathematical identities

Consider a mathematical identity. It could be an identity relating complicated functions of real or complex variables and their sums and integrals but let me pick a simple representative, namely Machin's formula:$\pi = 16 \arctan (1/5) - 4 \arctan (1/239).$ Because $\arctan$ may be nicely Taylor-expanded, the formula is useful to calculate digits of $\pi$ using an expansion that converges reasonably quickly. 25 years ago, it took a week for my Commodore 64 and a machine code program to calculate 38,000 digits using this formula.
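If you want to replay a modest version of that Commodore computation, here is a sketch using Python's stdlib `decimal` module, with both arctangents summed from their Taylor series. The working precision and the cutoff are arbitrary choices of mine, not anything from the original machine-code program:

```python
from decimal import Decimal, getcontext

# Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239),
# with arctan(x) = x - x^3/3 + x^5/5 - ... summed in decimal arithmetic.
getcontext().prec = 60          # ~50 trustworthy digits plus guard digits

def arctan_inv(q):
    """arctan(1/q) for an integer q, summed until terms drop below 1e-58."""
    q2 = Decimal(q) ** 2
    x = 1 / Decimal(q)          # current odd power (1/q)^(2k+1)
    eps = Decimal(10) ** -58
    total = Decimal(0)
    k = 0
    while x > eps:
        term = x / (2 * k + 1)
        total += -term if k % 2 else term
        x /= q2                 # next odd power: divide by q^2
        k += 1
    return total

pi = 16 * arctan_inv(5) - 4 * arctan_inv(239)
print(str(pi)[:42])   # 3.1415926535897932384626433832795028841971
```

The series for $\arctan(1/239)$ converges much faster than the one for $\arctan(1/5)$ – each term shrinks by a factor of $239^2$ – which is exactly why Machin-type formulas are practical.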

If you don't believe it, switch your calculator to radians and evaluate the right-hand side: you will get $\pi$ as accurately as your calculator allows. But will all the digits agree? If you calculate 10 more digits of the right-hand side and see that they still agree with $\pi$ – and they do – your certainty that the identity is right will increase. If the agreement were a coincidence, there wouldn't be any reason for the extra digits to agree, either. The probability that the agreement in the next 10 digits is an accident is $10^{-10}$ or so. That's equivalent to some kind of six-sigma proof of an identity. The only sensible explanation of the match is that the identity is actually exact.

Do you get my point? We will look at examples where such "experimental mathematics" fails. The reasoning "the only sensible explanation is..." isn't rigorous and it will actually be untrue in our examples.

Proving Machin's formula

Before we get there, I just want to insert an intermezzo here. It's actually easy to prove Machin's formula if you know some complex numbers. You only need to know that$\arctan(a/b) = \arg (b+ia)$ where $\arg(r\cdot \exp (i\delta)) = \delta$ is the argument, or the angle, associated with a complex number in the complex plane, which we will take from the interval $(-\pi,+\pi]$. If you imagine the number $b+ia$ in the complex plane, the identity above is simply the definition of the tangent in an ordinary right triangle. We also need to know that$\arg(z_1 z_2) = \arg(z_1)+\arg(z_2)$ plus an integer multiple of $2\pi$ if there's a risk that we surpass $\pm\pi$. This equation simply comes from the fact that multiplication by a complex number is a combination of a scaling and a rotation of the complex plane, so the angles, i.e. the arguments, of the two complex numbers add up.

Let's return to Machin's formula and divide it by four. We want to prove $4 \arctan (1/5) - \arctan (1/239) = \frac{\pi}{4}.$ How do we prove that? Well, the left-hand side may be written as $\arg[(5+i)^4 (239-i)].$ That's great because we just need to compute the complex number which is the product of powers. Now, $(5+i)^2 = 5^2 + 10 i -1 = 24+10i = 2(12+5i)$ and $\eq{ (5+i)^4 &= [(5+i)^2]^2 = 2^2 (12+5i)^2 =\\ &= 4 (144+120i - 25) = 4(119+120i). }$ Now, multiply it by the last factor,$\eq{ (5+i)^4 (239-i) &= 4(119+120i)(239-i) = \\ &= 4(119\times 239+120)+\\ &+4(120\times 239i-119i). }$ What you may immediately see is that the real part and the imaginary part of the number coincide because $120+119=239$. So the complex number arising from the product has the form $N(1+i)$ where $N$ is a positive integer, so it's clear that the argument is $\pi/4$, which is what we wanted to prove.
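The whole proof boils down to one Gaussian-integer product, which can be replayed in a few lines of Python. Products of such small integers are exact in floating point, so the equality of the real and imaginary parts below is exact, not approximate:

```python
import cmath
import math

# arg[(5+i)^4 (239-i)] should be pi/4, i.e. the product should have
# equal positive real and imaginary parts.
z = 5 + 1j
z2 = z * z            # 24 + 10i, exactly
z4 = z2 * z2          # 476 + 480i = 4*(119 + 120i)
w = z4 * (239 - 1j)   # 114244 + 114244i = N*(1+i)

print(w)                             # (114244+114244j)
print(w.real == w.imag, w.real > 0)  # True True  ->  arg(w) = pi/4
print(cmath.phase(w))                # pi/4 up to machine precision

# Sanity check of Machin's formula itself in double precision:
print(16 * math.atan(1/5) - 4 * math.atan(1/239) - math.pi)  # ~0.0
```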

There are billions of known identities in mathematics and many of them may be proven in a similar indisputable way. But our discussion is about hypothetical identities for which we can't do that – or are too lazy to do so. Is the numerical evidence enough to make us feel certain? In most cases we know, it is. But there may exist examples in which it is not.

Playing with $j$-functions

Just a month ago, and a few times before that, I discussed a funny number$v=\sin (\pi\cdot \exp (\pi\cdot \sqrt{163}))$ You may compute it numerically. It seems like the sine of a random large number. There's no reason for the argument to be a multiple of $\pi/2$, i.e. no reason for the exponential to be an integer, so the sine should look like a random number between $-1$ and $+1$. When I tell you that $|v|$ is smaller than $10^{-11}$, you should be surprised. Why did such a simple expression produce such an unnaturally small number? You might then decide that since $v$ is so incredibly close to zero, it must be exactly zero. But you would be wrong: $|v|\approx 2.356 \times 10^{-12}$ is definitely not zero. However, the smallness of this number, i.e. the proximity of the inner exponential to an integer, may be demonstrated by a proof based on the $j$-function and its various expansions. The function is important e.g. for one-loop calculations in string theory.
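You can check both claims – the proximity to an integer and the nonzero value – with stdlib Python's `decimal`, whose `exp()` and `sqrt()` work at the context precision; $\pi$ is hard-coded to 50 digits. The nearest integer is $640320^3+744=262537412640768744$, the constant from the $j$-function expansion. Since that integer is even and $e^{\pi\sqrt{163}}$ falls a hair below it, $v=-\sin(\pi\delta)\approx -\pi\delta$, so the quoted $2.356\times 10^{-12}$ is the magnitude:

```python
from decimal import Decimal, getcontext

getcontext().prec = 50
# pi to 50 digits, hard-coded (plenty of guard digits for this check)
PI = Decimal("3.14159265358979323846264338327950288419716939937510")

x = (PI * Decimal(163).sqrt()).exp()   # e^(pi*sqrt(163))
N = 640320 ** 3 + 744                  # = 262537412640768744, from the j-function
delta = N - x                          # how far below the integer it falls

print(x)       # 262537412640768743.99999999999925007...
print(delta)   # ~7.499e-13

# v = sin(pi*x) = sin(pi*(N - delta)) = -sin(pi*delta) ~ -pi*delta for even N
v_magnitude = PI * delta
print(v_magnitude)   # ~2.356e-12, matching the value quoted above
```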

Examples from Bailey and Borwein

The bulk of the paper I have mentioned offers many examples of how computers may be useful in doing mathematics. But at the end, the authors mention a few more examples in which experimental mathematics fails. Consider the following integral:$\Pi = 8\int_0^\infty \dd x\,\cos(2x) \prod_{n=1}^\infty \cos(x/n).$ It's an integral of an infinite product. Use your best numerical software to compute the integral numerically. If you happen to calculate the first 40 digits, you will get$\eq{ \Pi&=3.14159\, 26535\, 89793\, 23846\,\\ &~\dots 26433\, 83279\, 50288\, 41971\dots }$ It's good to have memorized 100 digits of $\pi$: one doesn't need to look anything up when writing blog entries such as this one, haha. At any rate, forty digits agree with $\pi$. So you would surely conclude that $\Pi=\pi$, right? You would calculate the 41st digit as well and it would still work. The 42nd digit: works. Now the case is settled. It must be $\pi$. Burn the deniers.

Well, the reality is that the 43rd digit of $\Pi$ already disagrees with $\pi$. Much like in the numerological example involving $\sqrt{163}$, there exists an explanation of why $\Pi=\pi$ is so close to being a true statement while it's still false. One may think about a physical problem of random walks in which you "run out of fuel". That setup is actually enough to deduce the following identity:$\pi = 8\sum_{m=0}^\infty \int_0^\infty \dd x \,\cos [2(2m+1)x]\prod_{n=1}^\infty \cos(x/n).$ This identity is exact and provable. If you pick the $m=0$ term only, you get those 42 digits correctly. If you include the $m=0$ and $m=1$ terms, the accuracy becomes a whopping 500 digits. In such situations, you would be eager to claim that the numerical agreement has to be a proof that the identity $\Pi=\pi$ is exactly true. But you would be wrong. There exists a refinement of $\Pi$ that is actually equal to $\pi$, and the original expression is just an approximation to it – even though a remarkably accurate one!

Undermining the belief in the Riemann Hypothesis

I personally believe that the Riemann Hypothesis is true. The amount of circumstantial evidence is huge. Recall that it says that all $s$ for which $\zeta(s)=0$ obey $s(s-1)\in \RR$. So the zeroes are either real (the trivial zeroes) or they lie on the $1/2+it$ axis in the complex plane. The first trillions (?) of the nontrivial roots have been shown to satisfy this condition and there are other "hints" that the Riemann Hypothesis seems to be true. Most people, including me, expect that a proof of the Riemann Hypothesis will be found sometime in the future. But could there be, after some gazillion roots, a root of the zeta-function that doesn't lie on the axis?

Equivalently, the Riemann Hypothesis says something about the distribution of prime integers on the real axis, essentially that they're randomly distributed with the density such that the probability that $P$ is prime goes like $1/\ln(P)$ for a very large $P$. Could there be a conspiracy – some kind of "periodicity" – in the distribution of the primes that only manifests itself if you study really large primes?

It seems implausible, doesn't it? But consider the other example that Bailey and Borwein offer.

Define ${\rm sinc}(x)=\sin(x)/x$ and the function$S_p(x) = \prod_{m\in M(p)} {\rm sinc} (x/m)$ where $p$ is a prime and where $M(p) = \{1,3,5,7,11,13,17,19,\dots , p\}$ is the list of all primes up to $p$ with $1$ replacing $2$. Now, it turns out that the following identity seems to hold:$\int_{-\infty}^\infty \dd x\,S_p(x) = \sum_{n=-\infty}^\infty S_p(n).$ For a fixed prime $p$, the integral of $S_p(x)$ over the real axis will give you the same result as the values of the function summed over all integer values of the argument (some brutal "discretization" of the Riemann integral, if you wish). I've reformulated their statement so that it's prettier. Verify the identity for $p=29$ or any prime $p$ among the 10,000 smallest primes. For all of them, the identity above will hold absolutely exactly – as many digits as you want.
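To get a feeling for the identity, here is a stdlib-Python sketch of the smallest case, $p=3$, where $M(3)=\{1,3\}$. Both sides are approximated – the integral by a truncated Simpson rule, the sum by a truncated series – and both come out numerically indistinguishable; in this particular case both equal $\pi$. The truncation limits and step sizes are arbitrary choices of mine:

```python
import math

def sinc(x):
    return 1.0 if x == 0.0 else math.sin(x) / x

# S_3(x) = sinc(x) * sinc(x/3): the p = 3 case, M(3) = {1, 3}
def S(x):
    return sinc(x) * sinc(x / 3.0)

# Left side: integral over the real axis. The integrand is even, purely
# oscillatory times 1/x^2, so truncating Simpson's rule at x = 1000 is safe.
L, steps = 1000.0, 200000
h = L / steps
integral = S(0.0) + S(L)
for i in range(1, steps):
    integral += (4 if i % 2 else 2) * S(i * h)
integral *= h / 3
integral *= 2                  # add the mirror half x < 0

# Right side: sum over all integers (terms fall off like 1/n^2)
total = S(0.0) + 2 * sum(S(float(n)) for n in range(1, 200000))

print(integral, total)         # both very close to pi = 3.14159...
```

The agreement is no accident at small $p$: by Poisson summation, the sum equals the integral as long as the Fourier transform of $S_p$ fits inside one period, which is what eventually breaks for huge primes.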

This can't be a coincidence, can it? When you verify 10,000 such identities and all of them work totally exactly, you should be allowed to use the "mathematical induction" and claim that the identity works for all primes $p$, shouldn't you? Shockingly enough, you would be wrong again. Despite the perfect agreement for the 10,000 smallest primes, it can be proved that the identity above refuses to hold for all $p\gt p_0$ where $p_0$ is a particular finite prime integer.

How big is the disagreement?

It can be proved that while the disagreement for all the "larger primes" is nonzero, it is smaller than$\abs{\frac{L}{R}-1} \lt 10^{-10^{100}}$ i.e. one part in a googolplex. Just write down the ratio of the left-hand side and the right-hand side. It will be exactly one for more than the first 10,000 primes. However, for larger primes beyond some threshold, the ratio will differ from 1, but only at roughly the googolth decimal place. You would have to calculate one googol i.e. $10^{100}$ digits of the ratio – more digits than the number of electrons in the visible Universe – to explicitly see that the left-hand side and the right-hand side differ. Of course, you won't be able to do this by "brute numerical computational force". Nevertheless, mathematics has allowed people to prove that this "deviation from common sense expectations" is completely real, yet it is unnaturally tiny, too.

I have mentioned the Riemann Hypothesis because if the integral identity above could work for the first 10,000 primes and still fail for larger ones – with a relative error smaller than one part in a googolplex – it's also imaginable that the Riemann Hypothesis seems true for the first trillion roots but refuses to hold later. We know that the counterexamples, if they exist at all, are surely very far away – so they don't affect our life near the center of the complex plane, so to speak – but until someone finds a proof, we can't really be quite sure whether the counterexamples exist.

All the examples above may look contrived. The heuristic methods would work in your everyday life and even in the everyday life of most scientists. But we may still be entering a realm of mathematics and theoretical physics in which such "malicious" violations of apparent "theorems" and "truths extrapolated in seemingly straightforward yet illegitimate directions" could play a role. Perhaps the tiny cosmological constant arises from a similar tiny deviation from an idealized value, and people in the future will be able to construct a proof, similar to the proofs sketched above, showing that the vacuum energy is nonzero but really tiny.

Of course, the cosmological constant is a modest example here because it's rather close to one, relative to the numbers we have just discussed. After all, it is close to the inverse googol in Planck units. If Nature includes effects whose numerical size or probability is comparable to the inverse googolplex or even smaller, we will have no chance to find them experimentally. Only a perfectionist knowledge of the mathematics underlying the Universe could help us find such "maliciously tiny effects" if they existed.

Bonus: the crazily large numbers

John Baez mentions the incomprehensibly large numbers. If you think about a googol, $10^{100}$, or a googolplex, $10$ to the power of $10^{100}$, then you're being far too modest. Even if you generate ever larger numbers by iterating the exponentiation and repeating this "embiggenification" many times, you will still get numbers that are smaller than some numbers that may appear as solutions to rather simply defined problems such as this one:
What is the maximal length of a word that you may construct out of the letters A, B, C so that it satisfies the following condition? The condition is that no block of letters running from the $n$-th letter to the $(2n)$-th letter may be repeated as a subsequence of a later block running from the $m$-th letter to the $(2m)$-th letter, for any $m \gt n$.
The answer is finite but shocking.

We're dealing just with three letters so the answer can't be astronomical, you might say. However, it is actually much worse than an astronomical number.

Just to be sure, if we only had one letter A, the longest sequence would be AAA. The block from the 1st to the 2nd letter, AA, isn't repeated in a larger block $m$–$2m$ because there's no such larger block: the sequence has fewer than four letters. With four letters it wouldn't work anymore, because the block AA from the 1st to the 2nd letter would reappear as a subsequence of the block AAA from the 2nd to the 4th letter.

If we had two letters, the longest sequence would be ABBBAAAAAAA. AB isn't repeated later, BBB isn't repeated later, BBAA isn't repeated later. Two more claims of this type are right. You can't really make it any longer. If the first two letters were identical, you would be forced to make it shorter because AA or BB couldn't get repeated, and so on.
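The one- and two-letter claims above are small enough to check by brute force. Here is a sketch in stdlib Python, with the condition encoded as stated (blocks are 1-indexed, and "repeated" means "contained as a not-necessarily-contiguous subsequence"):

```python
from itertools import product

def is_subsequence(a, b):
    """True if string a occurs in b as a (not necessarily contiguous) subsequence."""
    it = iter(b)
    return all(ch in it for ch in a)

def valid(w):
    """The non-repetition condition: for all n < m with 2m <= len(w),
    the block w[n..2n] must not be a subsequence of w[m..2m] (1-indexed)."""
    length = len(w)
    for m in range(2, length // 2 + 1):
        for n in range(1, m):
            if is_subsequence(w[n - 1:2 * n], w[m - 1:2 * m]):
                return False
    return True

print(valid("AAA"))          # True  -> a one-letter word of length 3 works
print(valid("AAAA"))         # False -> length 4 already fails
print(valid("ABBBAAAAAAA"))  # True  -> a two-letter word of length 11 works

# Exhaustive check: no 12-letter word over {A, B} passes, so 11 is maximal
print(any(valid("".join(w)) for w in product("AB", repeat=12)))  # False
```

The same brute force is utterly hopeless for three letters, of course – that's the whole point of Friedman's lower bound.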

But what about the three letters A,B,C? In 1998, Harvey Friedman proved that the answer – the maximum length you may achieve – is (much) larger than $A_7(184)$, an Ackermann number.

When I write it in this way, it doesn't sound impressive if you don't know what it is. But it actually is scary. The number $184$ just means that you are starting from somewhat higher exponents. But the true parameter that makes $A_7(184)$ pathologically large is the subscript seven. It measures the true "complexity" of the insanely enormous integer.

If the subscript is one, the number is just doubled:$A_1(n) = 2n.$ That's easy. If the subscript is two, we get a power of two:$A_2(n) = 2^n.$ What if you want a larger value of the subscript, such as three? You use this recursive rule:$A_{k+1}(n) = A_k( A_k ( A_k ( \cdots A_k (1) \cdots )))$ where the number of $A_k$ functions embedded into each other is equal to $n$.

This rule makes the numbers increase with $k$ in a much more dramatic way than other rules you could think of. It isn't just making the tower of exponents embedded into each other longer; it's automatically making the rules by which you're "embiggenifying" the numbers more complex. Already $A_4(4)$ is equal to${\Huge A_4(4) = 2^{2^{2^{2^{2^{2^{2^{2^\iddots}}}}}}} }$ where the tower must actually contain 65,536 copies of two. I used larger fonts because the digits are getting smaller in the exponent (oops, the reduction of the font size actually seems to stop beyond the third level, so the large font was unnecessary).

And you may imagine the formula only because the subscript is only four here. What if the subscript is seven? You can't possibly imagine any similar "design of an explicit formula" that captures the greatness of such numbers. And $A_7(184)$ is still just the lower bound on the maximum number of letters in a word composed out of a simple A,B,C alphabet that obeys a pretty simply stated non-repetition condition!
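For the tiny values where the recursion can actually be evaluated, the three rules above translate directly into code. This is only a sketch of the definition – anything from $A_4(4)$ upward is hopeless to compute literally:

```python
# Friedman-style Ackermann hierarchy used in the text:
#   A_1(n) = 2n,  A_2(n) = 2^n,
#   A_{k+1}(n) = A_k applied n times starting from 1.
def A(k, n):
    if k == 1:
        return 2 * n
    if k == 2:
        return 2 ** n
    r = 1
    for _ in range(n):
        r = A(k - 1, r)
    return r

print(A(3, 1), A(3, 2), A(3, 3), A(3, 4))  # 2 4 16 65536: A_3(n) is a tower of n twos
print(A(4, 2), A(4, 3))                    # 4 65536
# A(4, 4) = A_3(65536) = a tower of 65536 twos -- do not even try,
# and A_7(184) is incomparably beyond that.
```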

I should have told you immediately: Friedman actually proved a much stronger lower bound on the length of the maximal word,$n(3)\gt A_{7198} (158386).$ That's quite something. The subscript could be raised from seven to more than seven thousand. To calm you down a little bit, it's also been proved that $n(3)$ is smaller than $A_{A_5(5)}(A_5(5))$. If you have avoided a heart attack so far, give it one more chance.

If you considered four letters, the length of the maximal word would be qualitatively more enormous still. You would encounter the Ackermann function embedded into itself an Ackermann number of times. I have personally been overwhelmed by those numbers, so I no longer "feel" that this is worse than what we had above (with three letters), but it actually is much worse than the Ackermann function with any "ordinary" integer arguments – which was already inhuman by itself. Feel free to consider a five-letter alphabet, a six-letter alphabet, or an alphabet with as many letters as the length of the maximal four-letter-alphabet word. Those numbers get "qualitatively worse" at each step and you must feel resigned, too. Do we really care about the differences between these effectively infinite numbers?

John Baez's article presents some other examples of large numbers but they're totally "ordinary and small" relative to the example above. For example, the number of primes less than $x$ may be approximated by $N(x) = \int_0^x \frac{\dd t}{\ln t}$ because the probability that a number near $t$ is prime is approximately $1/\ln t$. However, it's interesting to ask whether the actual number of primes is higher or lower than the integral above. It turns out that for reasonably low – or even imaginably high – $x$, the integral above actually overstates the number of primes.
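As a sanity check of the overstating (not, of course, of the huge thresholds themselves), here is a stdlib-Python sketch comparing the exact prime count below $10^6$ with the logarithmic integral. The substitution $t=e^u$ tames the integrand, and starting the integral at $t=2$ rather than $0$ only shifts it by roughly $\mathrm{li}(2)\approx 1.05$, which doesn't affect the comparison:

```python
import math

X = 10**6

# Exact prime count pi(X) via a sieve of Eratosthenes
sieve = bytearray([1]) * X
sieve[0] = sieve[1] = 0
for i in range(2, int(X ** 0.5) + 1):
    if sieve[i]:
        sieve[i * i::i] = bytearray(len(range(i * i, X, i)))
count = sum(sieve)

# Logarithmic integral of dt/ln(t) from 2 to X; with t = e^u the
# integrand becomes e^u/u, smooth enough for a coarse Simpson rule.
a, b, steps = math.log(2), math.log(X), 10000
h = (b - a) / steps
def g(u):
    return math.exp(u) / u
approx = g(a) + g(b)
for i in range(1, steps):
    approx += (4 if i % 2 else 2) * g(a + i * h)
approx *= h / 3

print(count)           # 78498
print(approx)          # ~78626.5: the integral overstates the prime count
print(approx > count)  # True
```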

What is the smallest $x$ for which the inequality gets reversed and the number of primes is actually larger than the integral? In 1955, Skewes proved that the number was smaller than ${\Huge 10^{10^{10^{963}}} }$ But this is an upper bound and later research actually showed that it may be significantly reduced. Even if the number were this high, it's just a number smaller than $A_3(m)$ for a reasonable $m$. Nothing to be compared with $A_7(184)$.

I am surely not the only one who thinks that these mathematical curiosities involving insanely huge numbers can't have any implications for the physical world because the physical world is all about particles whose number etc. is vastly smaller than the numbers described above. But I don't have a full proof for my belief. We – or people in the far future – could be shocked and learn that some of these nearly infinite yet finite numbers actually impact physics.