(a+b)+c≠a+(b+c)
Computers do arithmetic to a finite number of significant figures, and this can mean that the answers are quite different to the rules of algebra in which it is assumed that all results are calculated exactly.
One well-known example involves the sum of the harmonic
series
1+1/2+1/3+1/4+1/5...
The problem manifests itself much more rapidly in single precision arithmetic, so this example is in C (or C++) in which one can specify single precision.
#include<stdio.h> int main(){ int i; float sum; sum=0.0; for(i=1;i<=3000000;i++){ sum+=1.0/i; if (i%100000==0) printf("%10d: %f\n",i,sum); } return 0; }
The above code accumulates in the single precision
variable sum
the sum of the harmonic series for its
first three million terms. Every one hundred thousand terms it
prints the current number of terms and sum.
To run it, put the above code into a file whose name must
end .c
, e.g. sum.c
, and then type:
$ gcc sum.c $ ./a.out
The interesting part of its output reads:
1900000: 15.215665 2000000: 15.311032 2100000: 15.403683 2200000: 15.403683 2300000: 15.403683
The sum "converges" after about two million terms to 15.403683.
Why is this? With numbers stored to a finite precision, for
sufficiently small ε the numbers 15.4 and 15.4+ε are
identical. It seems that this is true for ε smaller than
about 0.000 000 5, so 15.4 and 15.400 000 5 are
indistinguishable. Computers actually calculate in base-2, not
base-10, so 15 would be represented as 11112. Single
precision has effectively 24 binary digits of precision, so the next
number larger than
1111.0000 0000 0000 0000 00002 is
1111.0000 0000 0000 0000 00012.
That last number would be interpreted as
1x23+1x22+1x21+1x20+1x2-20
or 15+1/220. As 220 is just over a million
(1,048,576), it might be expected that the sum would stop growing
after a million terms. This is not quite so. Given that the computer
can store, in single precision, 15 and 15+2-20, attempts
to add numbers between 2-20 and 2-21 to 15
will all give the answer 15+2-20 as this is nearer to the
correct answer than the other possibility of simply 15. So the sum
stops growing after 221 terms, or 2,097,152.
In this case moving to double precision "solves" the issue, for double precision actually has slightly more than twice the precision of single precision, at 53 binary digits. The sum will exceed 32, so there will be six binary digits before the point, but that still leaves 47 after it, so the sum will grow until 248 terms are added. This would take many days to achieve.
Better Sums
Rather than summing the harmonic series forwards, one could sum it backwards. This helps quite a lot, as each term is then added to the sum of all terms smaller than it, rather than the sum of all terms larger than it. Or would could simply move to double precision. The following code performs the sums using all three techniques: single precision forwards, single precision backwards, and double precision forwards.
#include<stdio.h> int main(){ int i,j; float forward,backward; double sum; for(i=32;i<40000000;i*=2){ forward=0; for(j=1;j<=i;j++) forward+=1.0/j; backward=0; for(j=i;j>0;j--) backward+=1.0/j; sum=0; for(j=1;j<=i;j++) sum+=1.0/j; printf("%10d: %f %f %lf\n",i,forward,backward,sum); } return 0; }
It can be compiled and run in the same way as the first example.
Its output will be something like:
32: 4.058496 4.058496 4.058495 64: 4.743892 4.743891 4.743891 128: 5.433147 5.433146 5.433147 256: 6.124346 6.124345 6.124345 512: 6.816517 6.816516 6.816517 1024: 7.509183 7.509176 7.509176 2048: 8.202082 8.202080 8.202079 4096: 8.895108 8.895104 8.895104 8192: 9.588196 9.588188 9.588190 16384: 10.281306 10.281313 10.281307 32768: 10.974409 10.974443 10.974439 65536: 11.667428 11.667588 11.667578 131072: 12.360085 12.360732 12.360722 262144: 13.051304 13.053881 13.053867 524288: 13.737018 13.747056 13.747013 1048576: 14.403684 14.440231 14.440160 2097152: 15.403683 15.132899 15.133307 4194304: 15.403683 15.829607 15.826454 8388608: 15.403683 16.514153 16.519601 16777216: 15.403683 17.232708 17.212748 33554432: 15.403683 17.827648 17.905895
It is often said that single precision numbers have about seven decimal digits of precision. This may be so, but the final answer after millions of calculations might not. Here the final two single precision answers of what is the same sum in algebraic terms do not even agree to two significant figures.
There are other things that can be understood from the above table. Our theory suggests that, for the forward single precision sum, every term between 220+1 and 221 will be added as though it were 2-20. This is adding 2-20 a total of 220 times, so should add one to the total, a number which is larger than the correct sum for this range of just under 0.7. Between 1048576 (220) and 2097152 (221) the total for the forward single precision sum does indeed grow by one, putting it ahead of the better methods. Then it stops growing entirely, so the other methods overtake it.