Division vs Multiplication
It turns out that it is relatively easy to produce hardware which will add two floating-point numbers. So most CPUs contain a dedicated floating-point addition "functional unit" which does just that, typically taking three to five clock-cycles to produce an answer, and being able to accept a new sum on every clock cycle.
Any addition unit which can handle signed numbers can trivially support subtraction too.
Multiplication is harder (it needs more transistors), but it is a common operation, and again most CPUs support it, and can perform multiplications in a similar time to additions.
Division is much harder, and much rarer in code. It would be a pity to dedicate a large area of a CPU to an operation which is rarely used. But it would also be odd not to support division at all.
Of course division can be replaced by multiplication by the reciprocal of the divisor, but that raises the question of how one finds the reciprocal.
Consider the series

  x_{n+1} = x_n * (2 - b*x_n)

For a good starting guess x_0, this will converge rapidly to the reciprocal of b. With b=6:
  n    xn
  0    0.2
  1    0.15999999999999998
  2    0.1664
  3    0.16666624
  4    0.1666666666655744
  5    0.16666666666666666
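The behaviour above can be reproduced with a few lines of Python (the loop count of six is simply enough for double precision here):

```python
# Reciprocal of b by the iteration x -> x*(2 - b*x): the error squares on
# each step, so the number of correct digits roughly doubles per iteration.
b = 6.0
x = 0.2  # a good starting guess; the true answer is 1/6 ~ 0.1667
for n in range(6):
    x = x * (2 - b * x)
print(x)
```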
But for a poor starting guess, it is useless. Again with b=6:
  n    xn
  0    0.4
  1    -0.16000000000000014
  2    -0.4736000000000006
  3    -2.2929817600000044
  4    -36.13255563015632
  5    -7905.634569458359
But this can be made to work for any number quite easily.
First write the number in the form m*10^p with 1<=m<10. Then choose x0 based on the following table:

  m             x0
  1 <= m < 2    0.67
  2 <= m < 4    0.33
  4 <= m < 8    0.17
  8 <= m < 10   0.11
In code this might look like the following:
```python
#!/usr/bin/python3
from math import *

b=float(input("Please input b: "))

p=floor(log10(b))
m=b/pow(10,p)
print("b=",m,"x10^",p,"\n")

if (m<2):
    x=0.67
elif (m<4):
    x=0.33
elif (m<8):
    x=0.17
else:
    x=0.11

for i in range(0,11):
    print(i,x)
    x=x*(2-m*x)

x=x*pow(10,-p)
print("\nFinal answer:",x)
print("b*x=",b*x)
```
Does the above series actually work for all inputs?
If so, what is the minimum number of iterations it should do to converge all the reciprocals to 15 decimal digits?
One could describe this algorithm as starting from a look-up table of length four. Is this actually the best choice of look-up table of that length? If the look-up table were longer, could the minimum number of iterations needed to converge any reciprocal be reduced?
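These questions can be explored empirically. The sketch below scans m across [1, 10) on a fine grid and records the worst-case iteration count for the length-four table; the grid spacing and the convergence tolerance are arbitrary choices of this sketch, not part of the algorithm itself:

```python
def guess(m):
    # The look-up table of length four from above.
    if m < 2:
        return 0.67
    elif m < 4:
        return 0.33
    elif m < 8:
        return 0.17
    else:
        return 0.11

def iters_needed(m):
    # Count iterations until m*x agrees with 1 to about 15 decimal digits.
    x = guess(m)
    for n in range(1, 50):
        x = x * (2 - m * x)
        if abs(m * x - 1) < 1e-15:
            return n
    return 50  # did not converge within the limit

# Scan m on a grid of step 0.001 across [1, 10) and record the worst case.
worst = max(iters_needed(1 + i / 1000) for i in range(9000))
print("worst case over the scan:", worst, "iterations")
```

The same scan with a candidate replacement table shows immediately whether a different choice of four entries, or a longer table, reduces the worst case.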
Computers work in base 2, not base 10. This enables the approach to work with no look-up table, as the possible range for m is much reduced. But in practice a look-up table would be used in order to reduce the number of iterations required to converge the answer.
```python
#!/usr/bin/python3
from math import *

b=float(input("Please input b: "))

p=floor(log2(b))
m=b/pow(2,p)
print("b=",m,"x2^",p,"\n")

x=0.7

for i in range(0,11):
    print(i,x)
    x=x*(2-m*x)

x=x*pow(2,-p)
print("\nFinal answer:",x)
print("b*x=",b*x)
```
The series for b=2 can be written as

  x_{n+1} = 2 * x_n * (1 - x_n)

which is the logistic map x_{n+1} = λ * x_n * (1 - x_n) with λ=2. So for this one value of b the series is also part of the logistic map.
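This connection is easy to check numerically: for λ=2 the logistic map settles on its stable fixed point 0.5, which is precisely the reciprocal of 2.

```python
# The reciprocal iteration for b = 2 is the logistic map x -> 2*x*(1 - x),
# whose stable fixed point is 0.5, i.e. the reciprocal of 2.
x = 0.3  # any starting guess in (0, 1) works for lambda = 2
for n in range(10):
    x = 2 * x * (1 - x)
print(x)
```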
These series are rather different to the usual polynomial approximations to sin, exp or tan. With those polynomial series, extra accuracy requires the knowledge of more polynomial coefficients, and all calculations have to be done with at least the final precision, as any errors made in calculating the first terms will not be corrected later. With the series given here, one can choose to do the first couple of iterations rather approximately, and the series will still converge correctly once the precision is increased for later iterations. A CPU may choose to use a look-up table, followed by a couple of low-precision iterations, followed by some high-precision iterations.
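The self-correcting property can be sketched as follows. Here the choice of b=7, the crude rounding to four decimal places, and the iteration counts are all illustrative assumptions, simulating "low precision" with Python's round():

```python
# Sketch of the self-correcting property: round the first few iterations
# to only four decimal places, then continue in full double precision.
# The deliberately-introduced rounding errors do not spoil the final answer.
b = 7.0        # illustrative choice of divisor
x = 0.1        # rough guess for 1/7
for n in range(3):
    x = round(x * (2 - b * x), 4)   # "low-precision" iterations
for n in range(4):
    x = x * (2 - b * x)             # full-precision iterations
print(x, "vs", 1 / b)
```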
Note for pedants: the above is not the only technique used for finding reciprocals, but it shows that division is possible, and not too slow, when one has access to just a multiply unit and an addition unit.
A Final Series
There is one other, similar, series used by some CPUs:

  x_{n+1} = x_n * (3 - b*x_n^2) / 2

Again it converges only for good starting guesses.
Setting x0 to be 0.4 and calculating with b=2, 4 and 9, can you work out to what function it is converging? In this case both x∞ and b*x∞ are useful functions.
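A small harness for this experiment, assuming the recurrence is x_{n+1} = x_n * (3 - b*x_n^2) / 2; the output is left for the reader to interpret:

```python
# Experiment harness for the final series, assuming the recurrence
# x -> x*(3 - b*x*x)/2. Try b = 2, 4 and 9 from x0 = 0.4 and compare
# the limit, and b times the limit, against functions you know.
def iterate(b, x=0.4, steps=8):
    for n in range(steps):
        x = x * (3 - b * x * x) / 2
    return x

for b in (2, 4, 9):
    print(b, iterate(b), b * iterate(b))
```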