The following Python 3.x integer multiplication takes on average between 1.66s and 1.77s:
import time
start_time = time.time()
num = 0
for x in range(0, 10000000):
# num += 2 * (x * x)
num += 2 * x * x
print("--- %s seconds ---" % (time.time() - start_time))
if I replace 2 * x * x
with 2 *(x * x)
, it takes between 2.04
and 2.25
. How come?
On the other hand it is the opposite in Java: 2 * (x * x)
is faster in Java. Java test link: Why is 2 * (i * i) faster than 2 * i * i in Java?
I ran each version of the program 10 times, here are the results.
2 * x * x | 2 * (x * x)
---------------------------------------
1.7717654705047607 | 2.0789272785186768
1.735931396484375 | 2.1166207790374756
1.7093875408172607 | 2.024367570877075
1.7004504203796387 | 2.047525405883789
1.6676218509674072 | 2.254328966140747
1.699510097503662 | 2.0949244499206543
1.6889283657073975 | 2.0841963291168213
1.7243537902832031 | 2.1290600299835205
1.712965488433838 | 2.1942825317382812
1.7622807025909424 | 2.1200053691864014
Answer
First of all, note that we don't see the same thing in Python 2.x:
>>> timeit("for i in range(1000): 2*i*i")
51.00784397125244
>>> timeit("for i in range(1000): 2*(i*i)")
50.48330092430115
So this leads us to believe that this is due to how integers changed in Python 3: specifically, Python 3 uses long
(arbitrarily large integers) everywhere.
For small enough integers (including the ones we're considering here), CPython actually just uses the O(MN) grade-school digit by digit multiplication algorithm (for larger integers it switches to the Karatsuba algorithm). You can see this yourself in the source.
The number of digits in x*x
is roughly twice that of 2*x
or x
(since log(x2) = 2 log(x)). Note that a "digit" in this context is not a base-10 digit, but a 30-bit value (which are treated as single digits in CPython's implementation). Hence, 2
is a single-digit value, and x
and 2*x
are single-digit values for all iterations of the loop, but x*x
is two-digit for x >= 2**15
. Hence, for x >= 2**15
, 2*x*x
only requires single-by-single-digit multiplications whereas 2*(x*x)
requires a single-by-single and a single-by-double-digit multiplication (since x*x
has 2 30-bit digits).
Here's a direct way to see this (Python 3):
>>> timeit("a*b", "a,b = 2, 123456**2", number=100000000)
5.796971936999967
>>> timeit("a*b", "a,b = 2*123456, 123456", number=100000000)
4.3559221399999615
Again, compare this to Python 2, which doesn't use arbitrary-length integers everywhere:
>>> timeit("a*b", "a,b = 2, 123456**2", number=100000000)
3.0912468433380127
>>> timeit("a*b", "a,b = 2*123456, 123456", number=100000000)
3.1120400428771973
(One interesting note: If you look at the source, you'll see that the algorithm actually has a special case for squaring numbers (which we're doing here), but even still this is not enough to overcome the fact that 2*(x*x)
just requires processing more digits.)
No comments:
Post a Comment