python - Why is [] faster than list()?

Tuesday 26 December 2017

python - Why is [] faster than list()?

itemprop="text">

I recently compared the processing
speeds of [] and list() and was
surprised to discover that [] runs more than three
times faster than list(). I ran the same test with
{} and dict() and the results were
practically identical: [] and {} both
took around 0.128sec / million cycles, while list() and
dict() took roughly 0.428sec / million cycles
each.

Why is this? Do
[] and {} (and probably
() and '', too) immediately pass back
a copies of some empty stock literal while their explicitly-named counterparts
(list(), dict(),
tuple(), str()) fully go about
creating an object, whether or not they actually have
elements?

I have no idea how these two methods
differ but I'd love to find out.
I couldn't find an answer in the docs or on
SO, and searching for empty brackets turned out to be more problematic than I'd
expected.

I got my timing results by
calling timeit.timeit("[]") and
timeit.timeit("list()"), and
timeit.timeit("{}") and
timeit.timeit("dict()"), to compare lists and dictionaries,
respectively. I'm running Python 2.7.9.

I
recently discovered " href="https://stackoverflow.com/questions/18123965/why-if-true-is-slower-than-if-1">Why
is if True slower than if 1?" that compares the performance of if True to if 1 and seems to touch on a similar
literal-versus-global scenario; perhaps it's worth considering as
well.

Answer

Because [] and
{} are literal syntax. Python can create
bytecode just to create the list or dictionary
objects:

>>> import
            dis
>>> dis.dis(compile('[]', '', 'eval'))
 1 0 BUILD_LIST
            0
 3 RETURN_VALUE 

>>> dis.dis(compile('{}', '',
            'eval'))
 1 0 BUILD_MAP 0
 3 RETURN_VALUE

list()
and dict() are separate objects. Their names need to be
resolved, the stack has to be involved to push the arguments, the frame has to be stored
to retrieve later, and a call has to be made. That all takes more
time.

For the empty case, that means you have at
the very least a href="https://docs.python.org/2/library/dis.html#opcode-LOAD_NAME"
rel="noreferrer">LOAD_NAME (which has to search
through the global namespace as well as the href="https://docs.python.org/2/library/__builtin__.html"
rel="noreferrer">__builtin__ module) followed by a
rel="noreferrer">CALL_FUNCTION, which has to
preserve the current
frame:

>>>
            dis.dis(compile('list()', '', 'eval'))

 1 0 LOAD_NAME 0
            (list)
 3 CALL_FUNCTION 0
 6 RETURN_VALUE 
>>>
            dis.dis(compile('dict()', '', 'eval'))
 1 0 LOAD_NAME 0 (dict)
 3
            CALL_FUNCTION 0
 6 RETURN_VALUE

You can time the name
lookup separately with
timeit:

>>>
            import timeit
>>> timeit.timeit('list',
            number=10**7)
0.30749011039733887
>>> timeit.timeit('dict',
            number=10**7)
0.4215109348297119

The
time discrepancy there is probably a dictionary hash collision. Subtract those times
from the times for calling those objects, and compare the result against the times for
using
literals:

>>>
            timeit.timeit('[]', number=10**7)
0.30478692054748535
>>>
            timeit.timeit('{}', number=10**7)
0.31482696533203125
>>>
            timeit.timeit('list()',
            number=10**7)
0.9991960525512695
>>>
            timeit.timeit('dict()',
            number=10**7)
1.0200958251953125

So
having to call the object takes an additional 1.00 - 0.31 - 0.30 == 0.39 seconds per 10 million
calls.

You can avoid the global lookup cost by
aliasing the global names as locals (using a timeit setup,
everything you bind to a name is a
local):

>>>
            timeit.timeit('_list', '_list = list',
            number=10**7)
0.1866450309753418
>>> timeit.timeit('_dict',
            '_dict = dict', number=10**7)
0.19016098976135254
>>>
            timeit.timeit('_list()', '_list = list',
            number=10**7)
0.841480016708374

>>>
            timeit.timeit('_dict()', '_dict = dict',
            number=10**7)
0.7233691215515137

but
you never can overcome that CALL_FUNCTION
cost.

Blog

Tuesday 26 December 2017

python - Why is [] faster than list()?

No comments:

Post a Comment

php - file_get_contents shows unexpected output while reading a file