Deepcopying a byte string is quicker than copying it - problem?

Ian Kelly ian.g.kelly at gmail.com
Thu Feb 27 03:02:01 EST 2014


On Wed, Feb 26, 2014 at 11:30 PM, Frank Millman <frank at chagford.com> wrote:
> Hi all
>
> I noticed this a little while ago, but dismissed it as a curiosity. On
> reflection, I decided to mention it here in case it indicates a problem.
>
> This is with python 3.3.2.
>
> C:\>python -m timeit -s "import copy" "copy.copy('a'*1000)"
> 100000 loops, best of 3: 6.91 usec per loop
>
> C:\>python -m timeit -s "import copy" "copy.deepcopy('a'*1000)"
> 100000 loops, best of 3: 11.8 usec per loop
>
> C:\>python -m timeit -s "import copy" "copy.copy(b'a'*1000)"
> 10000 loops, best of 3: 79.9 usec per loop
>
> C:\>python -m timeit -s "import copy" "copy.deepcopy(b'a'*1000)"
> 100000 loops, best of 3: 11.7 usec per loop
>
> As you can see, deepcopying a string is slightly slower than copying it.
>
> However, deepcopying a byte string is orders of magnitude quicker than
> copying it.
>
> Actually, looking closer, it is the 'copy' that is slow, not the 'deepcopy'
> that is quick.
>
> Expected, or odd?

This will shed some light:

>>> import copy
>>> a = 'a' * 1000
>>> b = copy.copy(a)
>>> a is b
True
>>> c = copy.deepcopy(a)
>>> a is c
True
>>> d = b'a' * 1000
>>> e = copy.copy(d)
>>> d is e
False
>>> f = copy.deepcopy(d)
>>> d is f
True


For some reason, calling copy.copy on a bytes object actually copies
the bytes, rather than just returning the immutable object passed in.
That's probably not intentional and is worth opening a ticket for.
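
To see whether other built-in immutable types behave the same way, the
identity check above can be run over several of them at once. This is
only a rough sketch: the helper name copy_identity and the choice of
sample values are made up for illustration, and the results depend on
the Python version, so I'm not quoting any output for bytes.

```python
import copy

def copy_identity(obj):
    """Return (copy.copy(obj) is obj, copy.deepcopy(obj) is obj)."""
    return (copy.copy(obj) is obj, copy.deepcopy(obj) is obj)

# Illustrative sample values; any immutable objects would do.
samples = {
    "str": "a" * 1000,
    "bytes": b"a" * 1000,
    "tuple": (1, 2, 3),
    "frozenset": frozenset([1, 2, 3]),
    "int": 10 ** 20,
}

results = {name: copy_identity(value) for name, value in samples.items()}
for name in sorted(results):
    shallow_same, deep_same = results[name]
    print("%-10s copy is obj: %-5s  deepcopy is obj: %s"
          % (name, shallow_same, deep_same))
```

Any type that reports False in the first column is being physically
copied by copy.copy, as bytes is in the session above.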

The difference between copy and deepcopy for str is probably just due
to the extra overhead of deepcopy, since both return the same object.
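
Note that the timeit commands quoted above also build the 1000-character
object on every loop. A minimal sketch that moves the construction into
the setup, so that only the copy call itself is timed, might look like
this. The names SETUP, statements and timings are arbitrary, and the
numbers will vary by machine and Python version, so none are shown.

```python
import timeit

# Build the test objects once, outside the timed statement.
SETUP = "import copy; s = 'a' * 1000; b = b'a' * 1000"

statements = [
    "copy.copy(s)",
    "copy.deepcopy(s)",
    "copy.copy(b)",
    "copy.deepcopy(b)",
]

timings = {}
for stmt in statements:
    # Best of 3 runs of 10000 calls each, converted to usec per call.
    best = min(timeit.repeat(stmt, setup=SETUP, repeat=3, number=10000))
    timings[stmt] = best / 10000 * 1e6
    print("%-20s %8.3f usec per call" % (stmt, timings[stmt]))
```

If copy.copy(b) still stands out from the other three here, the cost is
in the copy call and not in creating the bytes object.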


