Deep vs. shallow copy?

Alex van der Spek zdoor at xs4all.nl
Wed Mar 12 11:29:59 EDT 2014


On Wed, 12 Mar 2014 10:00:09 -0500, Zachary Ware wrote:

> On Wed, Mar 12, 2014 at 9:25 AM, Alex van der Spek <zdoor at xs4all.nl>
> wrote:
>> I think I understand the difference between deep vs. shallow copies but
>> I was bitten by this:
>>
>> with open(os.path.join('path', 'foo.txt', 'rb') as txt:
>>      reader = csv.reader(txt)
>>      data = [row.append(year) for row in reader]
>>
>> This does not work although the append does complete. The below works:
>>
>> with open(os.path.join('path', 'foo.txt', 'rb') as txt:
>>      reader = csv.reader(txt)
>>      data = [row + [year] for row in reader]
>>
>> However in this context I am baffled. If someone can explain what is
>> going on here, I would be most grateful.
> 
> Deep/shallow copying doesn't really come into this.  row.append()
> mutates the list (row), it doesn't return a new list.  Like most
> in-place/mutating methods in Python, it returns None instead of self to
> show that mutation was done, so your listcomp fills `data` with Nones;
> there is no copying done at all.  The second example works as you
> expected because `row + [year]` results in a new list, which the
> listcomp is happy to append to `data`--which does mean that `row` is
> copied.
> 
> To avoid the copy that the second listcomp is doing (which really
> shouldn't be necessary anyway, unless your rows are astronomically
> huge), you have a couple of options.  First, you can expand your
> listcomp and use append:
> 
>    with open(os.path.join('path', 'foo.txt'), 'rb') as txt:  # with your typo fixed ;)
>        reader = csv.reader(txt)
>        data = []
>        for row in reader:
>            row.append(year)
>            data.append(row)
> 
> To me, that's pretty readable and pretty clear about what it's doing.
> Then there's this option, which I don't recommend:
> 
>    import operator
>    with open(os.path.join('path', 'foo.txt'), 'rb') as txt:
>        reader = csv.reader(txt)
>        data = [operator.iadd(row, [year]) for row in reader]
> 
> This works because operator.iadd is basically shorthand for
> row.__iadd__([year]), which does return self (otherwise, the assignment
> part of `row += [year]` couldn't work).  But, it's not as clear about
> what's happening, and only saves a whole two lines (maybe 3 if you
> already have operator imported).
> 
> Hope this helps,

Thank you, that helped immensely! 

Having been taught programming in Algol60, Python still defeats me at times!
Particularly since Algol60 wasn't very long-lived and what came
thereafter (FORTRAN) was much worse.

I get it now: the session below shows the same behaviour!

I am perfectly happy with the one copy of the list that row + [year] makes.
Just wanted to learn something here and I have!


Python 2.6.5 (r265:79063, Feb 27 2014, 19:44:14) 
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = [1,2,3]
>>> b = 'val'
>>> a.append(b)
>>> a
[1, 2, 3, 'val']
>>> c = a.append(b)
>>> print c
None
>>> 
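
For anyone reading this in the archive, the three variants from this thread can be put side by side. This is only a minimal sketch, assuming Python 3 (the session above is Python 2.6), an in-memory string standing in for foo.txt, and a made-up value for year; SAMPLE and read_rows are hypothetical names used for illustration.

```python
import csv
import io
import operator

# Hypothetical stand-in for foo.txt, so the sketch needs no file on disk.
SAMPLE = "a,b\nc,d\n"
year = "2014"  # assumed value; the original post never shows it


def read_rows():
    """Return the parsed rows as a fresh list of lists."""
    return list(csv.reader(io.StringIO(SAMPLE)))


# 1. append() mutates each row in place and returns None,
#    so the comprehension collects those return values.
rows = read_rows()
appended = [row.append(year) for row in rows]

# 2. row + [year] builds a new list per row; the originals are untouched.
originals = read_rows()
added = [row + [year] for row in originals]

# 3. operator.iadd(row, [year]) extends row in place and returns row itself.
sources = read_rows()
extended = [operator.iadd(row, [year]) for row in sources]

print(appended)                    # [None, None]
print(rows)                        # [['a', 'b', '2014'], ['c', 'd', '2014']]
print(added)                       # [['a', 'b', '2014'], ['c', 'd', '2014']]
print(originals)                   # [['a', 'b'], ['c', 'd']]
print(extended[0] is sources[0])   # True
```

Note that in the first variant the rows themselves do get the extra element; it is only the list built by the comprehension that ends up full of None.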