Efficiently Split A List of Tuples
Ron Adam
rrr at ronadam.com
Mon Jul 18 02:37:36 EDT 2005
Raymond Hettinger wrote:
>>Variant of Paul's example:
>>
>>a = ((1,2), (3, 4), (5, 6), (7, 8), (9, 10))
>>zip(*a)
>>
>>or
>>
>>[list(t) for t in zip(*a)] if you need lists instead of tuples.
>
>
>
> [Peter Hansen]
>
>>(I believe this is something Guido considers an "abuse of *args", but I
>>just consider it an elegant use of zip() considering how the language
>>defines *args. YMMV)
>
>
> It is somewhat elegant in terms of expressiveness; however, it is also
> a bit disconcerting in light of the underlying implementation.
>
> All of the tuples are loaded one-by-one onto the argument stack. For a
> few elements, this is no big deal. For large datasets, it is a less
> than ideal way of transposing data.
>
> Guido's reaction makes sense when you consider that most programmers
> would cringe at a function definition with thousands of parameters.
> There is a sense that this doesn't scale-up very well (with each Python
> implementation having its own limits on how far you can push this
> idiom).
>
>
> Raymond
Currently we can implicitly unpack a tuple or list by using an
assignment. How is that any different from passing arguments to a
function? Does it use a different mechanism?
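To make the comparison concrete, here is a small sketch of the two forms side by side. The helper name show is made up for illustration. Both forms bind the same three values, though that by itself says nothing about whether the interpreter uses the same mechanism underneath.

```python
# 'show' is a hypothetical helper, used only to have something to call.
def show(a, b, c):
    return (a, b, c)

x = (1, 2, 3)

# Implicit unpacking through assignment.
a, b, c = x

# Explicit unpacking through a function call.
result = show(*x)

print((a, b, c) == result)
```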
(Warning, going into what-if land.)
There's a question relating to the above also so it's not completely in
outer space. :-)
We can't use the * syntax anywhere but in function definitions and
calls. I was thinking the other day that using * in function calls is
kind of inconsistent as it's not used anywhere else to unpack tuples.
And it does the opposite of what it means in the function definitions.
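A short sketch of the two opposite meanings, assuming a throwaway function named collect (not from any library):

```python
# 'collect' is a hypothetical function used only for illustration.
def collect(*args):
    # In a definition, * packs the positional arguments into a tuple.
    return args

pairs = [(1, 2), (3, 4)]

# Three separate arguments are packed into one tuple.
packed = collect(1, 2, 3)

# In a call, * unpacks 'pairs', so this is collect((1, 2), (3, 4)).
unpacked = collect(*pairs)
```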
So I was thinking, in order to have explicit packing and unpacking
outside of function calls and function definitions, we would need
different symbols because using * in other places would conflict with
the multiply and exponent operators. Also pack and unpack should not be
the same symbols for obvious reasons. Using different symbols doesn't
conflict with * and ** in function calls either.
So for the following examples, I'll use '~' as pack and '^' as unpack.
~ looks like a small 'N', for put stuff 'in'.
^ looks like an up arrow, as in take stuff out.
(Yes, I know they are already used elsewhere. Currently those are
bitwise operators. The '^' is used with sets also. I did say this is a
"what-if" scenario. Personally I think the bitwise operators could be
made methods of a bit type; then they, including the '>>' '<<' pair,
could be freed up and put to better use. The '<<' would make a nice
symbol for getting values from an iterator. The '>>' is already used in
print as redirect.)
Simple explicit unpacking would be:
(This is a silly example, I know it's not needed here but it's just to
show the basic pattern.)
x = (1,2,3)
a,b,c = ^x # explicit unpack, take stuff out of x
So, then you could do the following.
zip(^a) # unpack 'a' and give its items to zip.
Would that use the same underlying mechanism as using "*a" does? Is it
also the same implicit unpacking method used in an assignment using
'='? Would it be any less "a bit disconcerting in light of the
underlying implementation"?
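For reference, a sketch of the existing spelling next to one that avoids passing every tuple as a separate argument. The function name transpose_by_index is made up for this example, and it assumes all rows have the same length.

```python
a = ((1, 2), (3, 4), (5, 6), (7, 8), (9, 10))

# Existing idiom: every tuple in 'a' becomes a separate argument to zip().
# list() only matters where zip() returns an iterator instead of a list.
via_star = list(zip(*a))

def transpose_by_index(rows):
    # Hypothetical helper: builds each column by indexing into the rows,
    # so no argument unpacking is involved. Assumes equal-length rows.
    if not rows:
        return []
    width = len(rows[0])
    return [tuple(row[i] for row in rows) for i in range(width)]

via_index = transpose_by_index(a)
```

Whether the second form is actually faster for large inputs is not something this sketch shows; it only avoids the long argument list.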
Other possible ways to use them outside of function calls:
Sequential unpacking..
x = [(1,2,3)]
a,b,c = ^^x -> a=1, b=2, c=3
Or..
x = [(1,2,3),4]
a,b,c,d = ^x[0],x[1] -> a=1, b=2, c=3, d=4
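For comparison, a sketch of how the same two results can be written with nested assignment targets. The last line relies on the generalized unpacking added in Python 3.5 (PEP 448), so it is an assumption about the interpreter version being used.

```python
x = [(1, 2, 3)]
# The trailing comma makes the target a one-item sequence,
# so both levels are unpacked in one statement.
(a, b, c), = x

y = [(1, 2, 3), 4]
# The nested target mirrors the shape of y.
(p, q, r), s = y

# Python 3.5 and later also accept * on the right-hand side.
p2, q2, r2, s2 = *y[0], y[1]
```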
I'm not sure what it should do if you try to unpack an item not in a
container. I expect it should give an error because a tuple or list was
expected.
a = 1
x = ^a # error!
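That would match what unpacking already does today. A minimal sketch, with the name outcome introduced only to record what happened:

```python
a = 1

try:
    # Try to unpack something that is not iterable.
    x, = a
except TypeError:
    outcome = "TypeError"
else:
    outcome = "no error"

print(outcome)
```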
Explicit packing would not be as useful, as we can put ()'s or []'s
around things. One example that comes to mind at the moment is using it
to create single item tuples.
x = ~1 -> (1,)
Possibly converting strings to tuples?
a = 'abcd'
b = ~^a -> ('a','b','c','d') # explicit unpack and repack
and:
b = ~a -> ('abcd',) # explicit pack whole string
for:
b = a, -> ('abcd',) # trailing comma is needed here.
# This is an error opportunity IMO
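The current spellings of those three cases, as a small sketch (the variable names are only illustrative):

```python
a = 'abcd'

# Unpack the string and repack its characters into a tuple.
chars = tuple(a)

# Pack the whole string: the comma, not the parentheses, makes the tuple.
whole = (a,)

# Without the comma the parentheses do nothing and this is still a string.
not_a_tuple = (a)
```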
Choice of symbols aside, packing and unpacking are a very big part of
Python. It just seems (to me) like having an explicit way to express it
might be a good thing.
It doesn't do anything that can't already be done, of course. I think
it might make some code easier to read, and possibly avoid some errors.
Would there be any (other) advantages to it besides the syntax sugar?
Is it a horrible idea for some unknown reason I'm not seeing? (Other
than the symbol choices breaking current code. Maybe other symbols
would work just as well?)
Regards,
Ron