help with explaining how to split a list of tuples into parts

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sat Jul 13 04:11:18 EDT 2013


On Fri, 12 Jul 2013 23:43:55 -0700, peter wrote:

> Hi List,
> 
> I am new to Python and wondering if there is a better python way to do
> something.  As a learning exercise I decided to create a python bash
> script to wrap around the Python Crypt library (Version 2.7).

A Python bash script? What does that mean? Python and bash are two 
different languages.


> My attempt is located here - https://gist.github.com/pjfoley/5989653

A word of advice: don't assume that just because people are reading your 
posts, they will necessarily follow links to view your code. There 
could be all sorts of reasons why they might not:

- they may have email access, but are blocked from the web;

- they may not have a browser that gets on with github;

- they may be reading email via a smart phone, and not want to pay extra 
to go to a website;

- they may be too lazy, or too busy, to follow a link;

- they don't want to get bogged down in trying to debug a large block of 
someone else's code.


Or some other reason. For best results, you should try to simplify the 
problem as much as possible, bringing it down to the most trivial, easy 
example you can, small enough to include directly in the body of your 
email. That might be one line, or twenty lines.

See also: http://sscce.org/


> I am trying to wrap my head around list comprehensions, I have read the
> docs at
> http://docs.python.org/2/tutorial/datastructures.html#list-comprehensions
> and read various google results.  I think my lack of knowledge is making
> it difficult to know what key word to search on.

I don't really think that list comps have anything to do with the problem 
at hand. You seem to have discovered a hammer (list comps) and are now 
trying to hammer everything with it. I don't think the list comp is the 
right tool for the job below.

But, for what it is worth, a list comp is simply a short-cut for a for-
loop, where the body of the loop is limited to a single expression. So 
this for-loop:

result = []
for value in some_values:
    result.append(calculate(value))


can be re-written as this list comp:

result = [calculate(value) for value in some_values]
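To see the equivalence for yourself, here is a minimal sketch that builds the same list both ways. The calculate function and the some_values list are made-up placeholders, not anything from your script. The last line shows the optional trailing "if" clause, which is the filtering form you use later in your question.

```python
def calculate(value):
    # Placeholder computation, purely for illustration.
    return value * 2

some_values = [1, 2, 3, 4]

# Explicit for-loop version.
loop_result = []
for value in some_values:
    loop_result.append(calculate(value))

# Equivalent list comprehension.
comp_result = [calculate(value) for value in some_values]

# A comprehension may also filter with a trailing "if" clause.
even_result = [calculate(value) for value in some_values if value % 2 == 0]

print(loop_result == comp_result)
print(even_result)
```

Both versions produce the same list; the comprehension just saves you the empty-list setup and the explicit append.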



> Essentially I have this list of tuples
> 
> # Tuple == (Hash Method, Salt Length, Magic String, Hashed Password
> Length) 
> supported_hashes=[('crypt',2,'',13), ('md5',8,'$1$',22),
> ('sha256',16,'$5$',43), ('sha512',16,'$6$',86)]
> 
> This list contains the valid hash methods that the Crypt Library
> supports plus some lookup values I want to use in the code.

Consider using namedtuple to use named fields rather than just numbered 
fields. For example:


from collections import namedtuple
Record = namedtuple("Record", "method saltlen magic hashpwdlen")


defines a new type called "Record", with four named fields. Then you 
might do something like this:


x = Record('crypt', 2, '', 13)
print(x.saltlen)

=> prints 2

You might do this:


supported_hashes = [
    Record('crypt', 2, '', 13),
    Record('md5', 8, '$1$', 22),
    Record('sha256', 16, '$5$', 43),
    Record('sha512', 16, '$6$', 86),
    ]


although I think a better plan would be to use a dict rather than a list, 
something like this:


from collections import namedtuple
CryptRecord = namedtuple("CryptRecord", 
    "salt_length magic_string expected_password_length")


supported_hashes = {
    'crypt': CryptRecord(2, '', 13),
    'md5': CryptRecord(8, '$1$', 22),
    'sha256': CryptRecord(16, '$5$', 43),
    'sha512': CryptRecord(16, '$6$', 86),
    }


This will let you look crypt methods up by name:

method = supported_hashes['md5']
print(method.magic_string)

=> prints $1$
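A dict also gives you the list of valid method names for free, which may help with validating your --hash option. The following is only a sketch: I have not seen how your parser is actually defined, so the add_argument call, the default value, and the hard-coded argument list passed to parse_args are all assumptions made for illustration.

```python
import argparse
from collections import namedtuple

CryptRecord = namedtuple("CryptRecord",
    "salt_length magic_string expected_password_length")

supported_hashes = {
    'crypt': CryptRecord(2, '', 13),
    'md5': CryptRecord(8, '$1$', 22),
    'sha256': CryptRecord(16, '$5$', 43),
    'sha512': CryptRecord(16, '$6$', 86),
    }

parser = argparse.ArgumentParser()
# Iterating over a dict yields its keys, so sorted() gives the method names.
# The default of 'sha512' is an assumption, not taken from the original script.
parser.add_argument("--hash", choices=sorted(supported_hashes),
                    default="sha512")

# Hard-coded arguments so the sketch runs without a real command line.
args = parser.parse_args(["--hash", "md5"])

record = supported_hashes[args.hash]
print(record.salt_length, record.magic_string)
```

Because argparse rejects anything not listed in choices, the later dict lookup cannot fail on a name the user mistyped.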


> I have managed to work out how to extract a list of just the first value
> of each tuple (line 16) which I use as part of the validation against
> the --hash argparse option.
> 
> My Question.
> 
> Looking at line 27, This line returns the tuple that matches the hash
> type the user selects from the command line.  Which I then split the
> separate parts over lines 29 to 31.
> 
> I am wondering if there is a more efficient way to do this such that I
> could do:
> 
> salt_length, hash_type, expected_password_length = [x for x in
> supported_hashes if x[0] == args.hash]

Have you tried it? What happens when you do so? What error message do you 
get? If you print the list comp, what do you get?

Hint: on the left hand side, you have three names. On the right hand 
side, you have a list containing one item. That the list was created from 
a list comprehension is irrelevant. What happens when you do this?

spam, ham, eggs = [(1, 2, 3)]  # List of 1 item, a tuple.

What happens when you extract the item out of the list?

spam, ham, eggs = [(1, 2, 3)][0]
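If you would rather check your answers than guess, a small sketch like the following runs both statements and catches the failure so the script keeps going. The unpack_failed flag is just a name I made up for the demonstration, and the exact wording of the error message depends on your Python version.

```python
# One item on the right, three names on the left: unpacking fails.
unpack_failed = False
try:
    spam, ham, eggs = [(1, 2, 3)]
except ValueError as err:
    unpack_failed = True
    print("unpacking the list failed:", err)

# Pull the single tuple out of the list first, then unpack that.
spam, ham, eggs = [(1, 2, 3)][0]
print(spam, ham, eggs)
```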



> From my limited understanding the first x is the return value from the
> function which meets the criteria.  So could I do something like:

No, not the first x. The *only* x, since you only have one x that matches 
the condition.

Consider your list of supported_hashes. If you run this code:


result = []
for x in supported_hashes:
    if x[0] == args.hash:
        result.append(x)

what is the value of result? How many items does it have? If need be, 
call len(result) to see.

You need to extract the first (only) item from the list. Then you will 
get a different error: *too many* items to unpack, instead of too few. So 
you need either an extra name on the left, which you ignore:

spam, ham, eggs = [(1, 2, 3, 4)][0]  # fails

who_cares, spam, ham, eggs = [(1, 2, 3, 4)][0]


or you need to reduce the number of items on the right:

spam, ham, eggs = [(1, 2, 3, 4)][0][1:]


Can you see why the last one works? The word you are looking for is 
"slicing", and you can test it like this:


print( [100, 200, 300, 400, 500][1:] )
print( [100, 200, 300, 400, 500][2:4] )
print( [100, 200, 300, 400, 500][2:5] )
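Applied to one of your own tuples, slicing might look like the sketch below. The variable name record is only an illustration. Note that the stop index of a slice is excluded, and that leaving it out means "through to the end".

```python
record = ('md5', 8, '$1$', 22)

# record[1:] skips index 0 (the method name) and keeps everything after it.
salt_length, magic_string, password_length = record[1:]
print(salt_length, magic_string, password_length)

# The stop index is excluded, so [2:4] gives the items at index 2 and 3 only.
middle = [100, 200, 300, 400, 500][2:4]
print(middle)
```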


> ... = [(x[0][1], x[0][2], x[0][3]) for x in supported_hashes if x[0] ==
> args.hash]

You don't need to manually split the x tuple into 3 pieces. Slicing is 
faster and simpler:

[x[1:] for x in supported_hashes if x[0] == args.hash]
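Keep in mind that this list comp still returns a list holding one tuple, so the item has to be taken out of the list before it can be unpacked into three names. A minimal sketch, where selected is a hypothetical stand-in for args.hash:

```python
supported_hashes = [('crypt', 2, '', 13), ('md5', 8, '$1$', 22),
                    ('sha256', 16, '$5$', 43), ('sha512', 16, '$6$', 86)]

selected = 'sha256'  # hypothetical stand-in for args.hash

# Still a list, even though only one tuple matches.
matches = [x[1:] for x in supported_hashes if x[0] == selected]
print(matches)

# Take the single match out of the list, then unpack it.
salt_length, magic_string, password_length = matches[0]
print(salt_length, magic_string, password_length)
```

If nothing matches, matches is empty and matches[0] raises IndexError, so this approach relies on the name having been validated beforehand.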


But if you use my suggestion for a dictionary, you can just say:


salt_length, hash_type, password_length = supported_hashes[args.hash]


as a simple, fast lookup, no list comp needed.
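Here is a self-contained sketch of that lookup. Unpacking works because a namedtuple is still a tuple. As before, selected is a hypothetical stand-in for args.hash, and 'sha1024' is a deliberately bogus name used to show what happens with an unsupported method.

```python
from collections import namedtuple

CryptRecord = namedtuple("CryptRecord",
    "salt_length magic_string expected_password_length")

supported_hashes = {
    'crypt': CryptRecord(2, '', 13),
    'md5': CryptRecord(8, '$1$', 22),
    'sha256': CryptRecord(16, '$5$', 43),
    'sha512': CryptRecord(16, '$6$', 86),
    }

selected = 'sha512'  # hypothetical stand-in for args.hash

# A namedtuple unpacks exactly like a plain tuple.
salt_length, hash_type, password_length = supported_hashes[selected]
print(salt_length, hash_type, password_length)

# An unknown name raises KeyError rather than silently returning nothing.
lookup_failed = False
try:
    supported_hashes['sha1024']
except KeyError:
    lookup_failed = True
    print("unsupported hash method")
```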



-- 
Steven


