Namedtuples problem

Deborah Swanson python at deborahswanson.net
Thu Feb 23 17:51:49 EST 2017


Peter Otten wrote, on February 23, 2017 2:34 AM
> 
> Deborah Swanson wrote:
> 
<snipping everything from my first post except the question, please
refer to my first post if you have questions>
>
> > Can anyone see why I'm getting this Index error? and how to fix it?
> 
> I'm not completely sure I can follow you, but you seem to be 
> mixing two problems
> 
> (1) split a list into groups
> (2) convert a list of rows into a list of columns

Actually, I was trying to duplicate your original intentions, which I
thought were quite excellent, but turned out to cause problems when I
tried to use the code you gave.

Your original intention was to make a dictionary whose keys are the
unique titles across all the records, and whose values are the complete
records that carry each title. (Each rental listing can have many
records, each with the same title.)

> and making a kind of mess in the process. Functions to the rescue:

I'm sorry you think I made a mess of it and I agree that the code I
wrote is clumsy, although it does work and gives the correct results up
to the last line I gave. I was hoping, among other things, that you
would help me clean it up, so let's look at what you said.

> #untested
> 
> def split_into_groups(records, key):
>     groups = defaultdict(list)
>     for record in records:
>         # no need to check if a group already exists
>         # an empty list will automatically be added for every
>         # missing key
>         groups[key(record)].append(record)
>     return groups

I used this approach the first time I tried this for both defaultdict
and OrderedDict, and for both of them I immediately got a KeyError for
the first record. groups is empty, so the title for the first record
wouldn't already be in groups.

Just to check, I commented out the extra lines that I added to handle
new keys in my code and immediately got the same KeyError.

My guess is that while standard dictionaries will automatically make a
new key if it isn't found in the dict, defaultdict and OrderedDict will
not. So it seems you need to handle new keys yourself. Unless you think
I'm doing something wrong and dicts from collections should also
automatically make new keys.
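
That guess can be checked in isolation. Below is a minimal sketch; the
helper lookup_creates_key and the key "some title" are made up for
illustration. As far as I know, only a defaultdict built with a default
factory such as list adds a missing key on lookup. A plain dict and an
OrderedDict raise KeyError on lookup, although all of them accept
assignment to a new key.

```python
from collections import OrderedDict, defaultdict

def lookup_creates_key(mapping, key="some title"):
    """Return True if merely looking up a missing key adds it to the mapping."""
    try:
        mapping[key]
    except KeyError:
        return False
    return key in mapping

results = {
    "dict": lookup_creates_key({}),
    "OrderedDict": lookup_creates_key(OrderedDict()),
    "defaultdict(list)": lookup_creates_key(defaultdict(list)),
    # a defaultdict built without a default factory behaves like a plain dict
    "defaultdict()": lookup_creates_key(defaultdict()),
}
for name, created in results.items():
    print(name, created)
```

If the KeyError also shows up with a defaultdict, one thing worth
checking is whether it was created as defaultdict(list) rather than as
defaultdict() with no factory.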

Rightly or wrongly, I chose not to use functions in my attempt, just to
keep the steps sequential and understandable. Probably I should have
factored out the functions before I posted it.

> def extract_column(records, name):
>     # you will agree that extracting one column is easy :)
>     return [getattr(record, name) for record in records]
> 
> def extract_columns(records, names):
>     # we can build on that to make a list of columns
>     return [extract_column(records, name) for name in names]
> 
> wanted_columns = ['Location', ...]
> records = ...
> groups = split_into_groups(records, operator.attrgetter("title"))
> 
> Columns = namedtuple("Columns", wanted_columns)
> for title, group in groups.items():
>     # for easier access we turn the list of columns
>     # into a namedtuple of columns
>     groups[title] = Columns._make(extract_columns(group, wanted_columns))

This approach essentially reverses the order of the steps from what I
did, making the columns first and then grouping the records by title.
Either order should work in principle.
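
To try the quoted functions end to end, here is a minimal runnable
sketch. The Record type, its Date field, and the three sample rows are
invented for illustration. Only the function names, wanted_columns, and
the 'title' and 'Location' fields come from this thread. It runs the two
steps, grouping by title and building the columns, on a few rows. The
result goes into a separate dict (columns_by_title, a name chosen here)
so the grouped records stay available.

```python
import operator
from collections import defaultdict, namedtuple

# Hypothetical record type and sample rows, standing in for the real listings.
Record = namedtuple("Record", ["title", "Location", "Date"])
records = [
    Record("Sunny flat", "Seattle", "2017-02-01"),
    Record("Sunny flat", "Seattle", "2017-02-08"),
    Record("Quiet studio", "Tacoma", "2017-02-03"),
]

def split_into_groups(records, key):
    # defaultdict(list) supplies an empty list for each new key
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)
    return groups

def extract_column(records, name):
    # one attribute from every record, in record order
    return [getattr(record, name) for record in records]

def extract_columns(records, names):
    # one list per requested column name
    return [extract_column(records, name) for name in names]

wanted_columns = ["Location", "Date"]
groups = split_into_groups(records, operator.attrgetter("title"))

Columns = namedtuple("Columns", wanted_columns)
# Build a separate dict rather than overwriting groups.
columns_by_title = {
    title: Columns._make(extract_columns(group, wanted_columns))
    for title, group in groups.items()
}
print(columns_by_title["Sunny flat"].Location)
```

With these sample rows, columns_by_title["Sunny flat"].Location is the
list of locations for every record with that title.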

> If all worked well you should now be able to get a group with
> 
> group["whatever"]
> 
> and all locations for that group with
> 
> group["whatever"].Locations
> 
> If there is a bug you can pinpoint the function that doesn't work and
> ask for specific help on that one.

I'll play with your code and see if it works better than what I had. I
can see right off that the line

group["whatever"].Locations

will fail because group only has a 'Location' field and doesn't have a
'Locations' field.

Running it in the watch window confirms this; it raises:

AttributeError: 'Record' object has no attribute 'Locations'

Earlier on I tried several methods to get an analog of your line

group["whatever"].Locations

and failing to do it straightforwardly is part of why my code is so
convoluted.
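
One fairly direct analog, sketched under assumptions: the Record type
and the two rows below are hypothetical, and only the 'title' and
'Location' field names come from this thread. When a group is a list of
records, the locations can be collected per record with the exact field
name, and Record._fields shows which names are valid.

```python
from collections import namedtuple

# Hypothetical record type and rows, for illustration only.
Record = namedtuple("Record", ["title", "Location"])
group = [Record("whatever", "Seattle"), Record("whatever", "Tacoma")]

# The valid attribute names of a namedtuple are listed in _fields.
print(Record._fields)

# Per-record access with the exact field name gives one value per record.
locations = [record.Location for record in group]
print(locations)

# A field name that is not in _fields raises AttributeError.
try:
    group[0].Locations
    misspelled_ok = True
except AttributeError:
    misspelled_ok = False
print(misspelled_ok)
```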

Many thanks for your reply. Quite possibly getting the columns first and
grouping the records second will be an improvement.



