Namedtuples: some unexpected inconveniences

MRAB python at mrabarnett.plus.com
Wed Apr 12 16:42:12 EDT 2017


On 2017-04-12 20:57, Deborah Swanson wrote:
> I won't say the following points are categorically true, but I became
> convinced enough they were true in this instance that I abandoned the
> advised strategy. Which was to use defaultdict to group the list of
> namedtuples by one of the fields for the purpose of determining whether
> certain other fields in each group were either missing values or
> contained contradictory values.
> 
> Are these bugs, or was there something I could have done to avoid these
> problems? Or are they just things you need to know working with
> namedtuples?
> 
> The list of namedtuples was created with:
> 
> infile = open("E:\\Coding projects\\Pycharm\\Moving\\Moving 2017 in - test.csv")
> rows = csv.reader(infile)
> fieldnames = next(rows)
> Record = namedtuple("Record", fieldnames)
> records = [Record._make(fieldnames)]
> records.extend(Record._make(row) for row in rows)
>      . . .
> (many lines of field processing code)
>      . . .
> 
> then the attempt to group the records by title:
> 
> import operator
> records[1:] = sorted(records[1:], key=operator.attrgetter("title", "Date"))
> groups = defaultdict()
> for r in records[1:]:
>      # if the key doesn't exist, make a new group
>      if r.title not in groups.keys():
>          groups[r.title] = [r]
>      # if key (group) exists, append this record
>      else:
>          groups[r.title].append(r)
> 
> (Please note that this default dict will not automatically make new keys
> when they are encountered, possibly because the keys of the defaultdict
> are made from namedtuples and the values are namedtuples. So you have to
> include the step to make a new key when a key is not found.)
> 
The defaultdict _will_ work when you use it properly. :-)

The line should be:

     groups = defaultdict(list)

so that it'll make a new list every time a new key is automatically added.
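
As a minimal sketch of that, with a few made-up records standing in for the
CSV rows (the Record fields and values here are invented for illustration):

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for the records read from the CSV file.
Record = namedtuple("Record", ["title", "Date"])
records = [
    Record("flat A", "2017-04-01"),
    Record("flat B", "2017-04-02"),
    Record("flat A", "2017-04-03"),
]

# defaultdict(list) calls list() to create the value for a missing key,
# so no explicit "is the key there yet?" test is needed.
groups = defaultdict(list)
for r in records:
    groups[r.title].append(r)
```

With no argument, defaultdict() has no default factory and raises KeyError
for a missing key, just like a plain dict.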

Another point: namedtuples, as with normal tuples, are immutable; once 
created, you can't change an attribute. A dict might be a better bet.
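
A small sketch of what that means in practice, again assuming a hypothetical
two-field Record rather than the real CSV columns:

```python
from collections import namedtuple

Record = namedtuple("Record", ["title", "Date"])  # hypothetical fields
rec = Record("flat A", "")

try:
    rec.Date = "2017-04-01"   # namedtuple fields are read-only
except AttributeError:
    assigned = False
else:
    assigned = True

# _replace() builds a new tuple; the original is left untouched.
updated = rec._replace(Date="2017-04-01")

# A dict with the same keys can be changed in place.
as_dict = rec._asdict()
as_dict["Date"] = "2017-04-01"
```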

> If you succeed in modifying records in a group, the dismaying thing is
> that the underlying records are not updated, making the entire exercise
> totally pointless, which was a severe and unexpected inconvenience.
> 
> It looks like the values and the structure were only copied from the
> original list of namedtuples to the defaultdict. The rows of the
> grouped-by dict still behave like namedtuples, but they are no longer
> the same namedtuples as the original list of namedtuples. (I'm sure I
> didn't say that quite right, please correct me if you have better words
> for it.)
> 
> It might be possible to complete the operation and then write out the
> groups of rows of namedtuples in the dict to a simple list of
> namedtuples, discarding the original, but at the time I noticed that
> modifying rows in a group didn't change the values in the original list
> of namedtuples, I still had further to go with the dict of groups,  and
> it was looking easier by the minute to solve the missing values problem
> directly from the original list of namedtuples, so that's what I did.
> 
> If requested I can reproduce how I saw that the original list of
> namedtuples was not changed when I modified field values in group rows
> of the dict, but it's lengthy and messy. It might be worthwhile though
> if someone might see a mistake I made, though I found the same behavior
> several different ways. Which was when I called it barking up the wrong
> tree and quit trying to solve the problem that way.
> 
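One possible explanation, assuming the group rows were changed with
_replace() (the code that did the modifying isn't shown, so this is a guess):
grouping doesn't copy anything, but _replace() returns a new namedtuple, and
storing it in a group rebinds only that slot of the group's list. A sketch
with a hypothetical two-field Record:

```python
from collections import defaultdict, namedtuple

Record = namedtuple("Record", ["title", "Date"])  # hypothetical fields
records = [Record("flat A", ""), Record("flat A", "2017-04-03")]

groups = defaultdict(list)
for r in records:
    groups[r.title].append(r)

# Grouping copies nothing: both containers refer to the same objects.
same_object = groups["flat A"][0] is records[0]

# Replacing a row inside the group puts a new tuple in the group only;
# the original list still refers to the old, unchanged tuple.
groups["flat A"][0] = groups["flat A"][0]._replace(Date="2017-04-01")
still_same = groups["flat A"][0] is records[0]
```

If that is what happened, the original list would need the new tuple
assigned back into it as well, or the rows would need to be mutable objects
such as dicts.
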
> Another inconvenience is that there appears to be no way to access field
> values of a named tuple by variable, although I've had limited success
> accessing by variable indices. However, direct attempts to do so, like:
> 
> values = {row[label] for row in group}
>      (where 'label' is a variable for the field names of a namedtuple)
>      
>      gets "object has no attribute 'label'"
> 
> or, where 'record' is a row in a list of namedtuples and 'label' is a
> variable for the fieldnames of a namedtuple:
> 
>      value = getattr(record, label)
>      setattr(record, label, value)
> 
> also don't work. You get the error "object has no attribute 'label'"
> every time.
> 
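For what it's worth, here is a small sketch of access by variable name on a
hypothetical two-field Record. As far as I can tell, getattr() with the name
held in a variable does work for reading; indexing with a string raises
TypeError because tuples take integer indices; and setattr() fails with
AttributeError because the tuple is immutable.

```python
from collections import namedtuple

Record = namedtuple("Record", ["title", "Date"])  # hypothetical fields
record = Record("flat A", "2017-04-01")
label = "Date"  # field name held in a variable

value = getattr(record, label)                   # read by name
by_index = record[Record._fields.index(label)]   # read by position

try:
    record[label]                  # tuples only accept integer indices
except TypeError:
    string_index_ok = False
else:
    string_index_ok = True

try:
    setattr(record, label, value)  # immutable, so this fails
except AttributeError:
    settable = False
else:
    settable = True
```

An error that mentions the literal name 'label' would be consistent with
record.label having been written somewhere instead of getattr(record, label),
but that is only a guess without seeing the code.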


