Efficiently building ordered dict

Bryan bryanvick at gmail.com
Mon Feb 22 18:31:32 EST 2010


On Feb 22, 3:00 pm, "Diez B. Roggisch" <de... at nospam.web.de> wrote:
> Am 22.02.10 23:48, schrieb Bryan:
>
>
>
> > On Feb 22, 2:16 pm, "Diez B. Roggisch"<de... at nospam.web.de>  wrote:
> >> Am 22.02.10 22:29, schrieb Bryan:
>
> >>> On Feb 22, 10:57 am, "Alf P. Steinbach"<al... at start.no>    wrote:
> >>>> * Bryan:
>
> >>>>> I am looping through a list and creating a regular dictionary.  From
> >>>>> that dict, I create an ordered dict.  I can't think of a way to build
> >>>>> the ordered dict while going through the original loop.  Is there a
> >>>>> way I can avoid creating the first unordered dict just to get the
> >>>>> ordered dict?  Also, I am using pop(k) to retrieve the values from the
> >>>>> unordered dict while building the ordered one because I figure that as
> >>>>> the values are removed from the unordered dict, the lookups will
> >>>>> become faster.  Is there a better idiom that the code below to create
> >>>>> an ordered dict from an unordered list?
>
> >>>>> unorderedDict = {}
> >>>>> for thing in unorderedList:
> >>>>>      if thing.id in unorderedDict:
> >>>>>              UpdateExistingValue(unorderedDict[thing.id])
> >>>>>      else:
> >>>>>              CreateNewValue(unorderedDict[thing.id])
>
> >>>> If this were real code the last statement would generate an exception.
>
> >>>>> orderedDict = OrderedDict()
> >>>>> for k in sorted(unorderedDict.keys()):
> >>>>>      orderedDict[k]  unorderedDict.pop(k)
>
> >>>> This is not even valid syntax.
>
> >>>> Please
>
> >>>>      (1) explain the problem that you're trying to solve, not how you
> >>>>          imagine the solution, and
>
> >>>>      (2) if you have any code, please post real code (copy and paste).
>
> >>>> The above code is not real.
>
> >>>> Cheers&    hth.,
>
> >>>> - Alf
>
> >>> Sorry about the sorted != ordered mix up.  I want to end up with a
> >>> *sorted* dict from an unordered list.  *Sorting the list is not
> >>> practical in this case.*  I am using python 2.5, with an ActiveState
> >>> recipe for an OrderedDict.
>
> >>> Given these requirements/limitations, how would you do it?
>
> >>> My solution is to create a regular dict from the list.  Then sort the
> >>> keys, and add the keys+values to an OrderedDict.  Since the keys are
> >>> being added to the OrderedDict in the correctly sorted order, at the
> >>> end I end up with a OrderedDict that is in the correctly *sorted*
> >>> order.
>
> >> If that works for you, I don't understand your assertion that you can't
> >> sort the list. If you have the space&  time to sort the intermediate
> >> dict, then it's as easy to create the list&  sort&  then the ordered
> >> dict from it. It should be faster, because you sort the keys anyway.
>
> >> Diez
>
> > Here is how I am converting a regular dict to an ordered dict that is
> > sorted by keys.
>
> > def _getOrderedDict(theDict):
> >    ordered = OrderedDict()
> >    for k in sorted(theDict.keys()):
> >            ordered[k] = theDict.pop(k)
> >    return ordered
>
> > The list is a bunch of objects that represent hours worked by
> > employees on particular jobs, and accounts, and client purchase orders
> > etc.  From this list, I am generating these sorted dicts that contain
> > summarizing information about the list.  So one of the sorted dicts
> > will give a summary of hours worked by job number.  Another one will
> > give summary information by client PO number etc.  So instead of
> > sorting the list a bunch of different ways, I keep the list as is,
> > generate the summaries I need into dictionaries, and then sort those
> > dictionaries as appropriate.
>
> Again - why? Unless there is some filtering going on that reduces the
> number of total entries before the sorting when building the
> intermediate dict, a simple
>
> ordered = OrderedDict(sorted(the_list, key=lambda v: v['some_key']))
>
> won't cost you a dime more.
>
> I think you believe in building the dict so that ou can have the key for
> sorting. As shown abov - you don't need to.
>
> It might even benefitial to really re-sort the original list, because
> that spares you the intermediate list.
>
> Diez

Lets say my list has 1 million items.  If I sort the list before
summarizing it, I will absolutley have to sort 1 million items.

However, if I summarize first, then just sort the keys in the summary
dict, I could wind up only sorting 4 or 5 items, if within that 1
million item list, there were only 4 or 5 job numbers for example.
The list stays as is, and my summary dictionary is sorted so I can
iterate through it and make little reports that humans will like.





More information about the Python-list mailing list