Nested dictionaries trouble
Bruno Desthuilliers
bdesth.quelquechose at free.quelquepart.fr
Wed Apr 11 16:57:05 EDT 2007
IamIan wrote:
> Hello,
>
> I'm writing a simple FTP log parser that sums file sizes as it runs. I
> have a yearTotals dictionary with year keys and the monthTotals
> dictionary as its values. The monthTotals dictionary has month keys
> and file size values. The script works except the results are written
> for all years, rather than just one year. I'm thinking there's an
> error in the way I set my dictionaries up or reference them...
>
> import glob, traceback
>
> years = ["2005", "2006", "2007"]
> months = ["01","02","03","04","05","06","07","08","09","10","11","12"]
> # Create months dictionary to convert log values
> logMonths = {"Jan":"01","Feb":"02","Mar":"03","Apr":"04","May":"05","Jun":"06","Jul":"07","Aug":"08","Sep":"09","Oct":"10","Nov":"11","Dec":"12"}
DRY violation alert! The month numbers are spelled out twice; derive
months from logMonths instead:
logMonths = {
"Jan":"01",
"Feb":"02",
"Mar":"03",
"Apr":"04",
"May":"05",
#etc
}
months = sorted(logMonths.values())
> # Create monthTotals dictionary with default 0 value
> monthTotals = dict.fromkeys(months, 0)
> # Nest monthTotals dictionary in yearTotals dictionary
> yearTotals = {}
> for year in years:
>     yearTotals.setdefault(year, monthTotals)
A complicated way to write:
yearTotals = dict((year, monthTotals) for year in years)
And without even reading further, I can tell you have a problem here:
every 'year' entry in yearTotals points to *the same* monthTotals dict
instance. So when you update yearTotals['2007'], the change shows up
for all years. The cure is simple: forget the monthTotals
object, and define your yearTotals dict this way:
yearTotals = dict((year, dict.fromkeys(months, 0)) for year in years)
NB: for Python versions older than 2.4, you need a list comprehension
instead of a generator expression, i.e.:
yearTotals = dict([(year, dict.fromkeys(months, 0)) for year in years])
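To see the difference for yourself, here is a minimal sketch that builds the dict both ways and updates a single month. The names shared and separate, the shortcut used to build months, and the 1024 figure are made up for illustration; they are not from your script.

```python
years = ["2005", "2006", "2007"]
months = ["%02d" % m for m in range(1, 13)]  # "01" .. "12"

# Shared: every year maps to the one and only monthTotals object.
monthTotals = dict.fromkeys(months, 0)
shared = dict((year, monthTotals) for year in years)
shared["2007"]["04"] += 1024
print(shared["2005"]["04"])                   # 1024: 2005 changed too
print(shared["2005"] is shared["2007"])       # True: same object

# Independent: dict.fromkeys() is called once per year,
# so each year gets its own dict.
separate = dict((year, dict.fromkeys(months, 0)) for year in years)
separate["2007"]["04"] += 1024
print(separate["2005"]["04"])                 # 0: 2005 untouched
print(separate["2005"] is separate["2007"])   # False: distinct objects
```

The 'is' test is a quick way to check whether two entries are really the same object rather than two equal-looking ones.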
HTH