Dictionaries
Peter Otten
__peter__ at web.de
Thu Mar 20 10:08:31 EDT 2014
ishish wrote:
> This might sound weird, but is there a limit how many dictionaries I
> can create/use in a single script?
No.
> My reason for asking is I split a 2-column-csv (phone#, ref#) file into
> a dict and am trying to put duplicated phone numbers with different ref
> numbers into new dictionaries. The script deducts the duplicated 46
> numbers but it only creates batch1.csv. Since I obviously can't see the
> wood for the trees here, can someone pls punch me into the right
> direction....
> ...(No has_key is fine, its python 2.7)
>
> f = open("file.csv", 'r')
Consider a csv with the lines
Number...
123,first
123,second
456,third
> myDict = {}
> Batch1 = {}
> Batch2 = {}
> Batch3 = {}
>
> for line in f:
>     if line.startswith('Number' ):
>         print "First line ignored..."
>     else:
>         k, v = line.split(',')
>         myDict[k] = v
The first time around, the assignment is
myDict["123"] = "first\n"
and the second time it is
myDict["123"] = "second\n"
i.e. you are overwriting the previous value and only keep the value
corresponding to the last occurrence of a key.
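To see the overwriting in isolation, here is a minimal sketch. The rows list is a hypothetical stand-in for the sample lines above (with the newlines left out), and the name overwritten is made up for the illustration:

```python
# Hypothetical stand-in for the parsed sample lines (newlines omitted).
rows = [("123", "first"), ("123", "second"), ("456", "third")]

overwritten = {}
for k, v in rows:
    # A plain assignment replaces whatever was stored under k before.
    overwritten[k] = v

# Only one value per key survives: the one from the last occurrence.
print(overwritten)
```

The dict ends up with two entries, and "first" is gone because "second" was stored under the same key afterwards.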
A good approach to solve the problem of keeping an arbitrary number of
values per key is to make the dict value a list:
myDict = {}
with open("data.csv") as f:
    next(f)  # skip first line
    for line in f:
        k, v = line.split(",")
        myDict.setdefault(k, []).append(v)
This will produce a myDict
{
    "123": ["first\n", "second\n"],
    "456": ["third\n"]
}
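If the trailing newlines in the values get in the way, one possible variation is to let the csv module do the splitting and collect the values in a collections.defaultdict. This is only a sketch under a few assumptions: it targets Python 3, it reads from an in-memory io.StringIO instead of the real file, and the header text "Number,Ref" is invented because the real header is not shown in full:

```python
import csv
import io
from collections import defaultdict

# In-memory stand-in for the file; the header "Number,Ref" is an assumption.
sample = io.StringIO("Number,Ref\n123,first\n123,second\n456,third\n")

grouped = defaultdict(list)
reader = csv.reader(sample)
next(reader)  # skip the header row
for number, ref in reader:
    # csv.reader strips the line terminator, so ref carries no newline.
    grouped[number].append(ref)

print(dict(grouped))
```

With a real file you would pass the open file object to csv.reader instead of sample. Note that the values no longer end in a newline, so any code that writes them out has to add one itself.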
You can then proceed to find out the number of batches:
num_batches = max(len(v) for v in myDict.values())
Now write the files:
for index in range(num_batches):
    with open("batch%s.csv" % (index+1), "w") as f:
        for key, values in myDict.items():
            if len(values) > index:  # there are more than index duplicates
                f.write("%s,%s" % (key, values[index]))
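Putting the last two steps together, a self-contained sketch might look like the following. The function name write_batches, the temporary output directory, and the sorted key order are my own choices for the illustration. It assumes the values were stored without trailing newlines, so the newline is added when writing, and it guards against an empty dict, where a bare max() would raise ValueError:

```python
import os
import tempfile

# Grouped data as built earlier, here assumed to have newlines stripped.
grouped = {"123": ["first", "second"], "456": ["third"]}

def write_batches(grouped, directory):
    """Write batch1.csv, batch2.csv, ... and return the paths created."""
    # max() fails on an empty sequence, so treat "no data" as zero batches.
    num_batches = max(len(v) for v in grouped.values()) if grouped else 0
    paths = []
    for index in range(num_batches):
        path = os.path.join(directory, "batch%s.csv" % (index + 1))
        with open(path, "w") as f:
            # sorted() only makes the row order predictable.
            for key, values in sorted(grouped.items()):
                if len(values) > index:  # key has a value for this batch
                    f.write("%s,%s\n" % (key, values[index]))
        paths.append(path)
    return paths

out_dir = tempfile.mkdtemp()  # scratch directory for the demonstration
paths = write_batches(grouped, out_dir)
print(paths)
```

For the sample data this creates two files: the first holds one row for each phone number, the second only the row for the number that occurs twice.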