Data type ideas
Joel Ricker
joejava at dragoncat.net
Sat Mar 30 01:19:28 EST 2002
Hi all, got a new problem :)
I have a tab-delimited file of people, each with a list of the groups they belong to, like so:
Person 1    Group A
            Group B
Person 2    Group B
Person 3    Group A
            Group C
So basically a person can be part of one or more groups. I'm looking to process this list so that I can take each group and examine the list of people in it. Basically, turn the list into:
Group A    Person 1
           Person 3
Group B    Person 1
           Person 2
Group C    Person 3
The drawback to all this is that the file I'm working with is pretty big: about 40 MB. The majority of the file is extraneous data that I have weeded out with regular expressions, but it is still a large data file.
My first (naive) approach was to just create a dict using the name of the group as the key and a list of people as the value. I learned that, due to the overhead, that was going to take a lot of memory and processing time.
It would look something like this:
{"Group A" : ["Person 1", "Person 3"],
"Group B" : ["Person 1", "Person 2"],
"Group C" : ["Person 3"]}
My next idea was: what about references? Maybe create a list of people and a dict as above, with a list of references into the list of people. But as I learned, you can't make references to simple data objects (like a subscript of a list). I could be wrong, but that's what I gathered. I tried using a list of integers for the value of the group dict, "pointing" into the list of people:
{"Group A" : [0, 2],
"Group B" : [0, 1],
"Group C" : [2] }
["Person 1", "Person 2", "Person 3"]
This helped a little but obviously not much since it isn't much of a change from what I've had before.
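
For concreteness, a sketch of that integer-index scheme. It assumes the file has already been reduced to (person, group) pairs; `index_groups`, `position` and `pairs` are names invented for this example. Since a Python list already holds references to its items rather than copies, swapping the names for integers would not be expected to save much, which fits the small gain described above.

```python
def index_groups(pairs):
    """Return (people, groups), where groups maps each group name
    to a list of indexes into the people list."""
    people = []    # each distinct person, in first-seen order
    position = {}  # person -> index in people, for fast lookup
    groups = {}    # group -> list of indexes into people
    for person, group in pairs:
        if person not in position:
            position[person] = len(people)
            people.append(person)
        groups.setdefault(group, []).append(position[person])
    return people, groups

# Hypothetical input, already parsed into (person, group) pairs.
pairs = [
    ("Person 1", "Group A"),
    ("Person 1", "Group B"),
    ("Person 2", "Group B"),
    ("Person 3", "Group A"),
    ("Person 3", "Group C"),
]
people, groups = index_groups(pairs)

# Turning indexes back into names for one group.
names_in_a = [people[i] for i in groups["Group A"]]
```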
So what next? Any ideas that I can use?
Thanks
Joel