list vs. dict
John Machin
sjmachin at lexicon.net
Wed Feb 27 19:11:10 EST 2002
Beda Kosata <kosatab at vscht.cz> wrote in message news:<3C7CF0EE.3010009 at vscht.cz>...
> Hi,
> for a piece of code I need to store relatively large amount of records
> (100-1000).
Sorry, 1000 records is a relatively tiny amount.
> Each record will contain few (10-20) pieces of information
> (mostly strings or integers).
> For convenience it would be better for me to make this records as dicts,
> so I don't have to remember what data has what position in the list.
There are *two* levels to think about: how you store the fields in
each record, and how you store the collection(s) of records.
Fields in each record: for *convenience*, contemplate using a class,
not a dict.
>>> class Person(object):
...     __slots__ = ("name", "salary", "title", "empno")
...
>>> p1 = Person() # new record
>>> p1.name = "Attila the Hun"
>>> p1.salary = 1000000
>>> p1.title = "CEO"
>>> p1.empno = 666
>>> p2 = Person()
>>> p2.name = "Marmaduke Murgatroyd"
>>> p2.salary = 20000
>>> p2.tittle = "Clerk" # __slots__ gives you typo-checking on assignments
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'Person' object has no attribute 'tittle'
>>> p2.title = "Clerk"
>>> p2.empno = 1
>>> for x in (p1, p2):
...     print x.name, x.salary, x.title
...
Attila the Hun 1000000 CEO
Marmaduke Murgatroyd 20000 Clerk
>>> print ["false", "true"][p1.salary < p2.salary]
false
Now, how you store your collection(s) of records depends on many
things, but speed is *not* one of those things when you have only 1000
records.
If the record has a unique key, such as an employee number, and you
need to access records by employee number, then you can set up a dict
with employee number as the key and the class instance as the value:
    employee_dict[p1.empno] = p1
If you need merely to access the records sequentially, you can use a
list.
employee_list = []
employee_dict = {}
for buffer in file("employees.data"):
    fld = buffer.rstrip().split("~") # assuming data separated by "~"
    p = Person()
    p.name = fld[0]
    p.salary = int(fld[1])
    p.title = fld[2]
    p.empno = int(fld[3])
    # The above is where you do have to "remember" the
    # correspondence between positions and names
    # but this could be automated using the __slots__ and
    # a parallel sequence of functions to apply:
    # conv_funcs = (None, int, None, int)
    employee_list.append(p)
    employee_dict[p.empno] = p
# note -- above needs much error checking and exception handling
# to make it robust -- e.g. salary or empno not an int,
# empno not unique, too few/many fields in input file, ...
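The automation hinted at in the comments above (pairing __slots__ with a parallel sequence of conversion functions) might look roughly like the following. This is only a sketch under assumptions: it uses Python 3 syntax, the helper names parse_record and load_records are made up for illustration, and raising ValueError is just one possible error policy. It covers the three failure cases listed above: a non-integer salary or empno, a duplicate empno, and the wrong number of fields.

```python
# Sketch only: parse_record / load_records are hypothetical helper names,
# and raising ValueError on bad input is an assumed error policy.
class Person(object):
    __slots__ = ("name", "salary", "title", "empno")

# One converter per slot, in the same order; None means "keep the string".
conv_funcs = (None, int, None, int)

def parse_record(line, sep="~"):
    """Turn one separator-delimited line into a Person."""
    fields = line.rstrip().split(sep)
    if len(fields) != len(Person.__slots__):
        raise ValueError("expected %d fields, got %d: %r"
                         % (len(Person.__slots__), len(fields), line))
    p = Person()
    for attr, func, raw in zip(Person.__slots__, conv_funcs, fields):
        # int() itself raises ValueError if salary or empno is not an integer
        setattr(p, attr, raw if func is None else func(raw))
    return p

def load_records(lines):
    """Build both collections: a list for sequential access, a dict keyed by empno."""
    employee_list = []
    employee_dict = {}
    for line in lines:
        p = parse_record(line)
        if p.empno in employee_dict:
            raise ValueError("duplicate empno: %r" % p.empno)
        employee_list.append(p)
        employee_dict[p.empno] = p
    return employee_list, employee_dict

# Usage with in-memory sample lines instead of a real file
sample = ["Attila the Hun~1000000~CEO~666\n",
          "Marmaduke Murgatroyd~20000~Clerk~1\n"]
employee_list, employee_dict = load_records(sample)
```

With this arrangement the position-to-name correspondence lives in exactly one place, the order of __slots__ and conv_funcs, so adding a field means touching two tuples rather than the loop body.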
> However the most important for me is speed
Speed of what? Speed of running? Speed of implementing a robust
functional application? Speed of maintenance when bugs surface or
requirements change?
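If the answer is running speed, it is easy to measure rather than guess. The following is a rough sketch, in Python 3 syntax, that builds 1000 made-up records and times a keyed lookup done two ways: scanning a list and indexing a dict. The __init__ method, the generated data, and the function names are assumptions added for the illustration; the numbers printed will vary by machine, so none are quoted here.

```python
# Sketch only: the records are synthetic and the helper names are hypothetical.
import timeit

class Person(object):
    __slots__ = ("name", "salary", "title", "empno")

    def __init__(self, name, salary, title, empno):
        self.name = name
        self.salary = salary
        self.title = title
        self.empno = empno

# 1000 records, the upper end of the size mentioned in the question
records = [Person("name%d" % i, 1000 + i, "Clerk", i) for i in range(1000)]
by_empno = dict((p.empno, p) for p in records)

def find_in_list(empno):
    """Sequential scan: look at each record until the key matches."""
    for p in records:
        if p.empno == empno:
            return p
    return None

def find_in_dict(empno):
    """Keyed access: a single dict lookup."""
    return by_empno.get(empno)

# Time 1000 lookups of the last record (the worst case for the list scan)
t_list = timeit.timeit(lambda: find_in_list(999), number=1000)
t_dict = timeit.timeit(lambda: find_in_dict(999), number=1000)
print("list scan: %.6f s per 1000 lookups" % t_list)
print("dict get:  %.6f s per 1000 lookups" % t_dict)
```

Whichever way the timings come out, compare them against how often the application actually performs such lookups before letting them drive the design.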
Hope this helps,
John