performance question: dictionary or list, float or string?

Matimus mccredie at gmail.com
Tue Dec 2 14:37:24 EST 2008


On Dec 2, 3:51 am, bkamr... at gmail.com wrote:
> I forgot to mention that I did a simple timeit test which doesn't show
> significant runtime difference 3.5 sec for dictionary case and 3.48 for
> list case.
>
> def read_as_dictionary():
>     fil = open('myDataFile', 'r')
>     forces = {}
>     for region in range(25):
>         forces[region] = {}
>
>     for step in range(20000):
>         for region in range(25):
>             line = fil.next(); spl = line.split()
>             forces[region] [step] = spl
>
> def read_as_list():
>     fil = open('myDataFile.txt', 'r')
>     forces = []
>     for region in range(25):
>         forces.append([])
>
>     for step in range(20000):
>         for region in range(25):
>             line = fil.next(); spl = line.split()
>             forces[region].append(spl)
>
> Cheers,
> /Ben

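There really isn't enough information to recommend a particular
direction. A dictionary doesn't seem appropriate for this data though:
your keys are just consecutive integers starting at 0, which is what
list indexes already give you. Also, you are hard-coding the step range
to 20000. Is that the number of lines in the file? That isn't really a
safe way to do it: if the file turns out to be shorter, fil.next()
raises StopIteration, and if it is longer, the extra lines are silently
ignored.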
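One way to avoid the hard-coded count is to let the data decide how many
steps there are. This is only a minimal sketch in Python 3 syntax (the
code elsewhere in this thread is Python 2); read_steps and the sample
lines are made-up names for illustration, and it assumes each step
contributes exactly one line per region.

```python
from itertools import islice

def read_steps(lines, regions=25):
    """Group lines into per-region lists until the input runs out."""
    it = iter(lines)
    forces = [[] for _ in range(regions)]
    while True:
        # Pull one full step (one line per region) at a time.
        block = list(islice(it, regions))
        if not block:
            break  # input exhausted cleanly at a step boundary
        if len(block) != regions:
            raise ValueError("input ended in the middle of a step")
        for region, line in enumerate(block):
            forces[region].append(line.split())
    return forces

# Made-up sample data: 2 regions, 2 steps.
sample = ["1.0 2.0", "3.0 4.0", "5.0 6.0", "7.0 8.0"]
forces = read_steps(sample, regions=2)
steps = len(forces[0])
```

The number of steps then falls out of the data (len(forces[0])), and a
truncated file is reported instead of being half-read.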

# this is just bad style in Python (two statements on one line):
line = fil.next(); spl = line.split()
# better written as:
spl = fil.next().split()
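As a side note, the .next() method on file objects is Python 2 only. In
Python 3 the same one-liner uses the built-in next(). A small sketch,
with io.StringIO standing in for the real file and made-up contents:

```python
import io

# Stand-in for the open data file; the contents are invented for the example.
fil = io.StringIO("1.0 2.0 3.0\n4.0 5.0 6.0\n")

# Read one line and split it on whitespace in a single expression.
spl = next(fil).split()
```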

I would just do it this way:

def read_as_list(data, regions=25, maxlines=20000):
    # If data is a filename, open the file. If it is a file
    # object or any sequence of 'lines' it should just work.

    file_opened = False
    if isinstance(data, basestring):
        data = open(data, 'r')
        file_opened = True

    # One empty list per region.
    forces = [[] for _ in xrange(regions)]
    try:
        # enumerate() supplies the line index alongside each line.
        for i, line in enumerate(data):
            if i == maxlines:
                break
            # Lines cycle through the regions in order.
            forces[i % regions].append(line.split())
    finally:
        # Only close the file if we opened it ourselves.
        if file_opened:
            data.close()
    return forces
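
For anyone on Python 3, here is a rough rendering of the same idea,
offered as a sketch rather than a drop-in replacement. basestring and
xrange do not exist there, so str and range stand in. The name
read_as_list_py3 and the sample lines are made up for illustration.

```python
def read_as_list_py3(data, regions=25, maxlines=20000):
    # Accept either a filename or any iterable of lines.
    file_opened = False
    if isinstance(data, str):
        data = open(data, 'r')
        file_opened = True

    forces = [[] for _ in range(regions)]
    try:
        for i, line in enumerate(data):
            if i == maxlines:
                break
            # Lines cycle through the regions in order.
            forces[i % regions].append(line.split())
    finally:
        # Only close what this function opened.
        if file_opened:
            data.close()
    return forces

# Made-up sample: 5 lines, 2 regions, stop after 4 lines.
sample = ["1 2", "3 4", "5 6", "7 8", "9 10"]
forces = read_as_list_py3(sample, regions=2, maxlines=4)
```

Passing a plain list of strings, as above, is a handy way to try the
function without touching a real data file.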


Matt


