speeding up reading files (possibly with cython)

per perfreem at gmail.com
Sat Mar 7 17:06:46 EST 2009


hi all,

i have a program that essentially loops through a text file that's
about 800 MB in size, containing tab-separated data. my program
parses this file and stores its fields in a dictionary of lists.

for line in file:
    split_values = line.strip().split('\t')
    # do stuff with split_values

currently, this is very slow in python, even if all i do is break up
each line using split() and store its values in a dictionary, indexed
by one of the tab-separated values in the file.
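
to be concrete, the storage step looks roughly like this (a simplified
sketch; i'm assuming here, just for illustration, that the key is the
first column and the file is called 'data.txt'):

from collections import defaultdict

table = defaultdict(list)
with open('data.txt') as f:    # 'data.txt' is a placeholder name
    for line in f:
        fields = line.strip().split('\t')
        # index by one of the tab-separated values
        # (assume the first field is the key)
        table[fields[0]].append(fields[1:])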

is this just unavoidable python overhead? do you guys think that
switching to cython might speed this up, perhaps by optimizing the
main for loop? or is this not a viable option?

thank you.
