speed question, reading csv using takewhile() and dropwhile()

Vincent Davis vincent at vincentdavis.net
Fri Feb 19 13:22:09 EST 2010


I have some (~50) text files that have about 250,000 rows each. I am
reading them in using the following, which gets me what I want, but it is not
fast. Is there something I am missing that would help? This is mostly a
question to help me learn more about Python. It takes about 4 minutes right now.

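import csv
from itertools import dropwhile, takewhile

def read_data_file(filename):
    # newline='' lets the csv module handle line endings (replaces "U" mode)
    with open(filename, newline='') as f:
        read = list(csv.reader(f, delimiter='\t'))

    # Rows before the '[MASKS]' marker, minus the first row.
    data = list(takewhile(lambda row: '[MASKS]' not in row, read))[1:]

    # Rows from '[MASKS]' up to '[OUTLIERS]': drop blank rows, then the first 3.
    mask_rows = takewhile(lambda row: '[OUTLIERS]' not in row,
                          dropwhile(lambda row: '[MASKS]' not in row, read))
    mask = [row for row in mask_rows if row][3:]

    # Rows from '[OUTLIERS]' to the end: drop blank rows, then the first 3.
    outlier_rows = dropwhile(lambda row: '[OUTLIERS]' not in row, read)
    outlier = [row for row in outlier_rows if row][3:]

    return data, mask, outlier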
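
For comparison, the same split can be sketched as a single pass over the
file, switching sections when a marker row is seen instead of scanning the
full list several times. This is only a sketch: the name read_sections is
hypothetical, and it assumes '[MASKS]' and '[OUTLIERS]' each occur once, in
that order. Whether it is actually faster would need to be measured.

```python
import csv


def read_sections(filename):
    """Split a tab-delimited file into data, mask and outlier rows in one pass.

    Hypothetical helper; assumes '[MASKS]' and '[OUTLIERS]' each appear once,
    in that order.
    """
    sections = {"data": [], "mask": [], "outlier": []}
    current = "data"
    with open(filename, newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            # A marker row starts a new section and is stored in it,
            # so the [3:] slices below line up with the original code.
            if "[MASKS]" in row:
                current = "mask"
            elif "[OUTLIERS]" in row:
                current = "outlier"
            sections[current].append(row)

    data = sections["data"][1:]                          # skip the first row
    mask = [r for r in sections["mask"] if r][3:]        # drop blanks, then 3 rows
    outlier = [r for r in sections["outlier"] if r][3:]  # drop blanks, then 3 rows
    return data, mask, outlier
```

Calling read_sections(filename) returns the three lists as a tuple.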


Vincent Davis
720-301-3003
vincent at vincentdavis.net
my blog <http://vincentdavis.net> | LinkedIn <http://www.linkedin.com/in/vincentdavis>