slow loop?

Peter Abel p-abel at t-online.de
Mon Jan 13 20:08:01 EST 2003


bk at whack.org (Brian Kranson) wrote in message news:<1964c4d7.0301130842.2f693958 at posting.google.com>...
> Is there a way I can make this small script any faster?  The file it
> reads in used to be only about a 100 lines and now it is well over
> 2000.  It takes about 14 seconds to run it on my PentiumII.  Thanks in
> advance - Bk

Pádraig]for line in file.readlines():
Pádraig]      line = line.strip().replace('"','')
         Stripping the quotes turns empty fields into runs of commas: ... ,,,, ...

Pádraig]      finalList.append(line.split(','))
         Splitting on ',' then produces several empty items.
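
To make the two remarks above concrete, here is a minimal sketch. The sample line is made up for illustration and is not taken from the original export.

```python
# Hypothetical sample line with two empty quoted fields (an assumption,
# not data from the original file).
line = '"LAST","","","FIRST"\n'

# Same two steps as in the quoted loop: strip, remove quotes, split.
stripped = line.strip().replace('"', '')
fields = stripped.split(',')

print(stripped)   # the empty fields are now a run of commas
print(fields)     # the split keeps one empty string per empty field

# One possible way to drop the empty items afterwards.
non_empty = [f for f in fields if f]
print(non_empty)
```

Whether the empty items should be dropped depends on whether the column positions matter to you. If they do, keep them.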

Fletcher] import re
Fletcher] finder = re.compile( r'".*?"')
Fletcher] 
Fletcher] def getstrings( filename ):
Fletcher]     result = []
Fletcher]     for line in open(filename, 'r'):
   if you have something like this:   "TEXT","","NEXTTEXT" 
   it will find this -------------------------^^^ and the comma as text

Fletcher]         result.append( finder.findall( line ) )
Fletcher]     return result

I cut your data down to a two-line sample and wrote it to 'Export.txt'.
Then try the following:
>>> # Read the complete file into one string
>>> text=file('Export.txt').read()
>>> import re
>>> # Delete the last newline
>>> text=text[:-1]
>>> # Replace the other newlines by commas, because no comma separates
>>> # the last field of one line from the first field of the next
>>> # Remove the quotation marks (replace them by '')
>>> # Replace each run of commas by "','"
>>> # Add a single quotation mark "'" at the beginning and the end of the string
>>> # Evaluate this string sequence and convert the result to a list
>>> result=list(eval("'"+re.sub(',+',"','",re.sub('"','',re.sub('\n',',',text)))+"'"))
>>> # That's it
>>> print ",\n".join(result)
IDNUMBER,
VRNUMBER,
LASTNAME,
FIRSTNAME,
MAILNAME,
SALUTATION,
ADDRESS1,
ADDRESS2,
ADDRESS3,
CITY,
STATE,
ZIP,
COUNTY,
CONGRESS,
SENATEDIST,
HOUSEDIST,
RECSTATUS,
REGISTER,
CARRT,
HPHONE,
WPHONE,
FPHONE,
APHONE,
EMPLOYER,
OCCUPATION,
BIRTHDATE,
CONSTITUENTTYPE,
ORIGIN,
AFFILIATION,
SEX,
PRECINCT,
HOMESOUND,
UPDATEDATE,
FAMILY,
CONTEXIST,
EXPEXIST,
VOTEEXIST,
NTEEXIST,
VOLEXIST,
VIPEXIST,
CASEEXIST,
REMIND,
BILLEXIST,
WFAX_PHONE,
CAR_PHONE,
E_MAIL,
PAGER,
MAIL_WHERE,
CONTACT,
AWARD_TYPE,
AWARD_DATE,
ELIGIBLE,
JOIN_DATE,
PAY_GRP,
PAY_AMOUNT,
WADDRESS1,
WADDRESS2,
WADDRESS3,
WCITY,
WSTATE,
WZIP,
WCARRT,
MADDRESS1,
MADDRESS2,
MADDRESS3,
MCITY,
MSTATE,
MZIP,
MCARRT,
DADDRESS1,
DADDRESS2,
DADDRESS3,
DCITY,
DSTATE,
DZIP,
CARRT,
HOUSEHOLDMAILNAME,
HOUSEHOLDSALUTATION,
TOTALCONTRIBUTIONS,
LARGESTCONTRIBUTION,
AVERAGECONTRIBUTION,
LASTCONTRIBUTION,
LASTCONTRIBUTIONDATE,
YEAR1TOTAL,
YEAR2TOTAL,
CONTRIBUTIONSYEAR1,
CONTRIBUTIONSYEAR2,
NONE,
63,
Last,
First,
Ms. First L. Last,
Sal,
220 StreetBlvd.,
Oregon,
OH,
43616,
Lucas,
9,
1,
  /  /  ,
12/30/1899,
Individual,
Female,
HOME,
07/13/1999,
Y,
H,
 /  /    ,
  /  /    ,
0,
220 StreetBlvd.,
Oregon,
OH,
43616,
220 StreetBlvd.,
Oregon,
OH,
43616,
220 StreetBlvd.,
Oregon,
OH,
43616-,
Ms. Frist L.Last,
First,
25.00,
25.00,
25.00,
25.00,
01/06/1999,
0.00,
0.00,
0,
0,
none
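
The same steps can also be done without eval, which would fail on a field that contains an apostrophe. The following is only a sketch under assumptions: the function name flatten_fields and the sample string are hypothetical, and it is written for a current Python rather than the version used in the session above.

```python
import re

def flatten_fields(text):
    """Flatten a quoted, comma-separated export into one list of fields,
    collapsing runs of commas (that is, skipping empty fields)."""
    # Delete the last newline, as in the session above.
    if text.endswith('\n'):
        text = text[:-1]
    # Join the lines with commas and remove the quotation marks.
    text = text.replace('\n', ',').replace('"', '')
    # Split on runs of commas instead of building a string for eval.
    return re.split(',+', text)

# Hypothetical two-line sample: a header line and one record.
sample = '"IDNUMBER","","LASTNAME"\n"63","","Last"\n'
print(",\n".join(flatten_fields(sample)))
```

To use it on a file, pass the result of open('Export.txt').read() to flatten_fields. Like the eval version, it returns one flat list and does not keep the line boundaries.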



