[Tutor] help with re module and parsing data

Kushal Kumaran kushal.kumaran+python at gmail.com
Mon Mar 7 12:22:57 CET 2011


On Mon, Mar 7, 2011 at 1:24 PM, vineeth <vineethrakesh at gmail.com> wrote:
> Hello all, I am doing some analysis on my trace file. I am looking for the
> lines containing Recvd-Content and Published-Content. I am able to find
> them, but the re module, as expected, just gives the word that is being
> searched for. I require the entire line, similar to grep in Unix. Can
> someone tell me how to do this? This is what I am doing:
>
> import re
> file = open('file.txt','r')
> file2 = open('newfile.txt','w')
>
> LineFile = ' '
>
> for line in file:
>    LineFile += line
>
> StripRcvdCnt = re.compile('(P\w+\S\Content|Re\w+\S\Content)')
>
> FindRcvdCnt = re.findall(StripRcvdCnt, LineFile)
>
> for SrcStr in FindRcvdCnt:
>    file2.write(SrcStr)
>

Is there any particular reason why you're using regular expressions
for this?  You are already iterating over the lines in your first for
loop.  You can just make the tests you need there.

for line in file:
  if 'Recvd-Content' in line or 'Published-Content' in line:
    file2.write(line)  # or whatever else you need to do with the line
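
Put together, a minimal sketch of the grep-like filter might look like the
following. The function name filter_lines and the sample lines are made up
for illustration; only the two search strings and the file names come from
your post.

```python
def filter_lines(lines, needles=('Recvd-Content', 'Published-Content')):
    """Return the lines that contain at least one of the needles."""
    return [line for line in lines if any(n in line for n in needles)]

# Hypothetical sample data standing in for the contents of file.txt
sample = [
    'ts=1 Recvd-Content id=7\n',
    'ts=2 Heartbeat\n',
    'ts=3 Published-Content id=7\n',
]
matching = filter_lines(sample)

# With the real files it could be used like this (not run here):
# with open('file.txt') as infile, open('newfile.txt', 'w') as outfile:
#     outfile.writelines(filter_lines(infile))
```

Because each line keeps its trailing newline, the output file ends up with
one matching line per line, much like grep's output.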

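Your regular expression seems like it will match a lot more strings
than the two you mentioned earlier: \w+ and \S accept almost anything
between the leading 'P' or 'Re' and 'Content', not just the exact
words you are looking for.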
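
If you do want to stay with the re module, one option is to test each line
with search() and keep the whole line rather than the matched text. This is
only a sketch, assuming the two literal strings are all you need; the
pattern, the name grep_lines and the sample lines are mine, not from your
code.

```python
import re

# Match only the two literal markers, nothing broader.
marker = re.compile(r'Recvd-Content|Published-Content')

def grep_lines(lines):
    """Return every line in which the pattern is found anywhere."""
    return [line for line in lines if marker.search(line)]

# Hypothetical sample lines; the middle one should be rejected.
sample = [
    'a Recvd-Content b\n',
    'Pending-Content\n',
    'x Published-Content\n',
]
kept = grep_lines(sample)
```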

Also, 'file' is the name of a built-in in Python 2.  Using it as a
variable name shadows the built-in, so it is best to pick a different
name for your variable.

-- 
regards,
kushal

