Problem with re module

Tue Mar 22 14:40:11 EDT 2011

On Mar 22, 11:16 am, John Bokma <j... at castleamber.com> wrote:
> John Harrington <beartiger.... at gmail.com> writes:
> > I'm trying to use the following substitution,
>
> >      lineList[i]=re.sub(r'(\\begin{document})([^$])',r'\1\n\n\2',lineList[i])
>
> > I intend this to match any string "\begin{document}" that doesn't end
> > in a line ending.  If there's no line ending, then, I want to place
> > two carriage returns between the string and the non-line end
> > character.
>
> > However, this places carriage returns even when the string is followed
> > directly after with a line ending.  Can someone explain to me why this
> > match is not behaving as I intend it to, especially the ([^$])?
>
> [^$] matches: not a $ character
>
> You might want [^\n]

Thank you, John.

I thought that when you use "r" before the regex, $ matches an end of
line.  But, in any case, if I use "[^\n]" as you suggest I get the
same result.
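
For reference, a small sketch of how these pieces behave on an in-memory string. The variable names are made up for illustration, and the "\r\n" line is an assumed input, not something taken from raw.tex:

```python
import re

line = "\\begin{document}\n"        # a line as readlines() returns it

# Inside [...], "$" is a literal dollar sign, so [^$] accepts the newline.
dollar_match = re.search(r'(\\begin{document})([^$])', line)
print(repr(dollar_match.group(2)))

# Outside a character class, "$" is an anchor (end of string, or just
# before a trailing newline), with or without the r prefix.
anchor_match = re.search(r'\\begin{document}$', line)
print(anchor_match is not None)

# [^\n] rejects the newline, so no match is found on this line.
newline_match = re.search(r'(\\begin{document})([^\n])', line)
print(newline_match)

# Assumed input, not from raw.tex: a line ending in "\r\n".
# Here [^\n] accepts the "\r".
crlf_match = re.search(r'(\\begin{document})([^\n])', "\\begin{document}\r\n")
print(repr(crlf_match.group(2)))
```

The r prefix only changes how Python reads backslashes in the string literal; it does not change what "$" means to the re module.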

Here's a script that illustrates the problem.  Any help would be
appreciated!

#BEGIN SCRIPT
import re

outlist = []
myfile  = "raw.tex"

fin = open(myfile, "r")
lineList = fin.readlines()
fin.close()

for i in range(len(lineList)):
    # Put a blank line after \begin{document} when other text follows it
    lineList[i] = re.sub(r'(\\begin{document})([^\n])', r'\1\n\n\2', lineList[i])
    outlist.append(lineList[i])

# Note: this writes the result back over raw.tex itself
fou = open(myfile, "w")
for i in range(len(outlist)):
    fou.write(outlist[i])
fou.close()
#END SCRIPT

And the file raw.tex:

%BEGIN TeX FILE
\begin{document}
This line should remain right after the above line in the output, but
doesn't

\begin{document}Extra stuff here should appear below the begin line
and does in the output.
%END TeX FILE
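
A way to check the substitution without touching raw.tex is to run it on the lines held in a list. This is only a sketch: `sample` and `split_after_begin` are hypothetical names, and the two strings are typed in by hand from the file above, each ending in "\n" as readlines() would return them.

```python
import re

# Hand-copied from the TeX file above (assumption: plain "\n" line endings).
sample = [
    "\\begin{document}\n",
    "\\begin{document}Extra stuff here should appear below the begin line\n",
]

pattern = re.compile(r'(\\begin{document})([^\n])')

def split_after_begin(line):
    """Insert a blank line after \\begin{document} if text follows it."""
    return pattern.sub(r'\1\n\n\2', line)

result = [split_after_begin(line) for line in sample]
for line in result:
    print(repr(line))
```

Printing with repr() makes the line endings visible, which helps when comparing this against what the script writes to disk.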