regexp
Jonathan Curran
jonc at icicled.net
Tue Dec 19 21:15:44 EST 2006
On Tuesday 19 December 2006 15:32, Paul Arthur wrote:
> On 2006-12-19, vertigo <spam at spam.pl> wrote:
> > Hello
> >
> >> Take a look at Chapter 8 of 'Dive Into Python.'
> >> http://diveintopython.org/toc/index.html
> >
> > I read the whole regexp chapter -
>
> Did you read Chapter 8? Regexes are 7; 8 is about processing HTML.
> Regexes are not well suited to this type of processing.
>
> > but there was no solution for my problem.
> > Example:
> >
> > re.sub("<!--.*-->","",htmldata)
> > would remove only comments that are on a single line.
> > If a comment spans several lines, like this:
> >
> > <!--start
> > of
> > comment, end-->
> >
> > it would not work. That's because '.' does not match '\n'.
> >
> > Does anybody know a solution for this particular problem?
>
> Yes. Use DOTALL mode.
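
For reference, a minimal sketch of the DOTALL approach, assuming the comments are well formed and not nested. The names COMMENT_RE and strip_comments_re are made up for illustration. The pattern also uses a non-greedy '.*?', which is an addition to the original call: with a greedy '.*' in DOTALL mode, everything between the first '<!--' and the last '-->' in the document would be removed.

```python
import re

# re.DOTALL lets '.' match '\n' as well, so a comment may span lines.
# '.*?' is non-greedy, so each match stops at the nearest '-->'.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_comments_re(htmldata):
    """Return htmldata with every <!-- ... --> block removed."""
    return COMMENT_RE.sub("", htmldata)


if __name__ == "__main__":
    sample = "a<!--start\nof\ncomment, end-->b<!-- x -->c"
    print(strip_comments_re(sample))
```

This still treats HTML as flat text, so a '<!--' inside a script block or an attribute value would be matched too.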

Paul, I mentioned Chapter 8 so that Vertigo would take a look at the HTML
processing section. What Vertigo wants can be done with relative ease using
sgmllib. Anyway, if you (Vertigo) want to use regular expressions for this,
you can try one of the regular expression testing programs. I'm not quite sure
of the name, but there is one that comes with KDE.
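
A rough sketch of the parser-based route, for comparison. Note the substitution: sgmllib is only available in Python 2, so this uses html.parser.HTMLParser from the Python 3 standard library instead. The class and function names (CommentStripper, strip_comments) are hypothetical. The idea is to re-emit every piece of the document except comments. The output is an approximate re-serialization (end tags come back lowercased, for example), not a byte-for-byte copy.

```python
from html.parser import HTMLParser


class CommentStripper(HTMLParser):
    """Collects every piece of the input except comments."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        self.pieces = []

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the tag exactly as it appeared.
        self.pieces.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.pieces.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.pieces.append("</%s>" % tag)

    def handle_data(self, data):
        self.pieces.append(data)

    def handle_entityref(self, name):
        self.pieces.append("&%s;" % name)

    def handle_charref(self, name):
        self.pieces.append("&#%s;" % name)

    def handle_decl(self, decl):
        self.pieces.append("<!%s>" % decl)

    def handle_comment(self, data):
        # Dropping comments is the whole point, so emit nothing here.
        pass


def strip_comments(html):
    parser = CommentStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.pieces)


if __name__ == "__main__":
    print(strip_comments("<p>a<!--start\nof\ncomment, end-->b</p>"))
```

Unlike a regular expression, this lets the parser decide what counts as a comment, so multi-line comments need no special handling.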
- Jonathan Curran