regexp

johnzenger at gmail.com johnzenger at gmail.com
Tue Dec 19 16:16:41 EST 2006


Oops, I meant obj.sub("", htmldata) in the second example, not
re.sub("", htmldata).
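
For reference, here is a minimal sketch of the corrected version, run against a made-up htmldata string (the sample HTML and the names comment_re, cleaned and cleaned_inline are only for illustration, they are not from the original question). It assumes nothing beyond the standard re module, and it should show that the compiled DOTALL pattern and the inline (?s) form behave the same way:

```python
import re

# Hypothetical sample input: one multi-line comment and one single-line comment.
htmldata = """<p>before</p>
<!--start
of
comment, end-->
<p>after</p><!-- one line -->"""

# Compiled form: re.DOTALL makes "." match newlines as well.
# The non-greedy ".*?" stops at the first "-->" instead of the last one.
comment_re = re.compile(r"<!--.*?-->", re.DOTALL)
cleaned = comment_re.sub("", htmldata)

# Inline-flag form: "(?s)" at the start of the pattern sets DOTALL.
cleaned_inline = re.sub(r"(?s)<!--.*?-->", "", htmldata)

print(cleaned)
```

Note that the ".*?" matters as much as the flag: with a greedy ".*" and DOTALL set, everything from the first "<!--" to the last "-->" would be removed in one go.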

On Dec 19, 4:15 pm, johnzen... at gmail.com wrote:
> You want re.sub("(?s)<!--.*?-->", "", htmldata)
>
> Explanation:  To make the dot match all characters, including newlines,
> you need to set the DOTALL flag.  You can set the flag inline using the
> (?s) syntax, which is explained in section 4.2.1 of the Python Library
> Reference.
>
> A more readable way to do this is:
>
> obj = re.compile("<!--.*?-->", re.DOTALL)
> re.sub("", htmldata)
>
> On Dec 19, 3:59 pm, vertigo <s... at spam.pl> wrote:
>
>
>
> > Hello
>
> > > On Tuesday 19 December 2006 13:15, vertigo wrote:
> > >> Hello
>
> > >> I need to use some regular expressions that span more than one line,
> > >> and I would like to use modifiers like /m or /s in Perl.
> > >> For example:
> > >> re.sub("<script.*>.*</script>","",data)
>
> > >> will not cut out all the JavaScript code if it is spread over many lines.
> > >> I could use something like /s from Perl, which makes . match any
> > >> character (including newline). How can I do that?
>
> > >> Maybe there is another way to achieve the same result?
>
> > >> Thanx
>
> > > Take a look at Chapter 8 of 'Dive Into Python.'
> > > http://diveintopython.org/toc/index.html
> > I read the whole regexp chapter, but there was no solution for my problem.
> > Example:
>
> > re.sub("<!--.*-->","",htmldata)
> > would remove only comments that fit on one line.
> > If a comment spans several lines, like this:
> > <!--start
> > of
> > comment, end-->
>
> > it would not work, because '.' does not match '\n'.
>
> > Does anybody know a solution for this particular problem?
>
> > Thanx



