[Tutor] Simple regex replacement match (converting from Perl)

Jerry Hill malaclypse2 at gmail.com
Fri Sep 4 21:00:43 CEST 2009


On Fri, Sep 4, 2009 at 1:34 PM, GoodPotatoes<goodpotatoes at yahoo.com> wrote:
> I simply want to remark out all non-word characters read from a line.
>
> Line:
> Q*bert says "#@!$%  "
>
> in Perl
> #match each non-word character, add "\" before it, globally.
>
> $_=s/(\W)/\\$1/g;
>
> output:
> Q\*bert\ says\ \"\#\@\!\$\%\ \ \"  #perfect!
>
> Is there something simple like this in python?
>
> I would imagine:
> foo='Q*bert says "#@!$%  "'
> pNw=re.compile('(\W)')
> re.sub(pNw,'\\'+(match of each non-word character),foo)
>
> How do I get the match into this function?  Is there a different way to do
> this?

Like this:

>>> import re
>>> line = 'Q*bert says "#@!$%  "'
>>> pattern = re.compile(r"(\W)")

>>> re.sub(pattern, r"\\\1", line)
'Q\\*bert\\ says\\ \\"\\#\\@\\!\\$\\%\\ \\ \\"'

Note that the interpreter shows each single backslash doubled up,
because it's displaying the repr of the string.  If you print the
string instead, you'll see what you expect:

>>> print re.sub(pattern, r"\\\1", line)
Q\*bert\ says\ \"\#\@\!\$\%\ \ \"

>>>
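To make the repr-versus-print difference concrete, here is a small sketch. It assumes Python 3 (where print is a function, unlike the session above), and the variable name `s` is just for illustration:

```python
# A four-character string: 'a', one backslash, '*', 'b'.
s = r"a\*b"

# repr() escapes the backslash, so it appears doubled.
as_repr = repr(s)

# The string itself still contains only one backslash.
length = len(s)
backslashes = s.count("\\")

print(as_repr)  # 'a\\*b'
print(s)        # a\*b
print(length)   # 4
```

The doubled backslash only exists in the displayed form; the data in the string is unchanged.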

When you're using re.sub, \1 in the replacement string refers to the
text matched by the first capturing group (the parenthesized part of
the pattern).  It's also helpful to use raw strings when you're using
the Python re library, since both Python string literals and the
regular expression library use '\' as an escape character.  (See the
top of http://docs.python.org/library/re.html for more details.)
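If the backreference syntax feels cryptic, re.sub also accepts a function as the replacement, which answers the "how do I get the match into this function" question more literally. The following is a rough sketch, assuming Python 3; the helper name `add_backslash` is made up for this example:

```python
import re

line = 'Q*bert says "#@!$%  "'
pattern = re.compile(r"(\W)")

# Form 1: backreference in a raw replacement string.
# r"\\\1" is an escaped backslash followed by group 1.
escaped = pattern.sub(r"\\\1", line)

# Form 2: a callable replacement. It receives the match object for
# each non-word character and returns the text to substitute.
def add_backslash(match):
    return "\\" + match.group(1)

escaped_via_function = pattern.sub(add_backslash, line)

print(escaped)
print(escaped_via_function)
```

Both calls should produce the same string. The function form is more verbose, but it can be easier to read when the replacement logic gets more involved than prefixing a single character.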

-- 
Jerry

