Python backreference replacing doesn't work as expected

MRAB python at mrabarnett.plus.com
Sun Jul 12 15:25:57 EDT 2015


On 2015-07-12 10:40, Yonggang Chen wrote:
> There are two named groups in my pattern: myFlag and id, I want to add one more myFlag immediately before group id.
>
> Here is my current code:
> ## code begin
> # i'm using Python 3.4.2
> import re
> import os
> contents = b'''
> xdlg::xdlg(x_app* pApp, CWnd* pParent)
>      : customized_dlg((UINT)0, pParent, pApp)
>      , m_pReaderApp(pApp)
>      , m_info(pApp)
> {
>
> }
> '''
>
> pattern = rb'(?P<myFlag>[a-zA-Z0-9_]+)::(?P=myFlag).+:.+(?P<id>\(UINT\)0 *,)'
> res = re.search(pattern, contents, re.DOTALL)
> if res is not None:
>      print(res.groups()) # the output is (b'xdlg', b'(UINT)0,')
>
> # 'replPattern' becomes b'(?P<myFlag>[a-zA-Z0-9_]+)::(?P=myFlag).+:.+((?P=myFlag)\\(UINT\\)0 *,)'

In a replacement template, the (?P...) parts are just literals. To refer
to a named group there, write \g<myFlag>.
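
A quick sketch of the difference, using a made-up input rather than your
data (the names 'text', 'literal' and 'expanded' are only for
illustration):

```python
import re

# Made-up sample, not the original poster's data.
text = b'foo::foo'
pattern = rb'(?P<myFlag>\w+)::(?P=myFlag)'

# In the template, (?P=myFlag) has no special meaning: it is copied as-is.
literal = re.sub(pattern, rb'(?P=myFlag)', text)

# \g<myFlag> is the template syntax that expands to the named group.
expanded = re.sub(pattern, rb'<\g<myFlag>>', text)

print(literal)
print(expanded)
```

The first call just writes the characters of the template into the
result; only the second one picks up what the group matched.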

> replPattern = pattern.replace(b'?P<id>', b'(?P=myFlag)', re.DOTALL)

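This .replace method is a plain bytes (or string) method. It has nothing
to do with regex. Its third argument is a maximum replacement count, so
re.DOTALL (the integer 16) is being taken as a count, not as a flag.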
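
For illustration only (throwaway data, hypothetical variable names), this
is what that extra argument does on a bytes object:

```python
import re

data = b'a-a-a'

# The optional third argument of bytes.replace is a maximum count.
once = data.replace(b'a', b'b', 1)

# re.DOTALL has the integer value 16, so here it only means
# "replace at most 16 occurrences".
flagged = data.replace(b'a', b'b', re.DOTALL)

print(once)
print(flagged)
```

So the call runs without complaint, but the flag has no regex meaning
there.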

> print(replPattern)
> contents = re.sub(pattern, replPattern, contents)

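You're not passing in the DOTALL flag here, so '.' won't match the
newlines in 'contents' and the pattern finds nothing to replace. re.sub
does accept flags, but only through its 'flags' keyword argument (the
4th positional argument is 'count').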

You could compile the regex and then use its .sub method, or use inline
flags instead.
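
Here is one way it could look, as a sketch rather than the only approach.
It assumes you are free to wrap everything before 'id' in an extra group,
which I've called 'head' (that name is my invention), so the replacement
can put the matched text back and slip 'myFlag' in before 'id':

```python
import re

contents = b'''
xdlg::xdlg(x_app* pApp, CWnd* pParent)
     : customized_dlg((UINT)0, pParent, pApp)
     , m_pReaderApp(pApp)
     , m_info(pApp)
{

}
'''

# 'head' is a hypothetical extra group capturing everything up to 'id',
# so that the replacement can reproduce it unchanged.
body = (rb'(?P<head>(?P<myFlag>[a-zA-Z0-9_]+)::(?P=myFlag).+:.+)'
        rb'(?P<id>\(UINT\)0 *,)')

# Rebuild the match with a copy of 'myFlag' inserted just before 'id'.
repl = rb'\g<head>\g<myFlag>\g<id>'

# Option 1: compile with the flag, then use the compiled object's .sub.
compiled = re.compile(body, re.DOTALL)
result_compiled = compiled.sub(repl, contents)

# Option 2: keep re.sub and turn DOTALL on inline with (?s).
result_inline = re.sub(rb'(?s)' + body, repl, contents)

print(result_compiled.decode())
```

Both variants should give the same bytes, with the second line reading
"customized_dlg(xdlg(UINT)0, pParent, pApp)".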

> print(contents)
> # code end
>
> The expected results should be:
>
> xdlg::xdlg(x_app* pApp, CWnd* pParent)
>      : customized_dlg(xdlg(UINT)0, pParent, pApp)
>      , m_pReaderApp(pApp)
>      , m_info(pApp)
> {
>
> }
>
> but now the result is the same as the original:
>
>   xdlg::xdlg(x_app* pApp, CWnd* pParent)
>      : customized_dlg((UINT)0, pParent, pApp)
>      , m_pReaderApp(pApp)
>      , m_info(pApp)
> {
>
> }
>



