Regex Group case change

Cameron Simpson cs at cskk.id.au
Thu Oct 1 17:52:35 EDT 2020


On 01Oct2020 12:41, Raju <ch.nagaraju008 at gmail.com> wrote:
>I want to change the case on input string i was able to match using 
>python regex but couldn't find the way to change the case.
>
>For example string:
>Input: 7Section Hello Jim
>output: 7Section hello Jim
>
>I was doing if statment with regex
>
>if re.match("(\d+\w* )(Hello)( \w+)",string)):
>   print(r"(\d+\w* )(Hello)( \w+)","\1\2.lower()\3",string)
>
>Output was
>7Section \2.lower() Jim
>Above one is one of the regex i have in function, i have total 6 regex 
>patterns and i want to keep all in this if elif else statment. It is 
>matching, but can someone advise how to replace Hello to hello?

Please paste the _exact_ code you're using to produce the problem. I do 
not believe the code above generates the output you show. I imagine 
there's some kind of regexp replacement call in the real code.

There are a few things going on in the code above which will cause 
trouble:

Be consistent using "raw strings", which look like r".......". Normal 
Python strings recognise a variety of backslash escaped things, like \n 
for a newline character. The purpose of a raw string is to disable that, 
which is important with regular expressions because they also use 
backslash escapes such as \d for a digit. Try to _always_ use raw 
strings when working with regular expressions.
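
As a small illustration of the difference (a sketch only, the variable 
names are just for this example), compare how the same escapes are 
stored in a normal string and in a raw string:

```python
# "\n" in a normal string is a single newline character;
# in a raw string it stays as a backslash followed by "n".
normal = "\n"
raw = r"\n"
print(len(normal), len(raw))

# The same applies to group references used in replacements:
# "\1" is the single character chr(1), r"\1" is a backslash plus "1".
normal_ref = "\1"
raw_ref = r"\1"
print(normal_ref == chr(1), raw_ref == "\\1")
```

Only the raw forms reach the re module with the backslash intact.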

Your print() call _looks_ like it should be printing the result of a 
regexp substitution call such as re.sub(), based on the 
"\1\2.lower()\3" in the second field. The replacement string syntax 
does not support embedding str methods in the result, so the .lower() 
will just be written out directly. To do more complicated things you 
need to pull out the matched groups and work with them separately, then 
assemble your desired result.
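
To see this for yourself, here is a minimal sketch using re.sub() with 
your pattern and a raw replacement string. I am assuming your real code 
does something along these lines; the names "pattern", "text" and 
"result" are mine:

```python
import re

pattern = r"(\d+\w* )(Hello)( \w+)"
text = "7Section Hello Jim"

# The replacement template only understands group references such as \1.
# ".lower()" is ordinary text to it, so it is copied through unchanged.
result = re.sub(pattern, r"\1\2.lower()\3", text)
print(result)
```

The ".lower()" turns up in the result as literal text; nothing gets 
lowercased.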

You do not keep the result of the re.match call here:

    if re.match("(\d+\w* )(Hello)( \w+)",string)):

Traditionally one would write:

    m = re.match(r"(\d+\w* )(Hello)( \w+)", string)
    if m:

and in recent Python (3.8+) you can write:

    if m := re.match(r"(\d+\w* )(Hello)( \w+)", string):

This preserves the result of the match in the variable "m", which you 
will require if you want to do any work with the result, such as 
lowercasing something.
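
A short sketch of that pattern, run against one line which matches and 
one which does not (both sample lines are made up for the example). 
re.match() returns None when the pattern does not match at the start of 
the string, which is why the "if m:" test works:

```python
import re

pattern = r"(\d+\w* )(Hello)( \w+)"

for line in ["7Section Hello Jim", "no match here"]:
    # Keep the match object so that it can be used inside the branch.
    m = re.match(pattern, line)
    if m:
        # group(0) is the whole of the matched text.
        print("matched:", m.group(0))
    else:
        print("no match:", line)
```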

The matched components of the regexp are available via the .group() 
method of the match result. So:

    m.group(1) == "7Section "
    m.group(2) == "Hello"

and to print "Hello" lowercased you might write:

    m.group(2).lower()
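
If it helps to see what you have to work with, this sketch just 
inspects the match; the names "groups" and "lowered" are mine. Note 
that each group includes any space which sits inside its parentheses in 
the pattern, and that .lower() returns a new string rather than 
changing the match:

```python
import re

m = re.match(r"(\d+\w* )(Hello)( \w+)", "7Section Hello Jim")
assert m is not None

groups = m.groups()            # every group at once, as a tuple
lowered = m.group(2).lower()   # a new str; the match itself is unchanged

print(groups)
print(lowered)
```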

Since this looks much like homework we will leave it to you to apply 
this approach to your existing code.

Cheers,
Cameron Simpson <cs at cskk.id.au>

