Regular Expression

MRAB python at mrabarnett.plus.com
Sun Apr 12 21:06:13 EDT 2015


On 2015-04-13 01:55, Pippo wrote:
> On Sunday, 12 April 2015 20:46:19 UTC-4, Cameron Simpson  wrote:
>> >> >> > >      constraint = re.compile(r'(#C\[\w*\]'))
>> >> >> > >      result = constraint.search(content[j],re.MULTILINE)
>> >> >> > >      text.append(result)
>> >> >> > >      print(text)
>> [...]
>> >> >> result is empty, although it should have content.
>> [...]
>> >> > I fixed the syntax error but the result shows:
>> >> > [None]
>> >> > [None, None]
>> [...]
>>
>> Note that "None" is not "empty", though in this case it more or less means
>> what you think.
>>
>> You're getting None because the regexp fails to match.
>>
>> >> Try printing each string you're trying to match using 'repr', i.e.:
>> >>      print(repr(content[j]))
>> >>
>> >> Do any look like they should match?
>> > print(repr(content[j])) gives me the following:
>> >
>> >[None]
>> >'#D{#C[Health] #P[Information] - \n'
>> [...]
>> >shouldn't it match "#C[Health]" in the first row?
>>
>> It looks like it should, unless you have mangled your regular expression. You
>> mentioned earlier that you fixed the syntax error, but you never actually
>> recited the line after fixing it. Please cut/paste the _exact_ line where you
>> compile the regexp as it is now.  Superficially I would expect your regexp to
>> work, but we would like to see it as it currently is.
>>
>> Also note that you can print the regexp's .pattern attribute:
>>
>>   print(constraint.pattern)
>>
>> as a check that what was compiled is what you intended to compile.
>>
>> >If not, what is the best way to fetch these items in an array?
>>
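>> What you've got is ok. I would point out that as you're processing each line on
>> its own you should not need "re.MULTILINE" at all. Note that you passed it to
>> .search(), not .compile(); the second argument of a compiled pattern's
>> .search() is a start position, not flags, so the search began partway into
>> the string.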
>>
>> Cheers,
>> Cameron Simpson <cs at zip.com.au>
>>
>> The upside of PHP is that it lets non-programmers create complex
>> applications. The downside of PHP is that it lets non-programmers create
>> complex applications. - Elliot Lee
>
> This is the complete code:
>
> import re
> import tkinter.filedialog
> import readfile
>
>
>
> j = 0
>
> text = []
>
>
> #content = "#C[Health] #P[Information]"
>
> content = readfile.pattread()
>
> while j < len(content):
>
>      constraint = re.compile(r'(#C\[\w*\])')
>      result = constraint.search(content[j])
>      text.append(result)
>      print(constraint.pattern)
>      print(text)
>      print(repr(content[j]))
>      j = j+1
>
> This is the result I get:
> (#C\[\w*\])
> [<_sre.SRE_Match object at 0x10292ee40>]
> '#D{#C[Health] #P[Information] - \n'
> (#C\[\w*\])
> [<_sre.SRE_Match object at 0x10292ee40>, None]
> 'means any information, including #ST[genetic information], \n'
> (#C\[\w*\])
> [<_sre.SRE_Match object at 0x10292ee40>, None, None]
> 'whether #C[oral | (recorded in (any form | medium))], that \n'
> (#C\[\w*\])
> [<_sre.SRE_Match object at 0x10292ee40>, None, None, None]
> '(1)#C[Is created or received by] a \n'
> (#C\[\w*\])
> [<_sre.SRE_Match object at 0x10292ee40>, None, None, None, None]
> '#A[health care provider | health plan | public health authority | employer | life insurer | school | university | or health care clearinghouse];  \n'
> (#C\[\w*\])
> [<_sre.SRE_Match object at 0x10292ee40>, None, None, None, None, None]
> '(2)#C[Relates to] #C[the past, present, or future physical | mental health | condition of an individual] | \n'
> (#C\[\w*\])
> [<_sre.SRE_Match object at 0x10292ee40>, None, None, None, None, None, None]
> '#C[the provision of health care to an individual] | \n'
> (#C\[\w*\])
> [<_sre.SRE_Match object at 0x10292ee40>, None, None, None, None, None, None, None]
> '#C[the past, present, or future payment for the provision of health care to an individual].}\n'
>
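A note on the None entries above: \w* matches only letters, digits and
underscores, so any #C[...] tag whose contents include spaces or "|" (for
example #C[Relates to]) cannot match. Here is a minimal sketch, assuming you
want every #C[...] tag on a line whatever it contains. The [^\]]* character
class and the use of findall are my suggestions, not part of your code, and
the two sample lines are copied from your output:

```python
import re

# Assumed variant of the pattern: [^\]]* accepts any characters up to the
# closing bracket, where \w* accepts only letters, digits and underscores.
constraint = re.compile(r'#C\[[^\]]*\]')

lines = [
    '#D{#C[Health] #P[Information] - \n',
    '(2)#C[Relates to] #C[the past, present, or future physical | mental health | condition of an individual] | \n',
]

# With no capturing group in the pattern, findall returns the matched
# text itself, one list of strings per input line.
tags = [constraint.findall(line) for line in lines]
print(tags)
```
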
The search returns a match object, or None when nothing matches. If you
want the text that it found, call the match object's .group() method.
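
For example, a minimal sketch using your pattern unchanged. The two sample
lines are copied from your output and stand in for whatever
readfile.pattread() returns; the "is not None" check is my addition, so
that lines without a match are skipped and not stored as None:

```python
import re

# Compile once, outside the loop; the pattern is the one from the post.
constraint = re.compile(r'(#C\[\w*\])')

# Stand-in for readfile.pattread(): two lines taken from the posted output.
content = [
    '#D{#C[Health] #P[Information] - \n',
    'means any information, including #ST[genetic information], \n',
]

text = []
for line in content:
    result = constraint.search(line)
    # search() returns None when nothing matches, so check before
    # calling .group() on the result.
    if result is not None:
        text.append(result.group(1))

print(text)
```

Here result.group(1) is the text captured by the parenthesised group;
result.group() with no argument would give the whole match, which is the same
string for this pattern.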



