Accent character problem
Mark Tolonen
M8R-yfto6h at mailinator.com
Fri Jun 20 10:32:39 EDT 2008
"Sallu" <praveen.sunsetpoint at gmail.com> wrote in message
news:faf4b0f2-1492-48bf-91d5-f6e0d19619ed at t12g2000prg.googlegroups.com...
> Hi all and one
> i wrote this script, working fine without fail( just run it)
>
> import re
> value='This is Praveen'
> print value
> #value = 'riché gerry'
> #words=str(value.split()).strip('[]').replace(', ', '')
> # (here i tried to convert in to list and then back to string)
> #print words
> richre=re.compile(r'[a-zA-Z0-9]')
>
> if(richre.match(value)):
>     print "Valid"
> else:
>     print "Not allowed special characters"
>
> Output 1: (Fair)
> This is Praveen
> Valid
> but when i change the value like
> value='éhis is Praveen'
> then
>
> Output 2:(Fair)
> éhis is Praveen
> Not allowed special characters
> (because i wanted to check out the accent(é) character so its working
> fine no issue)
>
> but when i give accent(é) character in middle like
> value='This és Praveen'
>
> Output 3:(not fair)
>
> This és Praveen
> Valid
> even though it has an accent character it should display message "Not allowed
> special characters"
> Please help me out.
> Thanks
The match function only matches the pattern at the start of a string. Use
search instead. Printing the result of a successful match will also help you
debug the problem:
matchobj = richre.match(value)
if matchobj:
    print matchobj.group()
else:
    print 'no match'
As written you will only get a successful match if your string starts with
a-zA-Z0-9, which is why #1 and #3 print 'Valid'.
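To make the difference concrete, here is a minimal sketch comparing match and search on the strings from the question. Note that search with the same pattern succeeds as soon as any allowed character appears anywhere, so one way to reject a string is to search for a character outside the allowed set. The name badre, the helper has_disallowed, and the decision to allow spaces are assumptions for illustration, not part of the original script. The u'' literals are used so the snippet behaves the same on Python 2 and on Python 3.3+.

```python
# -*- coding: utf-8 -*-
import re

# Pattern from the original script: a single ASCII letter or digit.
richre = re.compile(r'[a-zA-Z0-9]')

# match only looks at the start of the string.
print(richre.match(u'This és Praveen').group())   # T
print(richre.match(u'éhis is Praveen'))           # None

# search scans the whole string, so it finds the first allowed character.
print(richre.search(u'éhis is Praveen').group())  # h

# Assumption: spaces are allowed, everything else outside a-zA-Z0-9 is not.
# The leading ^ inside the brackets negates the character class.
badre = re.compile(r'[^a-zA-Z0-9 ]')

def has_disallowed(value):
    """Return True if value contains any character outside the allowed set."""
    return badre.search(value) is not None

print(has_disallowed(u'This is Praveen'))  # False
print(has_disallowed(u'This és Praveen'))  # True
```

With this approach the check no longer depends on where in the string the accented character appears.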
You also should declare the encoding of the file and consider using Unicode
strings.
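As a rough illustration of the encoding point, the sketch below declares the file encoding on the first line and uses a Unicode string, so that é is treated as one character rather than as the bytes that encode it. It assumes the file is actually saved as UTF-8; the declaration must match whatever encoding your editor uses. The u prefix is required on Python 2 and accepted on Python 3.3+. The variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
# The line above declares the source encoding (assumed to be UTF-8 here).
import re

value = u'This és Praveen'        # Unicode string: é is a single character
encoded = value.encode('utf-8')   # byte string: é becomes two bytes

print(len(value))    # 15
print(len(encoded))  # 16

# On the Unicode string, the regex sees é as one character at index 5.
found = re.search(u'[^a-zA-Z0-9 ]', value)
print(found.start())  # 5
```

Working on the Unicode string keeps character positions and lengths meaningful, which is harder to guarantee when matching against encoded bytes.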
-Mark