Accent character problem

Mark Tolonen M8R-yfto6h at mailinator.com
Fri Jun 20 10:32:39 EDT 2008


"Sallu" <praveen.sunsetpoint at gmail.com> wrote in message 
news:faf4b0f2-1492-48bf-91d5-f6e0d19619ed at t12g2000prg.googlegroups.com...
> Hi all,
> I wrote this script, and it runs without errors (just run it):
>
> import re
> value='This is Praveen'
> print value
> #value = 'riché gerry'
> #words=str(value.split()).strip('[]').replace(', ', '')
> # (here I tried to convert it into a list and then back to a string)
> #print words
> richre=re.compile(r'[a-zA-Z0-9]')
>
> if(richre.match(value)):
>   print "Valid"
> else:
>   print "Not allowed special characters"
>
> Output 1: (Fair)
> This is Praveen
> Valid
> but when i change the value like
> value='éhis is Praveen'
> then
>
> Output 2:(Fair)
> éhis is Praveen
> Not allowed special characters
>  (I wanted to catch the accented (é) character, so this is working
> fine, no issue)
>
> but when I put the accented (é) character in the middle, like
> value='This és Praveen'
>
> Output 3:(not fair)
>
> This és Praveen
> Valid
> Even though it has an accented character, it prints "Valid". It should
> display the message "Not allowed special characters".
> Please help me out.
> Thanks

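The match function only matches the pattern at the start of a string, and 
your pattern matches just one character, so only the first character is ever 
checked.  Use search instead, with a pattern for the characters you want to 
reject (for example [^a-zA-Z0-9 ]).  Printing the result of a successful 
match will also help you debug the problem.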

    matchobj = richre.match(value)
    if matchobj:
        print matchobj.group()
    else:
        print 'no match'

As written you will only get a successful match if the first character of 
your string is in a-zA-Z0-9, which is why #1 and #3 print 'Valid'.
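
To make the difference concrete, here is a minimal sketch rather than a definitive fix.  It assumes spaces should be allowed, which the original pattern did not say, and the names disallowed and is_valid are made up for illustration.  It is written to run on Python 2.6+ and Python 3.3+.

```python
# -*- coding: utf-8 -*-
import re

# Same character class as in the original script: it matches ONE character.
richre = re.compile(r'[a-zA-Z0-9]')

value = u'This \xe9s Praveen'  # 'This és Praveen', written with an escape

# match() only tries the pattern at position 0, so it succeeds on the
# leading 'T' and never looks at the accented character further along.
print(richre.match(value).group())

# Assumption: ASCII letters, digits and spaces are the allowed characters.
# The negated class matches any single character outside that set.
disallowed = re.compile(r'[^a-zA-Z0-9 ]')


def is_valid(text):
    """Return True if no disallowed character occurs anywhere in text."""
    return disallowed.search(text) is None


for sample in (u'This is Praveen', u'\xe9his is Praveen', value):
    if is_valid(sample):
        print('Valid')
    else:
        print('Not allowed special characters')
```

Searching for a disallowed character scans the whole string, so the position of the accented character no longer matters.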

You also should declare the encoding of the file and consider using Unicode 
strings.
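
As a rough illustration of both suggestions, the sketch below assumes the file is saved as UTF-8.  The name non_ascii is hypothetical, and the u'' prefix is what makes the literal a Unicode string on Python 2 (on Python 3.3+ it is accepted and has no effect).

```python
# -*- coding: utf-8 -*-
# The line above declares the source encoding; Python only honours it on
# the first or second line of the file.
import re

# A Unicode string literal: the accented letter is one character here,
# not the multi-byte sequence a Python 2 byte string would hold.
value = u'This és Praveen'

# Any character outside the 7-bit ASCII range.
non_ascii = re.compile(u'[^\x00-\x7f]')

found = non_ascii.findall(value)
print(repr(found))
```

With a plain byte string on Python 2, the same pattern would see the individual bytes of the UTF-8 encoding rather than one accented character, which is one reason to prefer Unicode strings here.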

-Mark



