unicode text file

Junaid junu.pv at gmail.com
Sat Oct 3 05:40:54 EDT 2009


On Sep 27, 6:39 pm, "Mark Tolonen" <metolone+gm... at gmail.com> wrote:
> "Junaid" <junu... at gmail.com> wrote in message
>
> news:0267bef9-9548-4c43-bcdf-b624350c8f15 at p23g2000vbl.googlegroups.com...
>
> >I want to do replacements in a utf-8 text file. example
>
> > f=open("test.txt","r") #this file is utf-8 encoded
> > raw = f.read()
> > txt = raw.decode("utf-8")
>
> You can use the codecs module to open and decode the file in one step.
>
> > txt.replace('English', ur'ഇംഗ്ലീഷ്') #replacing raw unicode string,
> > but not working
>
> The replace method returns the altered string.  It does not modify it in
> place.  You also should use Unicode strings for both the arguments (although
> it doesn't matter in this case).  Using a raw Unicode string is also
> unnecessary in this case.
>
>     txt = txt.replace(u'English', u'ഇംഗ്ലീഷ്')
>
> > f.write(txt)
>
> You opened the file for reading.  You'll need to close the file and reopen
> it for writing.
>
> > f.close()
> > f.flush()
>
> Flush isn't required.  close() will flush.
>
> Also, to have text like ഇംഗ്ലീഷ് in a Python source file you'll need to
> declare the encoding of that source file at the top and be sure to actually
> save the file in that encoding.
>
> In summary:
>
>     # coding: utf-8
>     import codecs
>     f = codecs.open('test.txt','r','utf-8')
>     txt = f.read()
>     txt = txt.replace(u'English', u'ഇംഗ്ലീഷ്')
>     f.close()
>     f = codecs.open('test.txt','w','utf-8')
>     f.write(txt)
>     f.close()
>
> -Mark

Thanks, everyone, for replying.

I did as Mark suggested, and it worked :)
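
For anyone finding this thread later, here is a minimal sketch of the
read, replace, write round trip from Mark's summary. It assumes io.open
(available since Python 2.6, and the same function as the built-in open
in Python 3) in place of codecs.open. The helper name replace_in_file and
its parameters are my own and do not come from the thread.

```python
import io


def replace_in_file(path, old, new, encoding="utf-8"):
    """Replace every occurrence of `old` with `new` in the text file at `path`.

    Hypothetical helper for illustration. `old` and `new` must be Unicode
    strings (u'...' literals under Python 2). Returns the number of
    replacements made.
    """
    # Open for reading and decode in one step.
    with io.open(path, "r", encoding=encoding) as f:
        txt = f.read()

    count = txt.count(old)

    # replace() returns a new string; it does not modify txt in place,
    # so the result has to be assigned back.
    txt = txt.replace(old, new)

    # Reopen for writing. Leaving the with-block closes the file, and
    # closing flushes, so no explicit flush() is needed.
    with io.open(path, "w", encoding=encoding) as f:
        f.write(txt)

    return count
```

Calling replace_in_file('test.txt', u'English', u'ഇംഗ്ലീഷ്') should rewrite
the file in place. Under Python 2, a script containing that literal still
needs the coding line at the top and has to be saved as UTF-8.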

Thanks once more.


