print UTF-8 file with BOM
Kevin Yuan
farproc at gmail.com
Fri Dec 23 09:47:30 EST 2005
Sorry, I'm a newbie in Python. I can't help you further; indeed, I don't know
either. :)
2005/12/23, David Xiao <davihigh at gmail.com>:
>
> Hi Kuan:
>
> Thanks a lot! One more question: how should I write this if I want to
> specify a locale other than the current one?
>
> For example, running on a Korean-locale system and trying to read a UTF-8
> file that contains Chinese text.
>
> Regards, David
>
> 2005/12/23, Kevin Yuan <farproc at gmail.com>:
> > import codecs
> >
> > def read_utf8_txt_file(filename):
> >     fileObj = codecs.open(filename, "r", "utf-8")
> >     content = fileObj.read()
> >     fileObj.close()
> >     # Drop the BOM only when the file actually starts with one
> >     if content.startswith(u"\ufeff"):
> >         content = content[1:]
> >     print(content)
> >
> > read_utf8_txt_file("e:\\u.txt")
> >
> > 22 Dec 2005 18:12:28 -0800, davihigh at gmail.com < davihigh at gmail.com>:
> > > Hi Friends:
> > >
> > > fileObj = codecs.open( filename, "r", "utf-8" )
> > > u = fileObj.read() # Returns a Unicode string from the UTF-8 bytes in the file
> > > print u
> > >
> > > It fails with this error:
> > > UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in position 0: illegal multibyte sequence
> > >
> > > I want to know how to read from a UTF-8 file, convert the text to a
> > > specified locale (by default the current system locale), and print the
> > > resulting string. I would like the BOM header to be stripped automatically.
> > >
> > > Rgds, David
> > >
> > > --
> > > http://mail.python.org/mailman/listinfo/python-list
> > >
> >
> >
>
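
The two questions raised in this thread (dropping the BOM automatically, and
printing in an encoding other than the current locale's) can be sketched as
follows. This is only a minimal sketch, assuming Python 3 and the "utf-8-sig"
codec, which is available in Python 2.5 and later. The function names
read_utf8_text and encode_for_output are hypothetical, and "cp949" in the
usage comment is just an assumed example of a Korean encoding.

```python
import locale


def read_utf8_text(filename):
    """Read a UTF-8 text file, dropping a leading BOM if one is present."""
    # "utf-8-sig" decodes UTF-8 and skips an initial U+FEFF when it exists;
    # files without a BOM are read unchanged.
    with open(filename, "r", encoding="utf-8-sig") as file_obj:
        return file_obj.read()


def encode_for_output(text, encoding=None):
    """Encode text to bytes in the given encoding (default: locale's)."""
    if encoding is None:
        # Preferred encoding of the current locale, without changing it
        encoding = locale.getpreferredencoding(False)
    # errors="replace" avoids UnicodeEncodeError for characters that the
    # target encoding cannot represent; they are substituted instead.
    return text.encode(encoding, errors="replace")


# Example usage (hypothetical path and target encoding):
#   import sys
#   data = encode_for_output(read_utf8_text("u.txt"), "cp949")
#   sys.stdout.buffer.write(data)
```

Writing the encoded bytes to sys.stdout.buffer bypasses the console's own
encoding step, which is where the original 'gbk' error was raised. Whether
the terminal then displays those bytes correctly still depends on its
configuration.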