print UTF-8 file with BOM

Kevin Yuan farproc at gmail.com
Fri Dec 23 00:33:33 EST 2005


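import codecs

def read_utf8_txt_file(filename):
    # Decode the file as UTF-8; read() returns a unicode string.
    fileObj = codecs.open(filename, "r", "utf-8")
    try:
        content = fileObj.read()
    finally:
        fileObj.close()
    # Drop the BOM (U+FEFF) only if the file actually starts with one.
    if content.startswith(u"\ufeff"):
        content = content[1:]
    print content

read_utf8_txt_file("e:\\u.txt")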
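
A variation on the function above, as a minimal sketch only. It assumes a newer Python than the one used in this thread (the "utf-8-sig" codec arrived in Python 2.5, and io.open needs 2.6 or Python 3). The names read_utf8_text and encode_for_output are made up for illustration. The first skips a leading BOM while decoding; the second encodes the text for a chosen encoding (the locale default if none is given) and substitutes "?" for characters that encoding cannot represent, so printing should not fail on them.

```python
import io
import locale


def read_utf8_text(filename):
    """Return the file's text, skipping a leading UTF-8 BOM if one is present."""
    # "utf-8-sig" decodes UTF-8 and drops a BOM at the start of the file;
    # files without a BOM are read unchanged.
    with io.open(filename, "r", encoding="utf-8-sig") as file_obj:
        return file_obj.read()


def encode_for_output(text, encoding=None):
    """Encode text as bytes, replacing unencodable characters with '?'."""
    if encoding is None:
        # Fall back to the locale's preferred encoding
        # (for example 'cp936' on a GBK Windows system).
        encoding = locale.getpreferredencoding()
    return text.encode(encoding, "replace")
```

Something like encode_for_output(read_utf8_text("e:\\u.txt"), "gbk") would then give bytes that a GBK console can display.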

22 Dec 2005 18:12:28 -0800, davihigh at gmail.com <davihigh at gmail.com>:
>
> Hi Friends:
>
>         fileObj = codecs.open( filename, "r", "utf-8" )
>         u = fileObj.read() # Returns a Unicode string from the UTF-8 bytes in the file
>         print u
>
> It fails with this error:
>         UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in position 0: illegal multibyte sequence
>
> I want to know how to read from a UTF-8 file, convert the text to a
> specified locale encoding (by default the current system locale), and
> print the string. I would also like the BOM header to be stripped
> automatically.
>
> Rgds, David