read Unicode characters one by one in python2

Chris Angelico rosuav at gmail.com
Sun Feb 25 09:50:16 EST 2018


On Mon, Feb 26, 2018 at 12:33 AM, Chris Warrick <kwpolska at gmail.com> wrote:
> On 24 February 2018 at 17:17, Peng Yu <pengyu.ut at gmail.com> wrote:
>> Here shows some code for reading Unicode characters one by one in
>> python2. Is it the best code for reading Unicode characters one by one
>> in python2?
>>
>> https://rosettacode.org/wiki/Read_a_file_character_by_character/UTF8#Python
>
> No, it’s terrible. So is the Python 3 version. All you need for both
> Pythons is this:
>
> import io
> with io.open('input.txt', 'r', encoding='utf-8') as fh:
>     for character in fh:
>         print(character)

If you actually need character-by-character iteration, use "for
character in fh.read()" rather than iterating over the file itself.
Iterating over a file yields lines, not characters.
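
To make that concrete, here's a rough sketch of two ways to do it. The
function names (chars_via_read, iter_chars) are just ones I made up for
illustration. The first reads the whole file into memory, which is fine
for small files; the second pulls one decoded character at a time with
fh.read(1), which should suit larger files. I'm assuming a UTF-8 file
opened in text mode, where read(1) returns one decoded character and an
empty string at end of file.

```python
import io


def chars_via_read(path, encoding='utf-8'):
    """Return all characters as a list, reading the whole file at once."""
    with io.open(path, 'r', encoding=encoding) as fh:
        # fh.read() returns one decoded text string; iterating over a
        # string yields characters, not lines.
        return [character for character in fh.read()]


def iter_chars(path, encoding='utf-8'):
    """Yield characters one at a time without loading the whole file."""
    with io.open(path, 'r', encoding=encoding) as fh:
        # iter(callable, sentinel) keeps calling fh.read(1) until it
        # returns the empty string, which signals end of file.
        for character in iter(lambda: fh.read(1), ''):
            yield character
```

Usage would look like "for character in iter_chars('input.txt'):
print(character)", with 'input.txt' standing in for whatever file you
have.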

(BTW, if you know for sure that you're running in Python 3, "io.open"
can be shortened to just "open". They're the same function in Py3.)

ChrisA


