read Unicode characters one by one in python2

Sun Feb 25 09:50:16 EST 2018

On Mon, Feb 26, 2018 at 12:33 AM, Chris Warrick <kwpolska at gmail.com> wrote:
> On 24 February 2018 at 17:17, Peng Yu <pengyu.ut at gmail.com> wrote:
>> Here shows some code for reading Unicode characters one by one in
>> python2. Is it the best code for reading Unicode characters one by one
>> in python2?
>>
>> https://rosettacode.org/wiki/Read_a_file_character_by_character/UTF8#Python
>
> No, it’s terrible. So is the Python 3 version. All you need for both
> Pythons is this:
>
> import io
> with io.open('input.txt', 'r', encoding='utf-8') as fh:
>     for character in fh:
>         print(character)

If you actually need to go character by character, you'd need "for
character in fh.read()" rather than iterating over the file itself.
Iterating over a file yields lines, not characters, so the loop above
prints one line at a time.
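
Note that fh.read() pulls the whole file into memory first. For a large
file, a rough sketch of a streaming alternative is to call read(1) in a
loop, since in text mode read(1) returns one decoded character rather
than one byte. The helper name iter_chars is made up for illustration,
and "character" here means whatever the text layer hands back for
read(1), not a user-perceived grapheme:

```python
import io


def iter_chars(path, encoding='utf-8'):
    """Yield the decoded characters of a text file one at a time.

    iter_chars is a hypothetical helper name, not part of any library.
    """
    # io.open behaves the same on Python 2 and Python 3 and returns
    # decoded text (unicode on Python 2, str on Python 3).
    with io.open(path, 'r', encoding=encoding) as fh:
        while True:
            character = fh.read(1)  # one decoded character, not one byte
            if not character:       # empty string means end of file
                break
            yield character
```

Usage would look like "for character in iter_chars('input.txt'):
print(character)", where 'input.txt' is the same placeholder file name
as in the quoted example.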

(BTW, if you know for sure that you're running in Python 3, "io.open"
can be shortened to just "open". They're the same function in Py3.)

ChrisA