tail

Cameron Simpson cs at cskk.id.au
Sun May 1 21:52:53 EDT 2022


On 01May2022 23:30, Stefan Ram <ram at zedat.fu-berlin.de> wrote:
>Dan Stromberg <drsalists at gmail.com> writes:
>>But what about Unicode?  Are all 10 bytes newlines in Unicode encodings?
>  It seems in UTF-8, when a value is above U+007F, it will be
>  encoded with bytes that always have their high bit set.

Aye. It is a design feature, enabling easy resync-to-char-boundary at an 
arbitrary point in the file.
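
To illustrate the resync property, here is a minimal sketch. The helper name `resync` and the sample data are made up for this example; it only relies on the fact that UTF-8 continuation bytes always have the form 0b10xxxxxx, so a byte with value 10 can only ever be a real NL.

```python
def resync(buf: bytes, pos: int) -> int:
    """Return the first index >= pos that begins a UTF-8 character.

    Hypothetical helper: it skips continuation bytes (0b10xxxxxx),
    which can never start a character.
    """
    while pos < len(buf) and (buf[pos] & 0xC0) == 0x80:
        pos += 1
    return pos


# Sample data (an assumption for illustration): two NL-terminated lines.
data = "naïve café\nÜber\n".encode("utf-8")

# Index 3 lands in the middle of the two-byte encoding of "ï".
start = resync(data, 3)
tail_text = data[start:].decode("utf-8")

# Every byte of a non-ASCII character has its high bit set,
# so none of them can be mistaken for NL (byte value 10).
high_bits = all(b >= 0x80 for b in "ïéÜ".encode("utf-8"))
```

So a tail-like tool can seek to an arbitrary byte offset, search backwards for byte 10, and decode from there without ever splitting a character.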

>  But Unicode has NEL "Next Line" U+0085 and other values that
>  conforming applications should recognize as line terminators.

I disagree. Maybe for printing things. But textual data records? I would 
hope to end them with NL, and only NL (code 10).
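
As a small sketch of why this matters in Python (the sample string is invented for illustration): `str.splitlines()` does treat NEL as a line boundary, while splitting on "\n", or calling `splitlines()` on the raw bytes, does not.

```python
# Sample record data (an assumption for illustration): the first record
# contains a NEL (U+0085) in the middle of its text.
text = "alpha\x85beta\ngamma\n"

# str.splitlines() recognises NEL and other Unicode line boundaries,
# so the first record is broken in two.
unicode_lines = text.splitlines()

# Splitting on NL only keeps the record intact; the trailing empty
# string comes from the final NL.
nl_records = text.split("\n")

# bytes.splitlines() only splits on \n, \r and \r\n, so the UTF-8
# encoding of NEL (0xC2 0x85) is left alone.
byte_lines = text.encode("utf-8").splitlines()
```

For NL-only records, splitting on "\n" or working on the bytes keeps a stray NEL inside its record.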

Cheers,
Cameron Simpson <cs at cskk.id.au>

