tail

Barry barry at barrys-emacs.org
Mon May 9 17:07:22 EDT 2022



> On 9 May 2022, at 20:14, Marco Sulla <Marco.Sulla.Python at gmail.com> wrote:
> 
> On Mon, 9 May 2022 at 19:53, Chris Angelico <rosuav at gmail.com> wrote:
>> 
>>> On Tue, 10 May 2022 at 03:47, Marco Sulla <Marco.Sulla.Python at gmail.com> wrote:
>>> 
>>> On Mon, 9 May 2022 at 07:56, Cameron Simpson <cs at cskk.id.au> wrote:
>>>> 
>>>> The point here is that text is a very different thing. Because you
>>>> cannot seek to an absolute number of characters in an encoding with
>>>> variable sized characters. _If_ you did a seek to an arbitrary number
>>>> you can end up in the middle of some character. And there are encodings
>>>> where you cannot inspect the data to find a character boundary in the
>>>> byte stream.
>>> 
>>> Ooook, now I understand what you and Barry mean. I suppose there's no
>>> reliable way to tail a big file opened in text mode with a decent performance.
>>> 
>>> Anyway, the previous-previous function I posted worked only for files
>>> opened in binary mode, and I suppose it's reliable, since it searches
>>> only for b"\n", as readline() in binary mode does.
>> 
>> It's still fundamentally impossible to solve this in a general way, so
>> the best way to do things will always be to code for *your* specific
>> use-case. That means that this doesn't belong in the stdlib or core
>> language, but in your own toolkit.
> 
> Nevertheless, tail is a fundamental tool in *nix. It's fast and
> reliable. Is the tail command also unable to handle different encodings?

POSIX tail just writes the bytes it finds between \n bytes to the output.
At no time does it need to care about encodings, as that is a problem solved
by the terminal software. I would not expect UTF-16 to work with tail on
Linux systems.
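
As a rough illustration of that byte-oriented approach, here is a minimal
sketch in Python. It is not the actual implementation of tail; the name
tail_bytes, the block_size parameter and the return type are assumptions
made for this example only. It opens the file in binary mode, reads blocks
backwards from the end, and splits on b"\n" without decoding anything.

```python
import os


def tail_bytes(path, n=10, block_size=4096):
    """Return the last n lines of a file as bytes objects, without the b"\\n".

    Hypothetical helper for illustration: it only looks for the byte 0x0A
    and never decodes the data, so it is encoding-agnostic in the same
    limited sense described above.
    """
    if n <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        # Read blocks backwards until more than n newlines have been seen
        # (the extra one proves the first kept line is complete) or the
        # start of the file is reached.
        while pos > 0 and data.count(b"\n") <= n:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    lines = data.split(b"\n")
    # A trailing newline produces an empty last element; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines[-n:]
```

With UTF-8 or any single-byte encoding the returned lines can be decoded
afterwards. With UTF-16 the split would land in the middle of a code unit,
because the newline is encoded as two bytes, which is why I would not
expect it to work there.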

You could always get the source of tail and read its implementation.

Barry
