tail

Sun May 1 18:18:39 EDT 2022

On 01May2022 18:55, Marco Sulla <Marco.Sulla.Python at gmail.com> wrote:
>Something like this is OK?
[...]
>def tail(f):
>    chunk_size = 100
>    size = os.stat(f.fileno()).st_size

I think you want os.fstat() here, since f.fileno() is a file descriptor.
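
For illustration, a minimal sketch of getting the size from an open file 
via its descriptor. The helper name file_size is made up for this example 
and is not part of the posted code:

```python
import os
import tempfile


def file_size(f):
    # os.fstat() takes a file descriptor, which is what f.fileno() returns.
    return os.fstat(f.fileno()).st_size


with tempfile.TemporaryFile() as f:
    f.write(b"one\ntwo\n")
    f.flush()  # make sure the buffered bytes have reached the file
    print(file_size(f))
```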

>    positions = iter(range(size, -1, -chunk_size))
>    next(positions)

I was wondering about the iter, but this makes sense. Alternatively you 
could put a range check in the for-loop.

>    chunk_line_pos = -1
>    pos = 0
>
>    for pos in positions:
>        f.seek(pos)
>        chars = f.read(chunk_size)
>        chunk_line_pos = chars.rfind(b"\n")
>
>        if chunk_line_pos != -1:
>            break

Normal text files _end_ in a newline, so I'd expect this to stop 
immediately at the end of the file.
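
To make the concern concrete, here is a small sketch on an in-memory chunk. 
The helper last_line_start is hypothetical, and skipping the terminating 
newline is one possible fix, not something the posted code does:

```python
def last_line_start(chunk):
    # Ignore a terminating newline so we find the start of the last line
    # rather than the end of it.
    end = len(chunk) - 1 if chunk.endswith(b"\n") else len(chunk)
    # rfind() returns -1 when there is no earlier newline, giving start 0.
    return chunk.rfind(b"\n", 0, end) + 1


chunk = b"first line\nlast line\n"

# A plain rfind() lands on the terminating newline itself.
print(chunk.rfind(b"\n") == len(chunk) - 1)

# Searching only before the terminating newline finds the last line.
print(chunk[last_line_start(chunk):])
```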

>    if chunk_line_pos == -1:
>        nbytes = pos
>        pos = 0
>        f.seek(pos)
>        chars = f.read(nbytes)
>        chunk_line_pos = chars.rfind(b"\n")

I presume this is because, unless you're very lucky, 0 will not be a 
position in the range(). I'd be inclined to avoid duplicating this code 
and the special case, and instead maybe make the range unbounded and do 
something like this:

    if pos < 0:
        pos = 0
    ... seek/read/etc ...
    if pos == 0:
        break

around the for-loop body.
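
Fleshed out, that might look something like the sketch below. It is one 
possible arrangement, not the posted code: it uses itertools.count() for 
the unbounded range, gets the size with seek() instead of a stat call so 
that it also works on io.BytesIO, and skips a terminating newline (my 
addition, per the earlier remark). It assumes f is opened in binary mode.

```python
import io
import itertools
import os


def tail(f, chunk_size=100):
    """Return the last line of the binary file f (a sketch)."""
    size = f.seek(0, os.SEEK_END)
    # Assumption: a terminating newline belongs to the last line, so
    # exclude it from the backwards search.
    end = size
    if size > 0:
        f.seek(size - 1)
        if f.read(1) == b"\n":
            end = size - 1
    chunk_end = end
    line_pos = 0
    # Unbounded range of positions; clamped to 0 inside the loop.
    for pos in itertools.count(end - chunk_size, -chunk_size):
        if pos < 0:
            pos = 0
        f.seek(pos)
        # Read only up to the start of the previously examined chunk.
        chunk = f.read(chunk_end - pos)
        nl = chunk.rfind(b"\n")
        if nl != -1:
            line_pos = pos + nl + 1
            break
        if pos == 0:
            break
        chunk_end = pos
    f.seek(line_pos)
    return f.readline()


print(tail(io.BytesIO(b"first\nlast\n"), chunk_size=4))
```

With the clamp and the pos == 0 check inside the loop, the separate 
"read from 0" block after the loop is no longer needed.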

>    if chunk_line_pos == -1:
>        line_pos = pos
>    else:
>        line_pos = pos + chunk_line_pos + 1
>    f.seek(line_pos)
>    return f.readline()
>
>This is simply for one line and for utf8.

And anything else where a newline is just an ASCII newline byte (10) and 
that byte can't occur as part of another character. So also ASCII and all 
the ISO8859-x single byte encodings. But as Chris has mentioned, not for 
other encodings.
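
A small sketch of why other encodings break a byte-level search for 
b"\n". The sample character U+010A is picked only because its UTF-16 
encoding happens to contain the byte 10:

```python
# U+010A (LATIN CAPITAL LETTER C WITH DOT ABOVE) followed by a newline.
text = "\u010a\n"

utf16 = text.encode("utf-16-le")
utf8 = text.encode("utf-8")

# In UTF-16-LE the byte 0x0A shows up twice: once inside U+010A and once
# as the low byte of the real newline.
print(utf16, utf16.count(b"\n"))

# In UTF-8 the byte 0x0A only ever means a newline.
print(utf8, utf8.count(b"\n"))
```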

Seems sane. I haven't tried to run it.

Cheers,
Cameron Simpson <cs at cskk.id.au>