tail

Marco Sulla Marco.Sulla.Python at gmail.com
Thu May 12 16:45:42 EDT 2022


Thank you very much. This helped me to improve the function:

import os

_lf = b"\n"
_err_n = "Parameter n must be a positive integer number"
_err_chunk_size = "Parameter chunk_size must be a positive integer number"

def tail(filepath, n=10, chunk_size=100):
    if (n <= 0):
        raise ValueError(_err_n)

    if (n % 1 != 0):
        raise ValueError(_err_n)

    if (chunk_size <= 0):
        raise ValueError(_err_chunk_size)

    if (chunk_size % 1 != 0):
        raise ValueError(_err_chunk_size)

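    # The checks above accept integral floats such as 10.0, but seek()
    # and read() need an int
    n_chunk_size = int(n) * int(chunk_size)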
    pos = os.stat(filepath).st_size
    chunk_line_pos = -1
    newlines_to_find = n
    first_step = True

    with open(filepath, "rb") as f:
        text = bytearray()

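        while pos != 0:
            # Read only the bytes that precede the previous chunk, so the
            # chunk at the start of the file never overlaps it
            read_size = min(n_chunk_size, pos)
            pos -= read_size

            f.seek(pos)
            chars = f.read(read_size)
            text[0:0] = chars
            # The chunk can be shorter than n_chunk_size, so search from
            # its real end; the trailing-newline check below relies on it
            search_pos = len(chars)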

            while search_pos != -1:
                chunk_line_pos = chars.rfind(_lf, 0, search_pos)

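                # A newline at the very end of the file terminates the last
                # line, so one more newline has to be found
                if first_step and chunk_line_pos == search_pos - 1:
                    newlines_to_find += 1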

                first_step = False

                if chunk_line_pos != -1:
                    newlines_to_find -= 1

                    if newlines_to_find == 0:
                        break

                search_pos = chunk_line_pos

            if newlines_to_find == 0:
                break

    return bytes(text[chunk_line_pos+1:])
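
A result like this is easy to get subtly wrong at the edges (small files, a missing final newline), so it may help to compare it against a naive reference on small inputs. The following is only a sketch under assumptions: naive_tail and write_temp are hypothetical helper names, not part of the function above, and naive_tail reads the file from the start, so it is meant for testing only.

```python
import os
import tempfile
from collections import deque


def naive_tail(filepath, n=10):
    """Reference implementation (hypothetical helper): last n lines as bytes."""
    with open(filepath, "rb") as f:
        # Iterating a binary file splits on b"\n" only; the bounded
        # deque discards all but the last n lines while reading.
        return b"".join(deque(f, maxlen=n))


def write_temp(data):
    """Write data to a temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


path = write_temp(b"one\ntwo\nthree\n")
print(naive_tail(path, 2))  # b'two\nthree\n'
os.remove(path)
```

Comparing tail(path, n) with naive_tail(path, n) for files with and without a final newline, and for files smaller than n * chunk_size, covers the cases most likely to differ.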



On Thu, 12 May 2022 at 20:29, Stefan Ram <ram at zedat.fu-berlin.de> wrote:

>   I am not aware of a definition of "line" above,
>   but the PLR says:
>
> |A physical line is a sequence of characters terminated
> |by an end-of-line sequence.
>
>   . So 10 lines should have 10 end-of-line sequences.
>

Maybe. Maybe not. What if the file ends with no newline?
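
To make the difference concrete, here is a small sketch of the two ways of counting. Both function names are made up for this illustration. The first follows the quoted definition strictly, the second also counts trailing bytes without a final newline as a line.

```python
def count_terminated_lines(data):
    """Lines in the strict sense: one per end-of-line sequence."""
    return data.count(b"\n")


def count_lines_lenient(data):
    """Also count trailing bytes that have no final newline as a line."""
    if not data:
        return 0
    return data.count(b"\n") + (0 if data.endswith(b"\n") else 1)


terminated = b"one\ntwo\nthree\n"
unterminated = b"one\ntwo\nthree"

print(count_terminated_lines(terminated), count_lines_lenient(terminated))      # 3 3
print(count_terminated_lines(unterminated), count_lines_lenient(unterminated))  # 2 3
```

With the strict definition, "three" in the second file would not be a line at all, and a tail of 1 line would have to return "two".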

