tail

Dennis Lee Bieber wlfraed at ix.netcom.com
Sat May 7 15:08:57 EDT 2022


On Sat, 7 May 2022 20:35:34 +0200, Marco Sulla
<Marco.Sulla.Python at gmail.com> declaimed the following:

>Well, ok, but I need a generic method to get LF and CR for any
>encoding an user can input.

	Other than EBCDIC, <lf> and <cr> AS BYTES should appear as x0A and x0D
in any of the 8-bit encodings (ASCII, ISO-8859-x, CPxxxx, UTF-8). I believe
those bytes also appear in UTF-16 -- BUT, they will have a null (x00) byte
associated with them as padding. As a result, you can not search for just
x0Dx0A (the Windows line-end convention); the sequence will be x00x0Dx00x0A
or x0Dx00x0Ax00 depending on endianness. cf:
https://docs.microsoft.com/en-us/cpp/text/support-for-unicode?view=msvc-170
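
	A quick way to check the byte patterns described above is to encode
<cr><lf> in a few codecs and look at the result. This is only a sketch: the
helper name crlf_bytes and the particular list of codecs are my own choices
for illustration. The explicit "utf-16-le"/"utf-16-be" codec names are used
because plain "utf-16" would prepend a byte-order mark.

```python
# Encode the Windows line ending in several codecs and show the raw bytes.
CRLF = "\r\n"


def crlf_bytes(encoding):
    """Return the byte sequence that encodes CR LF in the given codec.

    Hypothetical helper, for illustration only.
    """
    return CRLF.encode(encoding)


for name in ("ascii", "latin-1", "cp1252", "utf-8", "utf-16-le", "utf-16-be"):
    print(f"{name:10} {crlf_bytes(name).hex()}")

# A naive byte search for CR LF misses the line ending in UTF-16 data,
# because a x00 padding byte sits between x0D and x0A.
data = "a\r\nb".encode("utf-16-le")
print(b"\r\n" in data)                  # naive search
print(crlf_bytes("utf-16-le") in data)  # search for the encoded form
```

	The four 8-bit codecs all give 0d0a, while the two UTF-16 variants give
0d000a00 and 000d000a; the naive search reports False and the encoded
search reports True.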

	For EBCDIC, <cr> is still x0D, but <lf> is x25 (and there is a separate
<nl> [new line] at x15).
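
	As for a generic method: one possible approach, sketched here under
assumptions, is to let the codec itself produce the bytes for <cr> and <lf>
rather than hard-coding them. The helper name newline_bytes is hypothetical,
and cp037 is used only as one representative EBCDIC code page that Python
ships a codec for. Encodings that emit a byte-order mark (plain "utf-16",
"utf-32", "utf-8-sig") would need the endian-specific or BOM-less codec name
instead.

```python
def newline_bytes(encoding):
    """Return (cr, lf) as encoded by the given codec.

    Hypothetical helper, for illustration only; assumes the codec does
    not prepend a byte-order mark when encoding.
    """
    return "\r".encode(encoding), "\n".encode(encoding)


# EBCDIC (code page 037): <cr> stays x0D, <lf> becomes x25.
cr, lf = newline_bytes("cp037")
print(cr.hex(), lf.hex())

# The same helper works for the ASCII-compatible and UTF-16 cases.
for name in ("utf-8", "utf-16-le", "utf-16-be"):
    cr, lf = newline_bytes(name)
    print(f"{name:10} cr={cr.hex()} lf={lf.hex()}")
```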


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/

