Is there a Unicode EOF mark like DOS ASCII ctrl-z or Unix ctrl-d?

Piet van Oostrum piet at cs.uu.nl
Mon Sep 8 08:04:32 EDT 2003


>>>>> "Michael Geary" <Mike at DeleteThis.Geary.com> (MG) wrote:

MG> Martin v. Löwis wrote:
>> No, there is no need to have one (neither is there a need to have one
>> for plain ASCII files): The end-of-file is when the file ends. Most
>> operating systems support a notion of a "file size", and the file ends
>> when file-size bytes have been consumed.
>> 
>> Why Microsoft decided to use ctrl-z in text files is beyond me; it does
>> not fulfil any useful function...

MG> It came from CP/M, which believe it or not had *no* way to specify an exact
MG> file length. File lengths were measured in sectors, not bytes. So there had
MG> to be some way to tell where a text file ended, and CP/M used Ctrl+Z.

MG> MS-DOS picked up this convention, although if memory serves it always had
MG> exact file lengths even in version 1.0.

MG> Nobody uses Ctrl+Z in Windows/DOS text files any more, although I think the
MG> COPY command still respects it if you use the /A switch or concatenate
MG> files.

I believe even stdio respects it when a file is opened in text mode, at
least in the C runtimes on DOS and Windows. This is a common problem when
people read binary files without specifying the "b" modifier: apart from
the CR bytes being stripped from CR LF pairs, they are often surprised that
their programs stop reading early in the file, at the first Ctrl+Z byte
(0x1A). This even happens in Python.
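
A minimal sketch of the difference, assuming nothing beyond the standard
library. The helper legacy_text_view is hypothetical: it only emulates,
roughly, what a DOS-style text-mode read would hand back (stop at the first
Ctrl+Z, collapse CR LF to LF). It is not how any particular runtime
implements it. The "rb" read is the portable way to get every byte.

```python
import os
import tempfile

CTRL_Z = b"\x1a"  # the byte CP/M and DOS used as a text end marker


def legacy_text_view(data: bytes) -> bytes:
    """Hypothetical emulation of a DOS-style text-mode read."""
    head = data.split(CTRL_Z, 1)[0]       # everything before the first Ctrl+Z
    return head.replace(b"\r\n", b"\n")   # CR LF pairs collapsed to LF


# Sample "binary" data that happens to contain CR LF and a Ctrl+Z byte.
payload = b"abc\r\ndef\x1aghi\r\n"

fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:   # write in binary mode
        f.write(payload)
    with open(path, "rb") as f:      # read in binary mode: no translation
        raw = f.read()
finally:
    os.remove(path)

print(len(raw), "bytes read in binary mode")
print(len(legacy_text_view(raw)), "bytes in the emulated text-mode view")
```

With the "b" modifier the data comes back byte for byte; the emulated
text-mode view loses both the CR bytes and everything after the Ctrl+Z.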
-- 
Piet van Oostrum <piet at cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP]
Private email: P.van.Oostrum at hccnet.nl



