Reading from text files

Paul Watson pwatson at redlinec.com
Sun Feb 22 21:46:54 EST 2004


"Thomas Philips" <tkpmep at hotmail.com> wrote in message
news:b4a8ffb6.0402221213.7bc0c493 at posting.google.com...
> In the course of playing around with file input and output, I came
> across some behavior that is not quite intuitive. I created a simple
> text file, test.txt, which contains only 3 lines, and which I expect
> will have 5 characters (the digits 1, 2, and 3, and two newline
> characters, the first after 1 and the second after 2). Here it is in
> all its glory:
> 1
> 2
> 3
>
> However, when I read it using open() and then view it using
> >>> file.seek(0); file.read(); file.tell()
> I get:
> '1\n2\n3'
> 7L
>
> Python thinks there are 7 characters in the file! If I type
> >>> file.seek(1); file.read() OR >>> file.seek(2); file.read()
> I get
> '\n2\n3'
>
> but
> >>> file.seek(3); file.read()
> gives me what I expected to get with file.seek(2); file.read()
> '2\n3'
>
> It appears that Python sometimes counts each of the newline escape
> sequences as 2 separate characters and at other times as 1 indivisible
> character. What is the appropriate way to think about these
> characters?
>
> Thomas Philips

If you want to actually "see" what is in the file, do a directory listing
and dump the file in hex.

On DOS/Windows, run 'dir test.txt' and note the size of the file.  Then run
'debug test.txt'.  At the prompt, enter the 'r' command and press Enter.
Examine the CX register.  It holds the size of the file, shown in
hexadecimal.  Then use the 'd' command to dump the bytes so you can see
exactly what is in the file.

On UNIX/Linux, use 'ls -l test.txt' to see the directory listing, which
includes the size of the file.  Use something like 'od -Ax -x test.txt' to
see the contents of the file.  Note that -x prints two-byte words; if you
would rather see one byte at a time, try 'od -Ax -t x1 test.txt'.  If neither
produces output you like, use 'man od' to find the options with which you
are more comfortable.
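
The same check can be done from Python itself by opening the file in binary
mode, which returns the bytes with no newline translation.  What follows is
only a rough sketch, not the one true way to do it.  The hex_dump() helper is
a made-up name for illustration, and the script writes its own sample file
with CRLF line endings on the assumption (not confirmed in the original
post) that this is what test.txt contains on the poster's Windows machine.

```python
import os
import tempfile


def hex_dump(path, width=16):
    """Return 'offset  hex bytes  printable text' lines for the file at path.

    Hypothetical helper for illustration; it is not part of the standard
    library.
    """
    with open(path, "rb") as f:  # binary mode: bytes come back untranslated
        data = f.read()
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join("%02x" % b for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append("%06x  %s %s" % (offset, hex_part.ljust(width * 3), text_part))
    return lines


# Assumption: recreate the sample file with Windows-style CRLF line endings.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "wb") as f:
    f.write(b"1\r\n2\r\n3")

size = os.path.getsize(path)  # the size a directory listing would report
dump = hex_dump(path)

print(size)
print("\n".join(dump))
```

For this sample the size is 7, and the dump shows the bytes
31 0d 0a 32 0d 0a 33.  Each line ending is two bytes (0d 0a) on disk, even
though text mode hands it back as a single '\n'.  If the real test.txt looks
the same, that would account for the 7 reported by tell() and for seek(2)
landing in the middle of a line ending.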




