.split() Question

Steven D'Aprano steve+comp.lang.python at pearwood.info
Thu Aug 15 07:22:56 EDT 2013


On Thu, 15 Aug 2013 02:46:20 -0700, wxjmfauth wrote:

> A technical aspect of triple quoted strings is that the "end of lines"
> are not respected.
> 
>>>> import zzz
>>>> zzz.__doc__
> 'abc\ndef\n'


You are misinterpreting what you are seeing. You are not reading lines of 
text from a file. You are importing a module, and then accessing its 
__doc__ attribute. The relationship between the module object and text 
from a file is tenuous, at best:

- the module's file could use \n line endings, or \r, or \r\n, or even 
something else, depending on the platform;

- the module might be a compiled .pyc file, and there are no lines of 
text to read, just byte code;

- or a .dll or .so library, again, no lines of text, just compiled code;

- or there might not even be a file, like the sys module, which is 
entirely built into the interpreter; 

- or it might not even be a module object: you can put anything into 
sys.modules. It might be an instance with docstrings computed on the fly.

So you can't conclude *anything* about text files from the fact that 
module docstrings typically contain only \n rather than \r\n line 
endings. Modules are not necessarily text files, and even when they are, 
once you import them, what you get is *not text*, but Python objects.
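
The last two cases in the list above can be sketched as follows. This is 
only an illustration: FakeModule and "fake_mod" are names made up for the 
example, and the missing sys.__file__ is how CPython behaves rather than 
something the language promises.

```python
import sys

# A built-in module such as sys is not loaded from a source file, so
# (in CPython, at least) it has no __file__ attribute.
sys_has_file = hasattr(sys, "__file__")


class FakeModule:
    # Hypothetical stand-in for a module; it is not a real module object.
    @property
    def __doc__(self):
        # The docstring is computed on every access, not read from a file.
        return "computed\r\non the fly"


# sys.modules is an ordinary dict, so any object can be stored in it.
sys.modules["fake_mod"] = FakeModule()

# The import statement hands back whatever sys.modules holds for that name.
import fake_mod

doc = fake_mod.__doc__
```

Here doc contains a \r\n pair even though no file was ever read, so the 
line endings in a __doc__ attribute say nothing about any file on disk.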


>>>> with open('zzz.py', 'rb') as fo:
> ...     r = fo.read()
> ...
>>>> r
> b'"""abc\r\ndef\r\n"""\r\n'

And again, you are misinterpreting what you are seeing. By opening the 
file in binary mode, you are instructing Python to treat it as binary 
bytes, and return *exactly* what is stored on disk. If you opened the 
file in text mode, you would (likely, but not necessarily) get a very 
different result: the string would contain only \n line endings.
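
A minimal sketch of the difference, assuming Python 3 and a file saved 
with \r\n line endings. The file name and the temporary directory are 
only for the example, and the bytes are the ones shown in the quoted 
session above.

```python
import os
import tempfile

# Bytes as they might sit on disk for a file saved with \r\n line endings.
raw = b'"""abc\r\ndef\r\n"""\r\n'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "zzz.py")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(raw)

    # Binary mode: the bytes come back exactly as stored.
    with open(path, "rb") as f:
        as_bytes = f.read()

    # Text mode with the default newline=None: \r\n is translated to \n.
    with open(path, "r", encoding="ascii") as f:
        as_text = f.read()

    # Text mode with newline="": translation is switched off, which is
    # one reason for the "likely, but not necessarily" above.
    with open(path, "r", encoding="ascii", newline="") as f:
        as_text_untranslated = f.read()
```

With these inputs, as_bytes and as_text_untranslated keep the \r\n pairs, 
while as_text contains only \n.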

Python is not a low-level language like C. If you expect it to behave 
like a low-level language like C, you will be confused and upset.

But to prove that you are mistaken, we can do this:

py> s = """Triple-quote string\r
... containing carriage-return+newline line\r
... endings."""
py> s
'Triple-quote string\r\ncontaining carriage-return+newline line\r\nendings.'


-- 
Steven


