[Python-checkins] r57103 - peps/trunk/pep-3116.txt

guido.van.rossum python-checkins at python.org
Thu Aug 16 23:21:35 CEST 2007


Author: guido.van.rossum
Date: Thu Aug 16 23:21:30 2007
New Revision: 57103

Modified:
   peps/trunk/pep-3116.txt
Log:
New spec for newline= parameter to open() and TextIOBase().


Modified: peps/trunk/pep-3116.txt
==============================================================================
--- peps/trunk/pep-3116.txt	(original)
+++ peps/trunk/pep-3116.txt	Thu Aug 16 23:21:30 2007
@@ -342,20 +342,55 @@
     ``.__init__(self, buffer, encoding=None, newline=None)``
 
         ``buffer`` is a reference to the ``BufferedIOBase`` object to
-        be wrapped with the ``TextIOWrapper``.  ``encoding`` refers to
-        an encoding to be used for translating between the
-        byte-representation and character-representation.  If it is
-        ``None``, then the system's locale setting will be used as the
-        default.  ``newline`` can be ``None``, ``'\n'``, ``'\r'``, or
-        ``'\r\n'`` (all other values are illegal); it indicates the
-        translation for ``'\n'`` characters written.  If ``None``, a
-        system-specific default is chosen, i.e., ``'\r\n'`` on Windows
-        and ``'\n'`` on Unix/Linux.  Setting ``newline='\n'`` on input
-        means that no CRLF translation is done; lines ending in
-        ``'\r\n'`` will be returned as ``'\r\n'``.  (``'\r'`` support
-        is still needed for some OSX applications that produce files
-        using ``'\r'`` line endings; Excel (when exporting to text)
-        and Adobe Illustrator EPS files are the most common examples.
+        be wrapped with the ``TextIOWrapper``.
+
+        ``encoding`` refers to an encoding to be used for translating
+        between the byte-representation and character-representation.
+        If it is ``None``, then the system's locale setting will be
+        used as the default.
+
+        ``newline`` can be ``None``, ``''``, ``'\n'``, ``'\r'``, or
+        ``'\r\n'``; all other values are illegal.  It controls the
+        handling of line endings.  It works as follows:
+
+        * On input, if ``newline`` is ``None``, universal newlines
+          mode is enabled.  Lines in the input can end in ``'\n'``,
+          ``'\r'``, or ``'\r\n'``, and these are translated into
+          ``'\n'`` before being returned to the caller.  If it is
+          ``''``, universal newlines mode is enabled, but line endings
+          are returned to the caller untranslated.  If it has any of
+          the other legal values, input lines are only terminated by
+          the given string, and the line ending is returned to the
+          caller translated to ``'\n'``.
+
+        * On output, if ``newline`` is ``None``, any ``'\n'``
+          characters written are translated to the system default
+          line separator, ``os.linesep``.  If ``newline`` is ``''``,
+          no translation takes place.  If ``newline`` is any of the
+          other legal values, any ``'\n'`` characters written are
+          translated to the given string.
+
+        Further notes on the ``newline`` parameter:
+
+        * ``'\r'`` support is still needed for some OS X applications
+          that produce files using ``'\r'`` line endings; Excel (when
+          exporting to text) and Adobe Illustrator EPS files are the
+          most common examples.
+
+        * If translation is enabled, it happens regardless of which
+          method is called for reading or writing.  For example,
+          ``f.read()`` will always produce the same result as
+          ``''.join(f.readlines())``.
+
+        * If universal newlines without translation are requested on
+          input (i.e. ``newline=''``) and a system read operation
+          returns a buffer ending in ``'\r'``, another system read
+          operation is done to determine whether it is followed by
+          ``'\n'`` or not.  In universal newlines mode with
+          translation, the second system read operation may be
+          postponed until the next read request, and if the following
+          system read operation returns a buffer starting with
+          ``'\n'``, that character is simply discarded.
 
 Another implementation, ``StringIO``, creates a file-like ``TextIO``
 implementation without an underlying Buffered I/O object.  While
@@ -422,7 +457,7 @@
         assert isinstance(mode, str)
         assert buffering is None or isinstance(buffering, int)
         assert encoding is None or isinstance(encoding, str)
-        assert newline in (None, "\n", "\r", "\r\n")
+        assert newline in (None, "", "\n", "\r", "\r\n")
         modes = set(mode)
         if modes - set("arwb+t") or len(mode) > len(modes):
             raise ValueError("invalid mode: %r" % mode)

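The newline= rules in the hunk above can be tried out with a short
sketch.  It is an illustration only, not part of the commit: it uses
the ``io`` module as shipped in released Python 3, and the helper names
(``read_all``, ``read_lines``, ``written_bytes``) and the sample data
are made up for the example.  On input it sticks to ``newline=None``
and ``newline=''``, because the released module may not match this
draft for the other legal values.

```python
import io
import os

# Sample input mixing all three line-ending conventions (illustrative data).
RAW = b"one\ntwo\rthree\r\nfour"


def _wrap(newline):
    # Wrap an in-memory byte buffer, standing in for a BufferedIOBase object.
    return io.TextIOWrapper(io.BytesIO(RAW), encoding="ascii", newline=newline)


def read_all(newline):
    """Return the whole decoded text as produced by read()."""
    return _wrap(newline).read()


def read_lines(newline):
    """Return the list of lines as produced by readlines()."""
    return _wrap(newline).readlines()


def written_bytes(text, newline):
    """Write text through a TextIOWrapper and return the resulting bytes."""
    buf = io.BytesIO()
    f = io.TextIOWrapper(buf, encoding="ascii", newline=newline)
    f.write(text)
    f.flush()  # push the encoded text down into the byte buffer
    return buf.getvalue()


if __name__ == "__main__":
    # newline=None: universal newlines, every ending translated to '\n'.
    print(repr(read_all(None)))
    # newline='': universal newlines, endings returned untranslated.
    print(read_lines(""))
    # Output: '\n' is translated to the given string.
    print(written_bytes("a\nb\n", "\r\n"))
```

With ``newline=None`` on output, the bytes written depend on
``os.linesep`` and therefore on the platform the sketch runs on.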