PEP 259: Omit printing newline after newline

Tue Jun 12 12:43:00 EDT 2001

In article <mailman.992290220.14360.python-list at python.org>, 
guido at digicool.com says...
> Please comment on the following.  This came up a while ago in
> python-dev and I decided to follow through.  I'm making this a PEP
> because of the risk of breaking code (which everybody on Python-dev
> seemed to think was acceptable).
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> 
> PEP: 259
> Title: Omit printing newline after newline
> Version: $Revision: 1.1 $
> Author: guido at python.org (Guido van Rossum)
> Status: Draft
> Type: Standards Track
> Python-Version: 2.2
> Created: 11-Jun-2001
> Post-History: 11-Jun-2001
> 
> Abstract
> 
>     Currently, the print statement always appends a newline, unless a
>     trailing comma is used.  This means that if we want to print data
>     that already ends in a newline, we get two newlines, unless
>     special precautions are taken.
> 
>     I propose to skip printing the newline when it follows a newline
>     that came from data.
> 
>     In order to avoid having to add yet another magic variable to file
>     objects, I propose to give the existing 'softspace' variable an
>     extra meaning: a negative value will mean "the last data written
>     ended in a newline so no space *or* newline is required."
> 
> 
> Problem
> 
>     When printing data that resembles the lines read from a file using
>     a simple loop, double-spacing occurs unless special care is taken:
> 
>         >>> for line in open("/etc/passwd").readlines():
>         ...     print line
>         ...
>         root:x:0:0:root:/root:/bin/bash
> 
>         bin:x:1:1:bin:/bin:
> 
>         daemon:x:2:2:daemon:/sbin:
> 
>         (etc.)
> 
>         >>>
> 
>     While there are easy work-arounds, this is often noticed only
>     during testing and requires an extra edit-test roundtrip; the
>     fixed code is uglier and harder to maintain.
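The work-arounds mentioned above are not spelled out in the PEP. As a rough sketch, two common ones look like this. It is written for Python 3, where print is a function, so that it runs today; the sample data and the helper names emit_stripped and emit_raw are made up for illustration and do not come from the PEP.

```python
import io
import sys

# Stand-in for a real file such as /etc/passwd; the contents are made up.
sample = io.StringIO("root:x:0:0\nbin:x:1:1\n")
lines = sample.readlines()


def emit_stripped(lines, out=sys.stdout):
    # Work-around 1: strip the trailing newline and let print add its own.
    for line in lines:
        print(line.rstrip("\n"), file=out)


def emit_raw(lines, out=sys.stdout):
    # Work-around 2: bypass print and write each line exactly as it was read.
    for line in lines:
        out.write(line)


emit_stripped(lines)
emit_raw(lines)
```

Both variants produce single-spaced output. In the Python 2 of the PEP, the equivalents would be "print line[:-1]" or "print line," and "sys.stdout.write(line)".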
> 
> 
> Proposed Solution
> 
>     In the PRINT_ITEM opcode in ceval.c, when a string object is
>     printed, a check is already made that looks at the last character
>     of that string.  Currently, if that last character is a whitespace
>     character other than space, the softspace flag is reset to zero;
>     this suppresses the space between two items if the first item is a
>     string ending in newline, tab, etc. (but not when it ends in a
>     space).  Otherwise the softspace flag is set to one.
> 
>     The proposal changes this test slightly so that softspace is set
>     to:
> 
>         -1 -- if the last object written is a string ending in a
>               newline
> 
>          0 -- if the last object written is a string ending in a
>               whitespace character that's neither space nor newline
> 
>          1 -- in all other cases (including the case when the last
>               object written is an empty string or not a string)
> 
>     Then, in the PRINT_NEWLINE opcode, printing of the newline is
>     suppressed if the value of softspace is negative; in any case the
>     softspace flag is reset to zero.
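To make the three softspace values concrete, here is a toy model of the proposed rules. It is only a sketch in Python 3, not the ceval.c patch: the class PrintModel and its method names are hypothetical, and it assumes that a separating space is written only when softspace is positive.

```python
import io


class PrintModel:
    """Toy model of the print statement's softspace handling as proposed."""

    def __init__(self, out):
        self.out = out
        self.softspace = 0

    def print_item(self, obj):
        # Models PRINT_ITEM: a positive flag means a separating space is due.
        if self.softspace > 0:
            self.out.write(" ")
        self.out.write(obj if isinstance(obj, str) else str(obj))
        if isinstance(obj, str) and obj.endswith("\n"):
            self.softspace = -1  # data ended in a newline
        elif isinstance(obj, str) and obj and obj[-1] in "\t\r\v\f":
            self.softspace = 0   # other non-space whitespace
        else:
            self.softspace = 1   # everything else, including "" and non-strings

    def print_newline(self):
        # Models PRINT_NEWLINE: skip the newline if the data already ended in one.
        if self.softspace >= 0:
            self.out.write("\n")
        self.softspace = 0

    def print_stmt(self, *items):
        # Models one complete "print a, b, c" statement.
        for item in items:
            self.print_item(item)
        self.print_newline()


buf = io.StringIO()
model = PrintModel(buf)
model.print_stmt("Subject: PEP 259\n")
model.print_stmt("body", 42)
print(repr(buf.getvalue()))
```

Under this model the first statement writes no extra newline, so "body 42" follows the subject line directly, with no blank line between them.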
> 
> 
> Scope
> 
>     This only affects printing of 8-bit strings.  It doesn't affect
>     Unicode, although that could be considered a bug in the Unicode
>     implementation.  It doesn't affect other objects whose string
>     representation happens to end in a newline character.
> 
> 
> Risks
> 
>     This change breaks some existing code.  For example:
> 
>         print "Subject: PEP 259\n"
>         print message_body
> 
>     In current Python, this produces a blank line separating the
>     subject from the message body; with the proposed change, the body
>     begins immediately below the subject.  This is not very robust
>     code anyway; it is better written as
> 
>         print "Subject: PEP 259"
>         print
>         print message_body
> 
>     In the test suite, only test_StringIO (which explicitly tests for
>     this feature) breaks.
> 
> 
> Implementation
> 
>     A patch relative to current CVS is here:
> 
>         http://sourceforge.net/tracker/index.php?func=detail&aid=432183&group_id=5470&atid=305470
> 
> 
> Copyright
> 
>     This document has been placed in the public domain.
> 
> 
> Local Variables:
> mode: indented-text
> indent-tabs-mode: nil
> End:
> 
> 
Having read the others' comments, my issue with the idea is slightly
different. Programming languages ought to require _some_ precision in
thinking, and making this sort of change encourages sloppy thinking, IMHO.

I'd rather more time be invested in improving the error reporting from
Python - that would be of real benefit to new users! I'd think a
'Got newline, expecting ":"' would be far more educational to the user
than "Syntax Error", which is used all over the place. Come to think of
it, if you want to adopt this sort of handholding (the newline thing),
then why not just turn this into a 'Warning: substituted ":" for
unexpected newline'? (It seems clear to me that this could be done and
that, more often than not, it would be the correct parser recovery, but
that is not the point.)
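As a rough illustration of the kind of message meant here, the sketch below wraps compile() and applies a crude heuristic to the offending line. It is Python 3, the function name describe_syntax_error and the keyword list are hypothetical, and nothing in it reflects how the Python parser actually reports errors.

```python
# Keywords that normally introduce a block and so end their line with ":".
BLOCK_KEYWORDS = ("if", "elif", "else", "for", "while",
                  "def", "class", "try", "except", "finally")


def describe_syntax_error(source):
    """Return None if source compiles, else a one-line description."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as err:
        lineno = err.lineno or 1
        lines = source.splitlines()
        line = lines[lineno - 1] if 0 < lineno <= len(lines) else ""
        words = line.split()
        first = words[0].rstrip(":") if words else ""
        # Crude guess: a block statement whose line does not end in a colon.
        if first in BLOCK_KEYWORDS and not line.rstrip().endswith(":"):
            return 'line %d: got newline, expecting ":"' % lineno
        # Otherwise fall back to whatever message the interpreter supplied.
        return "line %d: %s" % (lineno, err.msg)
    return None


print(describe_syntax_error("if x\n    pass\n"))
```

The heuristic will misfire on some inputs (for instance a block statement that has a different error on the same line), which is exactly the kind of trade-off a friendlier diagnostic has to weigh.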

I used the syntax error bit to help me make a point about the downsides
of this sort of handholding, but I really DO think that Python reports
"Syntax Error" and leaves one to puzzle things out when it seems to me
it's possible to give a more precise report in many instances. Let's not
gild the lily until the flower is in full bloom!

Regards,

Dave LeBlanc