[Python-ideas] Proposal: Tuple of str with w'list of words'

Stephen J. Turnbull turnbull.stephen.fw at u.tsukuba.ac.jp
Tue Dec 6 19:51:25 EST 2016


Random832 writes:

 > Is there any particular objection to allowing the backslash-space escape
 > (and for escapes that mean whitespace characters, such as \t, \x20, to
 > not split, if you meant to imply that they do)? That would provide the
 > extra push to this being beneficial over split().

You're suggesting that (1) most escapes would be processed after
splitting while (2) backslash-space (what about backslash-tab?) would
be treated as an escape during splitting?

 > I also have an alternate idea: sl{word1 word2 'string 3' "string 4"}

word1 and word2 are what Perl would term "barewords"?  I.e., treated as
strings?
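
For comparison, here is a rough sketch of how close the standard library
already gets to both suggestions at run time. Using shlex.split here is an
illustrative assumption, not something proposed in the thread, and the
variable names are hypothetical:

```python
import shlex

# Unquoted words and quoted strings, split at run time by the stdlib.
words = shlex.split("word1 word2 'string 3' \"string 4\"")

# In POSIX mode (the default), backslash-space keeps the space inside a word.
# A raw string is used so that Python itself leaves the backslash alone.
escaped = shlex.split(r"word1 word2 string\ 3")

print(words)
print(escaped)
```

Note that shlex follows shell-like quoting rules, not Python's string-literal
escape rules, so this is only an approximation of either idea.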

-1 to w"", -1 to inconsistent interpretation of escapes, and -1 to a
completely new syntax.

" ", "\x20", "\u0020", and "\U00000020" currently are different
representations of the same string, so it would be confusing if the
same notations meant different things in this context.  Another syntax
plus overloading standard string notation with yet another semantics
(strings, raw strings) doesn't seem like a win to me.
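
A quick check of that equivalence, as a minimal sketch in current Python
(the variable names are illustrative only):

```python
# Four spellings of the same one-character string, U+0020 SPACE.
spellings = [" ", "\x20", "\u0020", "\U00000020"]

# Escapes are resolved when the literal is compiled, so all four are equal.
all_same = all(s == " " for s in spellings)

# Consequently split() cannot tell an escaped space from a literal one.
parts = "string\x203".split()

print(all_same)
print(parts)
```

By the time split() runs, the string object carries no record of how the
space was written in the source, which is why treating the spellings
differently would need new, literal-level semantics.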

As I accept the usual Pythonic aversion to mere abbreviations, I don't
see any benefit to these notations, except for the case where a list
just won't do, so you can avoid a call to tuple.  We already have
three good ways to do this:

    wordlist = ["word1", "word2", "string 3", "string 4"]
    wordlist = "word1,word2,string 3,string 4".split(",")
    wordlist = open(word_per_line_file).read().splitlines()

and for maximum Unicode-conforming generality with compact notation:

    wordlist = "word1\uFFFFword2\uFFFFstring 3\uFFFFstring 4".split("\uFFFF")

More seriously, in most use cases there will be ASCII control
characters that you could use, which most editors can enter (though
they might be visually unattractive in many editors, e.g., \x0C).
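
As a minimal sketch of the control-character approach, assuming U+001F
(the ASCII "unit separator") never occurs inside the words themselves; the
choice of that particular character and the names below are illustrative:

```python
# U+001F is an ASCII control character intended for separating fields.
SEP = "\x1f"

wordlist = "word1\x1fword2\x1fstring 3\x1fstring 4".split(SEP)

# When a list just won't do, a single call to tuple() converts it.
wordtuple = tuple(wordlist)

print(wordlist)
print(wordtuple)
```

Because the separator is not whitespace, words containing spaces survive
intact without any special escape handling.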

Steve

