os.path.isfile

Erik python at lucidity.plus.com
Sun Feb 12 18:56:45 EST 2017


On 12/02/17 04:53, Steve D'Aprano wrote:
> py> s = r'documents\'
>   File "<stdin>", line 1
>     s = r'documents\'
>                     ^
> SyntaxError: EOL while scanning string literal
>
>
> (I still don't understand why this isn't just treated as a bug in raw string
> parsing and fixed...)

I would imagine that it's something to do with the way the parser works 
in edge cases such as those below where it struggles to determine how 
many string tokens there are (is there a PEP that describes how this 
could be easily fixed?)

Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
 >>> r'sp\'am' "ha\m"
"sp\\'amha\\m"
 >>> r'sp\'am\' "ha\m"
   File "<stdin>", line 1
     r'sp\'am\' "ha\m"
                     ^
SyntaxError: EOL while scanning string literal
 >>> r'sp\'am\' "ha\m'
'sp\\\'am\\\' "ha\\m'
 >>>
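
For reference, the behaviour in the session above can be reproduced without typing awkward literals at the prompt. This is only a sketch: the helper name `parses` is made up for illustration, and the source text is assembled with chr(92) (a backslash) so that the example itself contains no tricky escapes. It relies on the fact that, even in a raw string, a backslash followed by a quote does not end the literal, so a raw string cannot end in a single backslash.

```python
def parses(source):
    """Return True if `source` compiles as a Python expression (hypothetical helper)."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


BS = chr(92)  # a single backslash

trailing = "r'documents" + BS + "'"              # source text: r'documents\'
workaround = "r'documents' '" + BS + BS + "'"    # source text: r'documents' '\\'

print(parses(trailing))      # False: the final \' does not close the literal
print(parses(workaround))    # True: raw part plus a normal one-backslash literal
print(eval(workaround))      # documents\
```

The workaround shown is just one common option (concatenating an ordinary literal that holds the trailing backslash); it is not taken from the thread.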

Actually, while contriving those examples, I noticed that when using 
string literal concatenation, the 'rawness' of the initial string seems 
to be applied to the following string in some cases but not in others:

 >>> "hello \the" r"worl\d"
'hello \theworl\\d'

Makes sense - the initial string is not raw, the concatenated string is.

 >>> r"hello \the" "worl\d"
'hello \\theworl\\d'

Slightly surprising. The concatenated string adopts the initial string's 
'rawness'.

 >>> "hello \the" r"worl\d" "\t"
'hello \theworl\\d\t'

The initial string is not raw, the following string is. The string 
following _that_ becomes raw too.

 >>> r"hello \the" "worl\d" "\t"
'hello \\theworl\\d\t'

The initial string is raw. The following string adopts that (same as the 
second example), but the _next_ string does not!

 >>> r"hello \the" "worl\d" r"\t"
'hello \\theworl\\d\\t'

... and this example is the same as before, but makes the third string 
"raw" again by explicitly declaring it as such.
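
One way to test these observations is to evaluate each literal on its own and compare the pieces with the concatenated result. The sketch below does that; the helper name `evaluate` is made up for illustration, and it assumes a Python version that still accepts unrecognised escapes such as \d (3.5 does so silently, later versions emit a warning, which is suppressed here). In this check the concatenation equals the join of the separately evaluated pieces, which suggests each literal is decoded independently: \t is a recognised escape (a tab), while \d is not, so its backslash is kept in a non-raw string as well.

```python
import ast
import warnings


def evaluate(source):
    """Evaluate string-literal source text, silencing invalid-escape warnings."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return ast.literal_eval(source)


# Source text of the three literals from the last example, kept verbatim
# by wrapping each one in a raw string.
literals = [r'r"hello \the"', r'"worl\d"', r'r"\t"']

parts = [evaluate(src) for src in literals]   # each literal on its own
whole = evaluate(" ".join(literals))          # implicit concatenation

print(whole == "".join(parts))                           # True
print(evaluate(r'"worl\d"') == evaluate(r'r"worl\d"'))   # True: \d is not an escape
print(len(evaluate(r'"\t"')), len(evaluate(r'r"\t"')))   # 1 2: \t is a tab unless raw
```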

Presumably (I haven't checked), this also applies to u-strings and 
f-strings - is this a documented and known "wart"/edge-case or is it 
something that should be defined and fixed?

E.


