[Python-Dev] Parsing f-strings from PEP 498 -- Literal String Interpolation

Fabio Zadrozny fabiofz at gmail.com
Wed Nov 9 11:20:27 EST 2016


On Sat, Nov 5, 2016 at 10:36 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 5 November 2016 at 04:03, Fabio Zadrozny <fabiofz at gmail.com> wrote:
> > On Fri, Nov 4, 2016 at 3:15 PM, Eric V. Smith <eric at trueblade.com> wrote:
> >> Using PyParser_ASTFromString is the easiest possible way to do this. Given
> >> a string, it returns an AST node. What could be simpler?
> >
> >
> > I think that for implementation purposes, given the python infrastructure,
> > it's fine, but for specification purposes, probably incorrect... As I don't
> > think f-strings should accept:
> >
> >  f"start {import sys; sys.version_info[0];} end" (i.e.:
> > PyParser_ASTFromString doesn't just return an expression, it accepts any
> > valid Python code, even code which can't be used in an f-string).
>
> f-strings use the "eval" parsing mode, which starts from the
> "eval_input" node in the grammar (which is only a couple of nodes
> higher than 'test', allowing tuples via 'testlist' as well as trailing
> newlines and EOF):
>
>     >>> ast.parse("import sys; sys.version_info[0];", mode="eval")
>     Traceback (most recent call last):
>      File "<stdin>", line 1, in <module>
>      File "/usr/lib64/python3.5/ast.py", line 35, in parse
>        return compile(source, filename, mode, PyCF_ONLY_AST)
>      File "<example>", line 1
>        import sys; sys.version_info[0];
>             ^
>     SyntaxError: invalid syntax
>
> You have to use "exec" mode to get the parser to allow statements,
> which is why f-strings don't do that:
>
>     >>> ast.dump(ast.parse("import sys; sys.version_info[0];", mode="exec"))
>     "Module(body=[Import(names=[alias(name='sys', asname=None)]),
> Expr(value=Subscript(value=Attribute(value=Name(id='sys', ctx=Load()),
> attr='version_info', ctx=Load()), slice=Index(value=Num(n=0)),
> ctx=Load()))])"
>
> The unique aspect for f-strings that means they don't permit some
> otherwise valid Python expressions is that it also does the initial
> pre-tokenisation based on:
>
> 1. Look for an opening '{'
> 2. Look for a closing '!', ':' or '}'  accounting for balanced string
> quotes, parentheses, brackets and braces
>
> Ignoring the surrounding quotes, and using the `atom` node from
> Python's grammar to represent the nesting tracking, and TEXT to stand
> in for arbitrary text, it's something akin to:
>
>     fstring: (TEXT ['{' maybe_pyexpr ('!' | ':' | '}')])+
>     maybe_pyexpr: (atom | TEXT)+
>
> That isn't quite right, since it doesn't properly account for brace
> nesting, but it gives the general idea - there's an initial really
> simple tokenising pass that picks out the potential Python
> expressions, and then those are run through the AST parser's
> equivalent of eval().
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>


Hi Nick and Eric,

Just wanted to say thanks for the feedback and to point to a grammar I ended
up writing on my side (in JavaCC). In case someone else decides to write a
formal grammar later on, it can probably be used as a reference (it shouldn't
be hard to convert it to a BNF grammar):

https://github.com/fabioz/Pydev/blob/master/plugins/org.python.pydev.parser/src/org/python/pydev/parser/grammar_fstrings/grammar_fstrings.jjt
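
As a rough illustration of the two-pass scheme Nick describes above (a simple
scan from '{' to a top-level '!', ':' or '}', then parsing the candidate text
in "eval" mode), here is a minimal Python sketch. It is neither the CPython
implementation nor the JavaCC grammar linked above: the helper names
split_field and parse_field are hypothetical, and the scan deliberately
ignores '{{' escapes, backslashes and triple-quoted strings inside the
expression.

```python
import ast

# Closing delimiter expected for each opening one.
OPENERS = {"(": ")", "[": "]", "{": "}"}


def split_field(text, start=0):
    """Scan one replacement field; text[start] must be '{'.

    Returns (expression_source, index_of_terminator), where the terminator
    is the first '!', ':' or '}' outside any quotes or brackets.
    """
    stack = []    # closers still expected for nested (), [] and {}
    quote = None  # active quote character while inside a string literal
    i = start + 1
    while i < len(text):
        ch = text[i]
        if quote:
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch in OPENERS:
            stack.append(OPENERS[ch])
        elif stack:
            if ch == stack[-1]:
                stack.pop()
        elif ch in "!:}":
            if ch == "!" and text[i + 1:i + 2] == "=":
                i += 2  # '!=' is a comparison, not a conversion marker
                continue
            return text[start + 1:i], i
        i += 1
    raise SyntaxError("f-string: expecting '}'")


def parse_field(text, start=0):
    """Split one field, then parse its expression in "eval" mode."""
    expr, end = split_field(text, start)
    # Assumption of this sketch: surrounding whitespace is simply stripped.
    return ast.parse(expr.strip(), mode="eval"), text[end]


tree, terminator = parse_field("{d['a:b']:>10}")
print(ast.dump(tree), terminator)

try:
    parse_field("{import sys; sys.version_info[0];}")
except SyntaxError as exc:
    print("rejected:", exc.msg)
```

The first pass never looks at Python syntax beyond quotes and brackets; the
statement example is rejected only in the second pass, because "eval" mode
accepts expressions and nothing else.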

Also, as feedback, I found it a bit odd that there can't be any space or
newline between the last format specifier (here the !r conversion) and the
closing '}'.

I.e.:

f'''{
dict(
  a = 10
)
!r
}
'''

is not valid, whereas

f'''{
dict(
  a = 10
)
!r}
'''

is valid. As a note, this means my grammar has a bug, as it accepts both
versions, and I currently don't care enough about that difference from the
implementation to fix it ;)
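
The difference between the two forms can be checked by compiling each one as
source text. This is a minimal sketch, assuming Python 3.6 or later; the
helper name try_fstring is hypothetical. The result for the second form
depends on the interpreter version, so the script prints it rather than
assuming it.

```python
# The two f-strings from the message, held as source text so that a
# SyntaxError in either one can be caught instead of aborting the script.
VALID = "f'''{\ndict(\n  a = 10\n)\n!r}\n'''"
SPACED = "f'''{\ndict(\n  a = 10\n)\n!r\n}\n'''"


def try_fstring(source):
    """Return (True, value) if the f-string compiles, else (False, message)."""
    try:
        return True, eval(compile(source, "<fstring>", "eval"))
    except SyntaxError as exc:
        return False, exc.msg


print("'!r}' form:        ", try_fstring(VALID))
print("'!r' newline '}':  ", try_fstring(SPACED))
```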

Cheers,

Fabio