[Python-ideas] Draft PEP on string interpolation

Mon Aug 24 07:00:45 CEST 2015

On Sun, Aug 23, 2015 at 9:31 PM, Wes Turner <wes.turner at gmail.com> wrote:

>
>
> On Sun, Aug 23, 2015 at 8:41 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>
>> On 24 August 2015 at 10:35, Eric V. Smith <eric at trueblade.com> wrote:
>> > On 08/22/2015 09:37 PM, Nick Coghlan wrote:
>> >> The trick would be to make interpolation lazy *by default* (preserving
>> >> the triple of the raw template string, the parsed fields, and the
>> >> expression values), and put the default rendering in the resulting
>> >> object's *__str__* method.
>> >
>> > At this point, I think PEPs 498 and 501 have converged, except for the
>> > delayed string interpolation object (which I realize is important) and
>> > how expressions are identified in the strings (which I consider less
>> > important).
>> >
>> > I think the string interpolation object is interesting. It's basically
>> > what Petr Viktorin and Chris Angelico discussed and suggested here:
>> > https://mail.python.org/pipermail/python-ideas/2015-August/035303.html.
>>
>> Aha, I thought I'd seen that idea go by in one of the threads, but I
>> didn't remember where :)
>>
>> I'll add Petr and Chris to the acknowledgements section in 501.
>>
>> > My suggestion would be to add both f-strings (PEP 498) and i-strings (as
>> > they're currently called in PEP 501), but with the exact same syntax to
>> > identify and evaluate expressions. I don't particularly care what the
>> > prefixes are. I'd add the plain f-strings first, then i-strings maybe
>> > later. There are definitely some issues with delayed interpolation we
>> > need to think about. An f-string would be shorthand for str(i-string).
>>
>> +1, as this is the point of view I've come to as well.
>>
>> > I think it's hyperbolic to refer to f-strings as a new string formatting
>> > language. With one small difference (detailed in PEP 498, and with zero
>> > usage I could find in the stdlib outside of tests), f-strings are a
>> > strict superset of str.format() strings (but not the arguments to
>> > .format of course). I think f-strings are no more different from
>> > str.format strings than PEP 501 i-strings are from string.Template
>> > strings.
>>
>> Yeah, that's a fair criticism of my rhetoric, so I'll stop saying that.
>>
>> > From what I can tell in the stdlib and in the wild, str.format() has
>> > hundreds or thousands of times more usage than string.Template. I
>> > realize that the reasons are not necessarily related to the syntax of
>> > the replacement strings, but you can't say most people aren't familiar
>> > with str.format().
>>
>> Right, and I think we can actually make an example driven decision on
>> that front by looking at potential *target* formats for template
>> rendering. After all, one of the interesting discoveries we made in
>> having both str.__mod__ and str.format available is that %-formatting
>> is a great way to template str.format strings, and vice-versa, since
>> the meta-characters don't conflict, so you can minimise the escaping
>> needed.
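
A minimal sketch of the two-stage templating described above, using only
str.__mod__ and str.format. The field name and values are made up for
illustration; the point is that neither stage needs any escaping because
"%" and "{}" do not conflict.

```python
# Stage 1: %-formatting fills in the field name; the braces are inert here.
template = "{%s:>10}" % "name"
# Stage 2: str.format() renders the template produced by stage 1.
rendered = template.format(name="spam")

# The reverse direction: str.format() fills a slot, leaving %d for later.
reverse = "{}: %d items".format("count") % 3
```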
>>
>> For use cases like writing object __repr__ methods, I don't think the
>> choice of $-substitution or {}-substitution matters - neither $ nor {}
>> are likely to appear in the desired output (except as part of
>> interpolated values), so escaping shouldn't be common regardless of
>> which we choose. (Side note: __repr__ and __str__ implementations are
>> likely worth highlighting as a good use case for the new syntax!)
>>
>> I think things get more interesting once we start talking about
>> interpolation targets other than "human readable text".
>>
>> For example, one of the neat (/scary, depending on how you feel about
>> this kind of feature) things I realised in working on the latest draft
>> of PEP 501 is that you could use it to template *Python code*,
>> including eagerly bound references to objects in the current scope.
>> That is:
>>
>>     a = b + c
>>
>> could instead be written as:
>>
>>     a = eval(str(i"$b + $c"))
>>
>> That's not very interesting if all you do is immediately call eval()
>> on it, but it's a lot more interesting if you instead want to do
>> things like extract the AST, dispatch the operation for execution in
>> another process, etc. For example, you could use this capability to
>> build eagerly bound closures, which wouldn't see changes in name
>> bindings, but *would* see state changes in mutable objects.
>>
>> With $-substitution, that "just works", as $ generally isn't
>> syntactically significant in Python code - it can only appear inside
>> strings (and potentially interpolation templates). With
>> {}-substitution, you'd have to double all the braces for dictionary
>> displays, dictionary comprehensions and set comprehensions. In example
>> form:
>>
>>     data = {k:v for k, v in source}
>>
>> becomes:
>>
>>     data = eval(str(i"{k:v for k, v in $source}"))
>>
>> rather than:
>>
>>     data = eval(f"{{k:v for k, v in {source}}}")
>>
>> You hit a similar problem if you're targeting Django or Jinja2
>> templates, or any content that involves l20n style JavaScript
>> translation strings: the use of braces for substitution expressions in
>> the interpolation template conflicts with their use in the target
>> format.
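
The escaping difference can be approximated today with string.Template
($-substitution) and str.format ({}-substitution). This is only a rough
sketch: both calls paste in the repr() of a sample list as text, which is
an assumption made here and not the eager object binding that PEP 501
describes.

```python
from string import Template

source = [("a", 1), ("b", 2)]  # sample data, assumed for illustration

# $-substitution: the braces of the dict comprehension need no escaping.
dollar = Template("{k: v for k, v in $source}").substitute(source=repr(source))

# {}-substitution: every literal brace in the target code must be doubled.
braces = "{{k: v for k, v in {source}}}".format(source=repr(source))

# Both templates produce the same Python source text.
data = eval(dollar)
```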
>>
>> So far, the only target rendering environments I've come up with where
>> $-substitution would create a conflict are shell commands and
>> JavaScript localisation using Mozilla's l20n syntax, and in both of
>> those, I'd actually *want* the Python lookup to take precedence over
>> the target environment lookup (and doubling the prefix to "$$" for
>> target environment lookup seems quite reasonable when you actually do
>> want to do the name lookup in the target environment).
>>
>> >> That description is probably as clear as mud, though, so back to the
>> >> PEP I go! :)
>> >
>> > Thanks for PEP 501. Maybe I'll add delayed interpolation to PEP 498!
>> >
>> > On a more serious note, I'm thinking of adding i-strings to my f-string
>> > implementation. I have some ideas that the format_spec (the :.3f stuff)
>> > could be used by the code that eventually does the string interpolation.
>> > For example, sql(i-string) might want to interpret this expression using
>> > __sql__, instead of how str(i-string) would use __format__. Then the
>> > sql() machinery could look at the format_spec and pass it to the value's
>> > __sql__ method.
>>
>> Yeah, that's the key reason PEP 501 is careful to treat them as opaque
>> strings that it merely transports through to the renderer. The
>> *default* renderer would expect them to be str.format format
>> specifiers, but other renderers may either disallow them entirely, or
>> expect them to do something different.
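
A rough sketch of that split between the interpolation object and its
renderers. Everything here is hypothetical: the Interpolation class, its
render() method, and the shout() renderer are invented names, and the
parts are built by hand because no i-string syntax exists to produce them.

```python
class Interpolation:
    """Hypothetical stand-in for the object an i-string might evaluate to."""

    def __init__(self, raw, parts):
        self.raw = raw      # the original template text
        self.parts = parts  # literal strings and (value, format_spec) pairs

    def render(self, render_field):
        # Literal text passes through; each field goes to the renderer's callback.
        return "".join(
            part if isinstance(part, str) else render_field(*part)
            for part in self.parts
        )

    def __str__(self):
        # Default renderer: treat each spec as a str.format format specifier.
        return self.render(format)


def shout(template):
    # Alternative (hypothetical) renderer that gives the spec its own meaning.
    def field(value, spec):
        return str(value).upper() if spec == "loud" else str(value)
    return template.render(field)


price = Interpolation("cost: {x:.2f}", ["cost: ", (3.14159, ".2f")])
label = Interpolation("hi {name:loud}", ["hi ", ("world", "loud")])
```

Here str(price) hands ".2f" to the builtin format(), while shout(label)
never calls format() and reads "loud" as its own instruction; the object
only carries the spec along as an opaque string.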
>>
>> > For example:
>> > sql(i'select {date:as_date} from {tablename}')
>> >
>> > might call date.__sql__('as_date'), which would know how to cast to the
>> > right datatype (this happens to me all the time).
>> >
>> > This is one reason I'm thinking of ditching !s, !r, and !a, at least for
>> > the first implementation of PEP 498: they're not needed, and are not
>> > generally applicable if we add the hooks I'm considering into i-strings.
>>
>> +1 from me. Given arbitrary expression support, it's both entirely
>> possible and more explicit to write the builtin calls directly (obj!a,
>> obj!r, obj!s -> ascii(obj), repr(obj), str(obj))
>>
>
> IIUC, to do this with SQL,
>
> > sql(i'select {date:as_date} from {tablename}')
>
> needs to be
>
>   ['select ', unescaped(date, 'as_date'), ' from ', unescaped(tablename)]
>
> so that e.g. sql_92(), sql_2011()
> would know that 'select ' is presumably implicitly escaped
>
> * https://en.wikipedia.org/wiki/SQL#Interoperability_and_standardization
> * http://docs.sqlalchemy.org/en/rel_1_0/dialects/
> * https://docs.djangoproject.com/en/1.7/ref/models/queries/#f-expressions
> "Django F-Expressions"
>
>
For reference, the SQLAlchemy Expression API provides a
(safer) method-chaining, nestable *Python* expression API;
or you can reuse a raw SQL connection from a ConnectionPool.

Django F-Objects are relevant because they are deferred
(and compiled in the context of the query),
similar to the objectives of a given SQL syntax
templating, parameterization, and serialization
library.

Django Q-Objects are similar,
in that an f-string is basically
an iterator of AND-ed expressions
where AND means string concatenation.

Personally,
I'd pretty much always just reflect the tables
or map them out
and write SQLAlchemy Python expressions
which are then compiled to a particular dialect
(and quoted appropriately, **avoiding CWE-89**,
surviving across table renames,
managing migrations).

Is it sometimes faster to write SQL by hand?

* I'd write the [SQLAlchemy], serialize to SQL, [and modify]
  (because I should have namespaced Python table attrs for those attrs
  anyway, even if it requires table introspection and reflection at
  (every/pool) instantiation)
* you can always execute a query with a raw connection with an ORM
  (and then **refactor (REF) string-ified table and column names**)

Each ORM (and DBAPI) has parametrization settings
(e.g. '%' or '?' or configuration_setting)
which should not collide with the f-string syntax.

* DBAPI v2.0
  https://www.python.org/dev/peps/pep-0249/
* SQLite DBAPI
  https://docs.python.org/2/library/sqlite3.html
  https://docs.python.org/3/library/sqlite3.html
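
As a small illustration of the parametrization point, here is a sketch
with the stdlib sqlite3 module, whose paramstyle is "qmark". The table
and rows are made up. Brace substitution is used only for the identifier
(assumed to come from trusted code), and the value goes through a "?"
bound parameter, so the two syntaxes do not collide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("wendy",), ("o'brien",)])

# Trusted identifier, interpolated with braces; never do this with user input.
table = "users"
query = "SELECT name FROM {table} WHERE name = ?".format(table=table)

# The value is passed as a bound parameter, so the quote needs no escaping.
rows = conn.execute(query, ("o'brien",)).fetchall()
```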

http://docs.sqlalchemy.org/en/rel_1_0/core/tutorial.html#conjunctions

>>> s = select([(users.c.fullname +
...               ", " + addresses.c.email_address).
...                label('title')]).\
...        where(users.c.id == addresses.c.user_id).\
...        where(users.c.name.between('m', 'z')).\
...        where(
...               or_(
...                  addresses.c.email_address.like('%@aol.com'),
...                  addresses.c.email_address.like('%@msn.com')
...               )
...        )
>>> conn.execute(s).fetchall()
SELECT users.fullname || ? || addresses.email_address AS title
FROM users, addresses
WHERE users.id = addresses.user_id AND users.name BETWEEN ? AND ? AND
(addresses.email_address LIKE ? OR addresses.email_address LIKE ?)
(', ', 'm', 'z', '%@aol.com', '%@msn.com')
[(u'Wendy Williams, wendy at aol.com',)]

http://docs.sqlalchemy.org/en/rel_1_0/core/tutorial.html#using-textual-sql

>>> from sqlalchemy.sql import text
>>> s = text(
...     "SELECT users.fullname || ', ' || addresses.email_address AS title "
...         "FROM users, addresses "
...         "WHERE users.id = addresses.user_id "
...         "AND users.name BETWEEN :x AND :y "
...         "AND (addresses.email_address LIKE :e1 "
...             "OR addresses.email_address LIKE :e2)")
>>> conn.execute(s, x='m', y='z', e1='%@aol.com', e2='%@msn.com').fetchall()
[(u'Wendy Williams, wendy at aol.com',)]

SQLAlchemy is not async-compatible
(besides, most drivers block);
it's debatable whether async would be faster, anyway:
https://bitbucket.org/zzzeek/sqlalchemy/issues/3414/asyncio-and-sqlalchemy

>> Regards,
>> Nick.
>>
>> --
>> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
>