[Python-ideas] String interpolation for all literal strings

Wes Turner wes.turner at gmail.com
Sat Aug 8 00:33:01 CEST 2015


On Fri, Aug 7, 2015 at 5:24 PM, Nikolaus Rath <Nikolaus at rath.org> wrote:

> On Aug 07 2015, Barry Warsaw <
> barry-+ZN9ApsXKcEdnm+yROfE0A at public.gmane.org> wrote:
> > * Literals only
> >
> > I've described elsewhere that accepting non-literals is useful in some
> > cases.
>
> Are you saying you don't want f-strings, but you want something that
> looks like a function (but is actually a special form because it has
> access to the local context)? E.g. f(other_fn()) would perform literal
> interpolation on the result of other_fn()?
>
> I think that would be a very bad idea. It introduces something that
> looks like a function but isn't and it opens the door to a new class of
> injection vulnerabilities (every time you return a string it could
> potentially be used for interpolation at some point).
>

glocals(), format_from(), lookup() (e.g. salt map.jinja stack of dicts)
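
To make the "stack of dicts" idea concrete, here is a minimal sketch of what a ``glocals()`` helper might look like. The name comes from the line above; the implementation is only an assumption (``ChainMap`` plus frame inspection, which is CPython-specific), not an existing API.

```python
import inspect
from collections import ChainMap


def glocals(**overrides):
    """Hypothetical helper: look names up in explicit overrides first,
    then the caller's locals, then the caller's globals."""
    caller = inspect.currentframe().f_back  # frame that called glocals()
    return ChainMap(overrides, caller.f_locals, caller.f_globals)


def demo():
    cmd = "ls -l"
    # The lookup context is still spelled out at the call site.
    return "run: {cmd} as {user}".format_map(glocals(user="nobody"))
```

The point of the sketch is that the context stays visible (and greppable) where the string is formatted.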

Contexts:

 * [Python-ideas] String interpolation for all literal strings

   * 'this should not be a {cmd}'.format(cmd=cmd)
   * 'this should not be a {cmd}'.format(
        **{**globals(), **locals(), 'cmd': cmd})
   * 'this should not be a \{cmd}'
   * f'this should not be a \{cmd}'

 * [Python-ideas] Briefer string format
 * [Python-ideas] Make non-meaningful backslashes illegal in string literals

   * u'C:\users' breaks because \u is an escape sequence

     * How does this interact with string interpolation?
       E.g. **when**, in the functional composition
       from string to string (with parameters),
       do these escape sequences get eval'd?

       * See: MarkupSafe (Jinja2)
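
On the "when" question, a small sketch of how things behave today with ``str.format`` (it says nothing about how any proposed syntax would behave): backslash escapes are resolved when the literal is compiled, while replacement fields are resolved later, at ``.format()`` time.

```python
# Escapes are processed once, when the literal is compiled.
# (In Python 3, a plain 'C:\users' does not even compile, because
# \u starts a Unicode escape; hence the doubled backslashes here.)
escaped_template = "C:\\users\\{name}"
raw_template = r"C:\users\{name}"  # raw literal, same characters
assert escaped_template == raw_template

# Substituted values are inserted verbatim; there is no second escape
# pass, so backslash + 'n' in a value stays two characters.
value = r"a\nb"
result = escaped_template.format(name=value)

# Doubled braces are the existing way to get literal braces.
literal_braces = "{{cmd}}".format()
```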


Justification:

* "how are the resources shared relevant to these discussions?"
* TL;DR

  * string interpolation is often dangerous
    (SQL Injection and OS Command Injection are #1 and #2
    in the CWE/SANS 2011 Top 25)
  * string interpolation is already hard to review
    (because there are many ways to do it)

    * it's a functional composition of an AST?

* I shared a number of seemingly tangential links
  (in python-ideas) regarding
  proposals to add an additional string interpolation syntax
  with implicit local-then-global context / scope,
  tentatively called 'f-strings'.

  * Bikeshedded on the \{syntax} ({{because}} {these} \{are\} more
    readable)
  * Bikeshedded on the name 'f-string',
    because of visual disambiguability
    from 'r-string' (for e.g. raw strings (and e.g. ``re``))

    * Is there an AST scanner to find these?

      * Because a grep expression for ``f"`` or ``f'`` is not that
        helpful.

        * Especially as compared to ``grep ".format("``
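
As for an AST scanner, a rough sketch is possible with the stdlib ``ast`` module, assuming a Python version that actually parses f-strings into ``ast.JoinedStr`` nodes (3.6+, which postdates this thread; ``ast.Constant`` needs 3.8+). The function name is made up for illustration.

```python
import ast


def find_interpolations(source):
    """Return sorted (lineno, kind) pairs for f-strings, .format() calls,
    and %-formatting applied to a string literal."""
    hits = set()  # a set, so nested format-spec nodes on one line collapse
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.JoinedStr):
            hits.add((node.lineno, "f-string"))
        elif (isinstance(node, ast.Call)
              and isinstance(node.func, ast.Attribute)
              and node.func.attr == "format"):
            # Matches any .format(...) call, not only calls on str objects.
            hits.add((node.lineno, "str.format"))
        elif (isinstance(node, ast.BinOp)
              and isinstance(node.op, ast.Mod)
              and isinstance(node.left, ast.Constant)
              and isinstance(node.left.value, str)):
            hits.add((node.lineno, "%-format"))
    return sorted(hits)
```

It is deliberately crude: it reports at most one hit of each kind per line, and it cannot see templates that are built at runtime.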



Use Cases:
----------
As a developer, I want to:

* grep for string interpolations
* include parameters in strings (and escape them appropriately)

  * The safer thing to do: parameters
    should *usually* (often) be tokenized
    and e.g. quoted and serialized out

    * OS Commands, HTML DOM, SQL parse tree, SPARQL parse tree,
      CSV, TSV,
      (*injection* vectors with user supplied input
      and non-binary string-based data representation formats)

      * "Explicit is better than implicit" -- Zen of Python

        * Where are the values of these variables set?

          With *non* f-strings (str.format, str.__mod__)
          the context is explicit;
          and I regard that as a feature of Python.

          * If what is needed is a shorthand way to say

            * ``glocals(**kwargs) / gl()``
            * ``lookup_from({}, locals(), globals())``,
            * ``.formatlookup(`` or ``.formatl(``

            then add that, and do not add a backwards-incompatible shortcut
            which is going to require additional review
            (as I am reviewing things that are commands or queries).

      * These are usually trees of tokens which are serialized
        for a particular context;
        and they are difficult because
        we often don't think of them
        in the same terms as say the Python AST;
        because we think we can just use string concatenation here
        (when there should/could be typed objects
        with serialization methods), e.g.

        * __str__
        * __str_shell__
        * __str_sql__(_, with_keywords=SQLVARIANT_KEYWORDS)

        With this form, the proposed f-string method would be:

        * __interpolate__

          * [ ] Manual review

            * Which variables/expressions are defined or referenced here,
              syntax checker?

              * There are 3 other string interpolation syntaxes.

                * ``glocals(**kwargs) / gl()``

* **AND THEN**, "so I can just string-concatenate these now?"

  * Again, MarkupSafe __attr

    * Types and serialization over concatenation
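
A short sketch of "types and serialization over concatenation", using only the stdlib. The ``Param`` class is hypothetical; ``__str_shell__`` is the name floated above, and ``__html__`` is the convention MarkupSafe looks for. None of this is an existing protocol in ``str``.

```python
import html
import shlex
import sqlite3


class Param:
    """Hypothetical typed value that knows how to serialize itself
    for a particular output context."""

    def __init__(self, value):
        self.value = value

    def __str_shell__(self):
        return shlex.quote(self.value)  # one shell token, safely quoted

    def __html__(self):
        return html.escape(self.value)  # markup characters escaped


user_input = "x'; echo pwned #"

# Shell: the parameter is quoted as a single token, not pasted in.
shell_cmd = "ls -l {}".format(Param(user_input).__str_shell__())

# HTML: the parameter is escaped for a text context.
fragment = "<p>{}</p>".format(Param("<script>alert(1)</script>").__html__())

# SQL: values travel as bound parameters, outside the statement text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT)")
conn.execute("INSERT INTO files (name) VALUES (?)", (user_input,))
rows = conn.execute(
    "SELECT name FROM files WHERE name = ?", (user_input,)
).fetchall()
```

In each case the value is serialized for its target context instead of being concatenated into it.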




>
> Best,
> -Nikolaus
>
> --
> GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
> Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
>
>              »Time flies like an arrow, fruit flies like a Banana.«
>