Suggestion for a "data" object syntax

Ian Kelly ian.g.kelly at gmail.com
Fri May 11 02:39:38 EDT 2018


On Mon, May 7, 2018 at 9:45 PM, Mikhail V <mikhailwas at gmail.com> wrote:
> Here is an idea for 'data object' a syntax.
> For me it is interesting, how would users find such syntax.
> I personally find that this should be attractive from users
> perspective.
> Main aim is more readable presenting of typical data chunks
> and some typical data types (tuples/lists) directly in code.
> Further this should significantly simplify typing.
>
> *Example 1. Multi-line strings*
>
> Here is a 3 lines multi-line string, it will be automatically 'dedented'
> by the parser by one level of indent:
>
> data === S :
>     this is multi-line string
>     escape chars: same as in strings (\\, \\n, \\t ...) ,
>     but "no need to 'escape' quotes"

My reaction #1: why 'S'? Python strings have a name: it's 'str'. They
don't need an additional, even more opaque name on top of that. If I
could go back in time and change the name of the type from 'str' to
'string', I would.

My reaction #2: what's the point of this construct? Python already has
multi-line strings that can be used as expressions, not as statements
only. Why do we need this syntax also? This is also not really
homomorphic with the proposed tuple syntax since that uses
tab-separated fields whereas this is just freeform text.
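
For comparison, a minimal sketch of what the existing syntax already
allows. It assumes textwrap.dedent from the standard library is an
acceptable stand-in for the automatic dedenting in the proposal; the
variable name 'text' is illustrative only.

```python
import textwrap

# A triple-quoted string is an ordinary expression, so it can be passed
# directly to a function. textwrap.dedent removes the common leading
# indentation, and quotes inside the text need no escaping.
text = textwrap.dedent("""\
    this is multi-line string
    escape chars: same as in strings
    but "no need to 'escape' quotes"
    """)

lines = text.splitlines()
print(len(lines))
print(lines[0])
```

The backslash after the opening quotes only suppresses the leading
empty line; the rest is ordinary string syntax.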

My reaction #3: Not a fan of adding an '===' operator. We already have
'=' and '=='. JavaScript is another language that uses '===' to mean
something completely different from this proposal (it's another
equality operator) and it's a mess there. Come up with something else.

My reaction #4: Come to think of it, in keeping with other assignment
operators like '+=' and '*=' and '//=', if 'a === b' must exist then
it should mean 'a = a == b'. :-)

> *Example 2. Tuples*
>
> Tuple is a data block with normal Python elements.
>
> data === T :
>     1    2    3    "hello"
>     a    b    c + d
>
> Here the separators of elements are : tab character,
> new line (and maybe some white-space character like em-space?)
> The above construct will be direct synonym for:
>
> data = (1, 2, 3, "hello", a, b, c + d)
>
> Benefits are easy to see: say I want a tuple of strings:
>
> data === T :
>     "foo bar"
>     "hello world"
>     "to be continued..."
>
> VS current:
>
> data = (
>     "foo bar" ,
>     "hello world" ,
>     "to be continued..." ,
>     )
>
> Main problem with the latter are commas, it is not easy
> to type

In what bizarro world are commas harder to type than tabs? At least if
I type a comma I know I'll get a comma. If I type a tab then it
depends on my editor settings what I'll actually get.

> and, whats more important - hard to notice missing commas.

But it's totally easy to notice missing tabs, right? Not.
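
A rough illustration of the problem, assuming fields are split purely
on the tab character. This uses str.split as a stand-in; it is not the
parser from the proposal, and the row contents are made up.

```python
# Two rows that look the same in many editors. The first uses a tab
# between every field; in the second, one tab was typed as four spaces.
row_tabs = '1\t2\t3\t"hello"'
row_mixed = '1\t2    3\t"hello"'

fields_tabs = row_tabs.split("\t")
fields_mixed = row_mixed.split("\t")

print(len(fields_tabs))
print(len(fields_mixed))
print(repr(fields_mixed[1]))
```

The second row yields one field fewer, because '2    3' is treated as
a single field, and nothing on screen distinguishes the two rows.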

> And brackets of course does not make it nicer either.
>
> The above are typical cases where I see clear win, and
> IMO they are quite common constructs.
>
> More complicated examples :
>
>
> *Example 3. Two-dimensional tuple.*
>
> data === T/T :

What about T/S? S/T? S/S? Are these allowed? How would they work? I
hope you see my point from above about how these syntaxes are not
really homomorphic.

Also, why is there a division operator here?

>     1    2    3    "hello"
>     a    b     c + d    e     f
>
> is a synonym for:
>
> data = (
>     (1, 2, 3, "hello") ,
>     (a, b, c + d, e, f ) )

If this is supposed to be a tabular format then why is the second row
allowed to have a different number of elements than the first row?
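
With the existing syntax the ragged shape is at least explicit, and it
is easy to check. A small sketch, using string stand-ins for the names
a, b, c, d, e and f, which are undefined in the example:

```python
# Nested tuples written with today's syntax; the brackets show that the
# two rows have different lengths.
data = (
    (1, 2, 3, "hello"),
    ("a", "b", "c + d", "e", "f"),
)

# A table is rectangular only if every row has the same length.
row_lengths = {len(row) for row in data}
is_rectangular = len(row_lengths) == 1
print(sorted(row_lengths))
print(is_rectangular)
```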

What about named fields? Is there a proposal for extending this syntax
to allow construction of dicts or namedtuples?
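
For reference, this is roughly how named fields look with the current
syntax. The type name 'Row' and its field names are hypothetical,
chosen only for illustration.

```python
from collections import namedtuple

# Hypothetical record type; the field names are not from the proposal.
Row = namedtuple("Row", ["x", "y", "label"])

rows = (
    Row(x=1, y=2, label="hello"),
    Row(x=3, y=4, label="world"),
)

# Each namedtuple can also be turned into a mapping of field to value.
as_dicts = [row._asdict() for row in rows]
print(rows[0].label)
print(as_dicts[1]["x"])
```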

> The rule here is: TAB character is inner elements' separator, and the
> new line is outer elements' separator. Line continuation
> character is  \  (to help with long lines).

So brackets and commas are too ugly, but line continuation characters
are just fine?

> *The benefits is just as in above examples :
> readability and 'typeability' boost.*

Sorry, I don't see it. For presentation of data this may be fine, but
for code inside a program I'm much more interested in the organization
of the data than the data itself. Commas and brackets make the
organization clear. Tabs obscure it.

> *Main rules: *
> - the definition is allowed  only as a statement (no passing as argument)

Point for the existing syntax, which can be used as an expression.
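
A short sketch of that point: a tuple display is an expression, so it
can be written inline as an argument. The helper 'longest' is a
made-up function for the example.

```python
def longest(items):
    """Return the longest string in items (illustrative helper)."""
    return max(items, key=len)

# The tuple is built and passed in one expression; no separate
# statement or temporary name is needed.
result = longest((
    "foo bar",
    "hello world",
    "to be continued...",
))
print(result)
```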

> - (imo) data starts always as a new line after the header
> - implicit string concatenation disallowed

So this would be a syntax error?

foo === T:
    "one" "two" "three"

That seems reasonable, but what about this?

foo === T/T:
    "this is a very " \
    "long string " \
    "needing to be " \
    "broken up over " \
    "several lines"
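
For contrast, the existing syntax handles this case with implicit
concatenation inside parentheses and no continuation characters. A
minimal sketch, assuming the intent is a one-element tuple holding a
single long string:

```python
# Adjacent string literals inside parentheses are joined at compile
# time; the trailing comma makes the outer parentheses a tuple.
foo = (
    (
        "this is a very "
        "long string "
        "needing to be "
        "broken up over "
        "several lines"
    ),
)
print(len(foo))
print(foo[0])
```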

> So the question is, how do you like such syntax

I see a lot of disadvantages and not many real advantages.


