Suggestion for a "data" object syntax

Mikhail V mikhailwas at gmail.com
Mon May 7 23:45:05 EDT 2018


Here is an idea for 'data object' a syntax.
For me it is interesting, how would users find such syntax.
I personally find that this should be attractive from users
perspective.
Main aim is more readable presenting of typical data chunks
and some typical data types (tuples/lists) directly in code.
Further this should significantly simplify typing.

*Example 1. Multi-line strings*

Here is a 3 lines multi-line string, it will be automatically 'dedented'
by the parser by one level of indent:

data === S :
    this is multi-line string
    escape chars: same as in strings (\\, \\n, \\t ...) ,
    but "no need to 'escape' quotes"

(Of course, assuming this feature will be added to the lexers
of your editor, otherwise it may mess up syntax highlighting)

So the proposed statement header is like:

variable === "type" :
    ...

The "===" token and type id's are of course subject to discussion.
"type" is type of data, which tells the parser to
convert data to specific type of Python object.
The above is type "S", a multi-line string.


*Example 2. Tuples*

Tuple is a data block with normal Python elements.

data === T :
    1    2    3    "hello"
    a    b    c + d

Here the separators of elements are : tab character,
new line (and maybe some white-space character like em-space?)
The above construct will be direct synonym for:

data = (1, 2, 3, "hello", a, b, c + d)

Benefits are easy to see: say I want a tuple of strings:

data === T :
    "foo bar"
    "hello world"
    "to be continued..."

VS current:

data = (
    "foo bar" ,
    "hello world" ,
    "to be continued..." ,
    )

Main problem with the latter are commas, it is not easy
to type and, whats more important - hard to notice missing commas.
And brackets of course does not make it nicer either.

The above are typical cases where I see clear win, and
IMO they are quite common constructs.

More complicated examples :


*Example 3. Two-dimensional tuple.*

data === T/T :
    1    2    3    "hello"
    a    b     c + d    e     f

is a synonym for:

data = (
    (1, 2, 3, "hello") ,
    (a, b, c + d, e, f ) )

The rule here is: TAB character is inner elements' separator, and the
new line is outer elements' separator. Line continuation
character is  \  (to help with long lines).

*The benefits is just as in above examples :
readability and 'typeability' boost.*

To present nesting of elements of higher than 2 levels,
normal Python syntax can be used for deeper nesting:

data === T/T :
    1    2    (3    4    5)
    a    b     c

Or maybe even allow commas:
data === T/T :
    1    2    (3, 4, 5)
    a    b     c

VS normal:

data = (
    (1, 2, (3, 4, 5) ) ,
    (a, b, c ) )

So the idea is to cover only first two levels of
nesting of course.

Further, common types may be defined, like e.g. L/L (list of lists)
but I don't want to go far into details and want to stop
on common cases to help evaluate the idea in general.


*Main rules: *
- the definition is allowed  only as a statement (no passing as argument)
- (imo) data starts always as a new line after the header
- implicit string concatenation disallowed


So the question is, how do you like such syntax, and
if so, what common examples can be fit in here.
Any ideas, comments are of course welcome.
Suggest some data and I'll try to express it in this syntax.



M



More information about the Python-list mailing list