[Python-ideas] new format spec for iterable types

Wed Sep 9 16:32:08 CEST 2015

Well, here it is:

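def unpack_format(iterable, format_spec=None):
    if not format_spec:
        # no spec given: fall back to default formatting of the container
        return format(iterable)
    try:
        # everything before the first '|' is the separator,
        # everything after it is the per-element format spec
        sep, element_fmt = format_spec.split('|', 1)
    except ValueError:
        raise TypeError('Invalid format_spec for iterable formatting')
    return sep.join(format(e, element_fmt) for e in iterable)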

usage examples:

# '0.00, 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00'
'{}'.format(unpack_format(range(10), ', |.2f'))

# '0.001.002.003.004.005.006.007.008.009.00'
'{}'.format(unpack_format(range(10), '|.2f'))

# raises TypeError: the format spec lacks the '|' delimiter
'{}'.format(unpack_format(range(10), '.2f'))
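
For anyone who wants to experiment with the star syntax from point 2 of my
quoted message without patching str.format, here is a rough sketch built on
string.Formatter. The names StarFormatter and _Unpack are made up for this
illustration, '|' is assumed as the <sep>|<fmt> delimiter, and the
auto-numbered form {*:...} is not handled. It is a prototype of the idea,
not a proposal for the actual implementation.

```python
import string


class _Unpack:
    """Hypothetical marker: wraps an iterable whose elements get formatted."""

    def __init__(self, iterable):
        self.iterable = iterable


class StarFormatter(string.Formatter):
    """Hypothetical prototype of the '{*name:<sep>|<fmt>}' replacement field."""

    def get_field(self, field_name, args, kwargs):
        # A leading '*' marks the field for element-wise formatting.
        if field_name.startswith('*'):
            obj, used_key = super().get_field(field_name[1:], args, kwargs)
            return _Unpack(obj), used_key
        return super().get_field(field_name, args, kwargs)

    def format_field(self, value, format_spec):
        # Without the '*', formatting works the current way.
        if not isinstance(value, _Unpack):
            return super().format_field(value, format_spec)
        # Split on the first '|' only; the delimiter is mandatory.
        sep, delim, element_fmt = format_spec.partition('|')
        if not delim:
            raise ValueError("missing '|' between <sep> and <fmt>")
        return sep.join(format(e, element_fmt) for e in value.iterable)


fmt = StarFormatter()
result = fmt.format('foo {*name:, |.2f} bar {0}', 'x', name=range(3))
```

Here result is 'foo 0.00, 1.00, 2.00 bar x'; fields without a leading * are
formatted as usual.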

Best,
Wolfgang

On 09.09.2015 16:02, Eric V. Smith wrote:
> At some point, instead of complicating how format works internally, you
> should just write a function that does what you want. I realize there's
> a continuum between '{}'.format(iterable) and
> '{<really-really-complex-stuff}'.format(iterable). It's not clear where
> to draw the line. But when the solution is to bake knowledge of
> iterables into .format(), I think we've passed the point where we should
> switch to a function: '{}'.format(some_function(iterable)).
>
> In any event, if you want to play with this, I suggest you write
> some_function(iterable) that does what you want, first.
>
> Eric.
>
> On 9/9/2015 9:41 AM, Wolfgang Maier wrote:
>> Thanks for all the feedback!
>>
>> Just to summarize ideas and to clarify what I had in mind when proposing
>> this:
>>
>> 1)
>> Yes, I would like to have this work with any (or at least most)
>> iterables, not just with my own custom type that I used for illustration.
>> So having this handled by the format method rather than each object's
>> __format__ method could make sense. It was just simple to implement it
>> in Python through the __format__ method.
>>
>> Why did I propose * as the first character of the new format spec string?
>> Because I think you really need some token to state unambiguously[1]
>> that what follows is a format specification that involves going through
>> the elements of the iterable instead of working on the container object
>> itself. I thought that * is most intuitive to understand because of its
>> use in unpacking.
>>
>> [1] Unfortunately, in my original proposal the leading * can still be
>> ambiguous because *<, *>, *= and *^ could mean either element joining
>> with <, >, = or ^ as separators, or aligning the container's formatted
>> string representation using * as the fill character.
>>
>>
>> Ideally, the * should be the very first thing inside a replacement field
>> - pretty much as suggested by Oscar - and should not be part of the
>> format spec. This is not feasible through a format spec handled by the
>> __format__ method, but through a modified str.format method, i.e.,
>> that's another argument for this approach. Examples:
>>
>> 'foo {*name:<sep>} bar'.format(name=<expr>)
>> 'foo {*0:<sep>} bar {1}'.format(x, y)
>> 'foo {*:<sep>} bar'.format(x)
>>
>>
>> 2)
>> As for including an additional format spec to apply to the elements of
>> the iterable:
>> I decided against including this in the original proposal to keep it
>> simple and to get feedback on the general idea first.
>> The problem here is that any solution requires an additional token to
>> indicate the boundary between the <separator> part and the element
>> format spec. Since you would not want to have anyone's custom format
>> spec broken by this, this boils down to disallowing one reserved
>> character in the <separator> part, like in Oscar's example:
>>
>> 'foo {*name:<sep>:<fmt>} bar'.format(name=<expr>)
>>
>> where <sep> cannot contain a colon.
>>
>> So that character would have to be chosen carefully (both : and | are
>> quite readable, but also relatively common element separators I guess).
>> In addition, the <separator> part should be non-optional (though the
>> empty string should be allowed) to guarantee the presence of the
>> delimiter token, which avoids accidentally splitting a lone element
>> format spec into a <sep> and a <fmt> part:
>>
>> # format the elements of name using <fmt>, join them using <sep>
>> 'foo {*name:<sep>:<fmt>} bar'.format(name=<expr>)
>> # format the elements of name using <fmt>, join them using ''
>> 'foo {*name::<fmt>} bar'.format(name=<expr>)
>> # a syntax error
>> 'foo {*name:<fmt>} bar'.format(name=<expr>)
>>
>> On the other hand, these restrictions do not look too dramatic given the
>> flexibility gained in most situations.
>>
>> So to sum up how this could work:
>> If str.format encounters a leading * in a replacement field, it splits
>> the format spec (i.e. everything after the first colon) on the first
>> occurrence of the <sep>|<fmt> separator (possibly ':' or '|') and does,
>> essentially:
>>
>> <sep>.join(format(e, <fmt>) for e in iterable)
>>
>> Without the *, it just works the current way.
>>
>>
>> 3)
>> Finally, the alternative idea of having the new functionality handled by
>> a new !converter, like:
>>
>> "List: {0!j:,}".format([1.2, 3.4, 5.6])
>>
>> I considered this idea before posting the original proposal, but, in
>> addition to requiring a change to str.format (which would need to
>> recognize the new token), this approach would need either:
>>
>> - a new special method (e.g., __join__) to be implemented for every type
>> that should support it, which is worse than my original proposal, or
>>
>> - the str.format method to react directly to the converter flag, which
>> is then no different from the above solution except that it uses !j instead
>> of *. Personally, I find the * syntax more readable, plus, the !j syntax
>> would then suggest that this is a regular converter (calling a special
>> method of the object) when, in fact, it is not.
>> Please correct me if I misunderstood something about this alternative
>> proposal.
>>
>> Best,
>> Wolfgang
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>