[Python-ideas] Add list.join() please

Jamesie Pic jpic at yourlabs.org
Tue Jan 29 08:27:16 EST 2019


Thanks for the advice, Jonathan. Can you clarify which documentation
topic you think should be improved or created? "Assembling strings"
or "inconsistencies between os.path.join and str.join"?

I've written an article to summarize, but I don't want to publish it
because my blog serves my lobbying for Python, not against it. Also,
I don't feel confident about it because I never had the luck to work
closely with core devs or other people with a lot more experience than
me, like those I can so easily find on the internet (thank you all, I
love you!). So, I deliver it here under the WTFPL license.

The mistake I'm still making after 10 years of Python

I really love Python, but there's a mistake I've been making over and
over again while assembling strings of all sorts in Python, and that I
have unconsciously ignored until now. Love it or hate it, but when you
start with Python it's hard to be completely indifferent to:

    '\n'.join(['some', 'thing'])

But then you read the kilometers of justifications that the Python
devs have given over the past 20 years and, well, you grow
indifferent to it: "that's the way it's gonna be if I want to use
Python".

But recently, I started to tackle one of the dissatisfactions I have
with my own code: how I assemble strings doesn't make me feel great
compared to the rest of what I'm doing with Python.

It strikes me that assembling strings in Python is something I have
done many times a day for 10 years, so taking some time to question
my own habits could prove helpful in the long run. The little story of
a little obsession...

## `os.path.join(*args)` vs. `str.join(arg)`

I'm living a dream with os.path.join:

    >>> os.path.join('some', 'path')
    'some/path'

But then I decide that cross-platform support is going to be too much
work, so why not join with slashes directly and only support free
operating systems:

    >>> '/'.join('some', 'path')
    TypeError: join() takes exactly one argument (2 given)

"Well ! I forgot about this for a minute, let's "fix" it and move on":

    >>> '/'.join(['some', 'path'])
    'some/path'

Ohhh, I'm not really sure in this case, isn't my code going to look
more readable with the os.path.join notation after all ?

Ten years later, I still make the same mistake, because 2 seconds
before doing a str join I was doing a path join. The fix is easy
because the error message is clear, so it's easier to ignore the
inconsistency, fix the call and move on. But what if this was an
elephant in the room that was just too easy to look away from?
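
To make the difference in call signatures concrete, here is a minimal sketch. It uses posixpath.join (the POSIX implementation behind os.path.join) as an assumption so that the result does not depend on the platform; the variable names are only illustrative.

```python
import posixpath

parts = ['some', 'path']

# Path joining takes each component as its own positional argument.
path_joined = posixpath.join(*parts)

# str.join takes exactly one argument: an iterable of strings.
str_joined = '/'.join(parts)

# Passing the components separately to str.join raises TypeError.
try:
    '/'.join('some', 'path')
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None
```

Both calls build the same string here, but only because the input is this simple: path joining also handles absolute components and trailing separators, which str.join knows nothing about.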

## Long f-strings vs. join

The new python format syntax with f-strings is pretty awesome, let's
see how we can assemble a triple quoted f-string:

    foo = f'''
    some
    {more(complex)}
    {st.ri("ng")}
    '''.strip()

Pretty cool right ? In a function it would look like this:

    def foo():
        return f'''
    some
    {more(complex)}
    {st.ri("ng")}
    '''.strip()

Ok so that would also work but we're going to have to import a module
from the standard library to restore visual indentation on that code:

    import textwrap

    def foo():
        return textwrap.dedent(f'''
        some
        {more(complex)}
        {st.ri("ng")}
        ''').strip()
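
One thing worth knowing about the dedent variant: textwrap.dedent runs after the f-string has been interpolated, so a substituted value that itself contains a newline can remove the common indentation that dedent relies on. The sketch below illustrates this; the function names and the sample value are hypothetical.

```python
import textwrap


def dedented(value):
    # dedent() only sees the final string, after {value} is substituted.
    return textwrap.dedent(f'''
        some
        {value}
        ''').strip()


def joined(value):
    # The join version never depends on source indentation.
    return '\n'.join(['some', value])


single = dedented('thing')
multi_dedented = dedented('first\nsecond')
multi_joined = joined('first\nsecond')
```

With a single-line value both versions agree. With the two-line value, the line `second` has no leading whitespace, so dedent finds no common margin and leaves the source indentation in the result, while the join version is unaffected.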

Let's compare this to the join notation:

    def foo():
        return '\n'.join('some', more(complex), st.ri('ng'))

Needless to say, I prefer the join notation for this use case. Not
only does it fit on a single line, it also doesn't require dedenting
the text with an imported function or juggling with quotes, and it
sort of looks like it would be more performant (I haven't measured).
All in all, I prefer the join notation to assemble longer strings.
Note that in practice, I use f-strings for the "pieces" that I want to
assemble, and that works great:

    def foo():
        return '\n'.join('some', more(complex), f'_{other}_')

Anyway, ok good-enough looking code ! Let's see what you have to say:

    TypeError: join() takes exactly one argument (2 given)

Oh, that again, kk gotfix:

    def foo():
        return '\n'.join(['some', more(complex), f'_{other}_'])

I should take metrics on the number of times I make this mistake
during a day, because it looks like it would be a lot (I switch
between os.path.join and str.join a lot).

## The 20-yr old jurisprudence

So, which of these two syntaxes looks more ergonomic:

    [
        'some',
        more(complex),
        f'_{other}_'
    ].join('\n')

    '\n'.join([
        'some',
        more(complex),
        f'_{other}_'
    ])

It seems there is a lot of friction when proposing to add a
convenience join method to the list type. I won't go over the
reasons for this here; there's already a lot to read about it on the
internet, written over the last 20 years.
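
For reference, the first notation can be approximated today with a list subclass. This is only a sketch: `JoinableList` is a hypothetical name, not something from the standard library, and a plain list literal such as `[...]` would still not have the method.

```python
class JoinableList(list):
    """Hypothetical list subclass with a convenience join method."""

    def join(self, sep):
        # Delegate to str.join, which does the actual work and still
        # raises TypeError if an item is not a string.
        return sep.join(self)


result = JoinableList(['some', 'thing']).join('\n')
```

Having to wrap every list before calling `.join()` is arguably no more ergonomic than `'\n'.join([...])`, which is part of why only a change to the built-in list would give the first syntax.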

## Conclusion

I have absolutely no idea what should be done about this, the purpose
of this article was just to share a bit of one of my obsessions with
string assembling.

Maybe it just strikes me that I assemble strings multiple times a day,
in a language I have 10 years of full-time experience with, and still
repeat the same mistake.

Not because I don't understand the jurisprudence, not because I don't
understand the documentation, or because the documentation is wrong,
but probably just because I switch between os.path.join and str.join,
which take their arguments differently, I think.

Perhaps the most relevant proposal here would be to extend the
str.join signature, which currently supports this notation:

    str.join(iterable)

So that it also supports this notation:

    str.join(arg1, ...argN)

So at least people won't be making mistakes when switching between
os.path.join and str.join. Perhaps something else?
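
As a rough illustration of what such a signature could mean, here is a sketch written as a standalone function. `join_args` is a hypothetical helper, not an existing or proposed API, and the rule for a single argument is an assumption made for this example.

```python
def join_args(sep, *parts):
    """Hypothetical helper accepting join(sep, a, b) or join(sep, [a, b])."""
    # A single non-string argument is treated as the iterable to join,
    # mirroring today's str.join(iterable).
    if len(parts) == 1 and not isinstance(parts[0], str):
        return sep.join(parts[0])
    # Otherwise every positional argument is one piece.
    return sep.join(parts)


from_args = join_args('/', 'some', 'path')
from_list = join_args('/', ['some', 'path'])

# The ambiguous case: a single string is itself an iterable of characters.
single_new = join_args('/', 'abc')
single_old = '/'.join('abc')
```

The last two lines show the catch: with a single string argument, today's str.join iterates over its characters, while this helper treats it as one piece. An extended str.join would have to pick one behaviour, and keeping the current one means the one-argument case stays a special case.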

Have a great day

