[Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='')

Boris Borcic bborcic at gmail.com
Wed Oct 27 15:16:46 CEST 2010


spir wrote:
> On Wed, 27 Oct 2010 03:09:23 +1100
> Steven D'Aprano<steve at pearwood.info>  wrote:
>
>> Boris Borcic wrote:
>>
>>> And let's then propagate that notion, to a *coherent* definition of
>>> split that makes it as well a method on the separator.
>>
>> Let's not.
>>
>> Splitting is not something that you do on the separator, it's something you
>> do on the source string. I'm sure you wouldn't expect this:
>>
>> ":".find("key:value")
>> =>  3
>>
>> Nor should we expect this:
>>
>> ":".split("key:value")
>> =>  ["key", "value"]
>>
>>
>> You perform a search *on* the source string, not the target substring.
>> Likewise you split the source string, not the separator.
>
> I completely share this view.

Pack behavior ! Where's the alpha male ? :)

> Also, when one needs to split on multiple seps, repetitive seps, or even
> more complex separation schemes, it makes even less sense to see split
> applying on the sep, instead of on the string.

Now that's a mighty strange argument, unless you think of /split/ as some sort 
of multimethod. I didn't mean to deprive you of your preferred swiss army knife :)

Obviously the algorithm must change according to the sort of "separation 
scheme". Isn't it then natural to expect the dispatch to follow the lines of 
Python's native object orientation ? Maybe, though, this is a case of the user 
overstepping into the private coding business of language implementors.
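
A minimal sketch of what dispatching on the "separation scheme" could look 
like, assuming a hypothetical free function split_by (not an existing or 
proposed API). It relies only on str.split, on the split method that compiled 
regular expressions already have, and on functools.singledispatch:

```python
import re
from functools import singledispatch


@singledispatch
def split_by(sep, source):
    """Hypothetical helper: choose the splitting algorithm from type(sep)."""
    raise TypeError(f"unsupported separator: {sep!r}")


@split_by.register(str)
def _(sep, source):
    # Plain string separator: exact-match splitting, empty parts kept.
    return source.split(sep)


@split_by.register(re.Pattern)
def _(sep, source):
    # Compiled pattern: the pattern object already owns a split method.
    return sep.split(source)


@split_by.register(type(None))
def _(sep, source):
    # No separator: split on runs of whitespace and drop empty parts.
    return source.split()


pairs = split_by(":", "key:value")
fields = split_by(re.compile(r"\s*[,;]\s*"), "a, b;c")
words = split_by(None, " some \t little   words  ")
```

The compiled-pattern case is the one place where today's standard library 
already hangs split on the separation scheme rather than on the source string.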

But on the user's own coding side, the more complex the "separation scheme", the 
more likely it is that code written to achieve it using /split/ is applied 
repeatedly to *changing* "source string"s. That in turn would justify binding 
the action name /split/ more tightly to the relatively stable "separation 
scheme" than to the relatively unstable "source string".

> Even less when splitting should remove empty parts generated by seps at both
> ends or by repeated seps. Note that it's precisely what split() without sep does:
>
>>>> s = " some \t little   words  "
>>>> s.split()
> ['some', 'little', 'words']
>>>> s.split(' ')
> ['', 'some', '\t', 'little', '', '', 'words', '', '']

/split/ behaves as it currently does, sure. If it were bound to the 
separator, s.split() could naturally be written ''.split(s) - so what's your 
point ? As I told Johnson, deeming ''.join(seqofstr) better-looking than 
sum(seqofstr) commits the same aesthetic sense to favoring ''.split...
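
To make the idea concrete, here is a sketch under stated assumptions: Sep is a 
hypothetical str subclass, not anything in Python, whose split takes the source 
string as its argument, with the empty separator standing in for today's 
s.split().

```python
class Sep(str):
    """Hypothetical separator type whose split() takes the source string."""

    def split(self, source, maxsplit=-1):
        if not self:
            # Empty separator plays the role of s.split(): whitespace runs.
            return source.split(None, maxsplit)
        return source.split(str(self), maxsplit)


# Bind the stable separator once, then apply it to changing source strings,
# the same way sep.join is commonly bound once and reused.
by_colon = Sep(":").split
records = [by_colon(line) for line in ("key:value", "a:b:c")]
words = Sep("").split(" some \t little   words  ")
```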

>
> Finally, in any of such cases, join is _not_ a reverse function for split.
> split in the general case is not reversible because there is loss of information.
> It is possible only with a pattern limited to a single sep, no (implicit)
> repetition, and keeping empty parts at ends. Very fine that python's split
> semantics is so defined, one cannot think of split as reversible in general (*).


Now that's gratuitous pedantry ! Note that given

f = sep.join
g = lambda t : t.split(sep)

it is true that

g(f(g(x)))==g(x)

and

f(g(f(y)))==f(y)

for any values of sep, x, and y that do not raise an exception. That 
covers all natural use cases, with the notable exception of s.split(), i.e. 
sep=None. That is clearly enough to justify calling, as I did, /split/ the 
"converse" of /join/ (note the order: sep.join is applied first, which 
eliminates sep=None as a use case).
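
The two identities can be checked directly. The sample values of sep, x and y 
below are arbitrary choices made for illustration; the last two assertions show 
which one-step round trip holds and which does not.

```python
sep = ":"
f = sep.join
g = lambda t: t.split(sep)

x = "key:value::tail"      # a source string, with a repeated separator
y = ["a", "b:c", ""]       # a sequence whose items may contain sep

assert g(f(g(x))) == g(x)
assert f(g(f(y))) == f(y)

# Joining what was just split restores the string exactly ...
assert f(g(x)) == x
# ... but splitting what was just joined need not restore the sequence,
# because the item "b:c" itself contains the separator.
assert g(f(y)) != y
```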

And iirc, the mathematical notion that best fits the idea is not that of

http://en.wikipedia.org/wiki/Inverse_function

but that of

http://en.wikipedia.org/wiki/Adjoint_functors

Cheers, BB



