[Python-ideas] itertools recipes: why not add them to the stdlib *somewhere*?

Nick Coghlan ncoghlan at gmail.com
Mon Jul 9 06:58:06 CEST 2012


On Mon, Jul 9, 2012 at 2:16 PM, alex23 <wuwei23 at gmail.com> wrote:
> So in the extremely likely outcome that a piece of functionality has
> multiple implementations with different interpretations of edge cases,
> who gets to decide which one is "more standard" than the other? If
> such a decision has to be made, then it's not really "standard".

This is often what lies at the heart of "not every 3 line function
needs to be <a builtin>/<in the standard library>".

It's very easy to hit a point of diminishing returns where the number
of possible alternatives means that attempting to create an abstraction
layer ends up creating a UI that is *more complicated* than just
writing your own utility function that does exactly what you want.
Grouping is one such API (fixed length? Variable length with a
delimiter? Pad last group? Drop last group? Error early? Accept
arbitrary iterables? Accept sequences only? Support overlapping
windows?). Recursive descent into arbitrary collections is another
(Error on reference loops? Treat strings/bytes/bytearray as atomic
types? Treat any implementor of the buffer interface as an atomic
type? Support only hashable objects?).
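
As a rough illustration of the grouping case, here is a minimal sketch
of three fixed-length variants that differ only in how they treat an
incomplete last group. The function names are made up for this example
and are not an existing stdlib API; delimiters, early errors and
overlapping windows are not covered at all.

```python
from itertools import islice, zip_longest


def grouper_pad(iterable, n, fillvalue=None):
    """Fixed-length groups; pad an incomplete last group with fillvalue."""
    args = [iter(iterable)] * n  # n references to the same iterator
    return zip_longest(*args, fillvalue=fillvalue)


def grouper_drop(iterable, n):
    """Fixed-length groups; silently drop an incomplete last group."""
    args = [iter(iterable)] * n
    return zip(*args)


def grouper_ragged(iterable, n):
    """Fixed-length groups; yield a shorter last group unchanged."""
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, n))
        if not chunk:
            return
        yield chunk
```

Each variant is only a few lines, yet a single function covering all
of them would need at least a mode flag and a fill value, and that is
before any of the other questions above are answered.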

There are a few key reasons why things end up in the standard library:

1. They're closely coupled to the language definition and
implementation and should be updated in sync with it (e.g. sys, imp,
gc, importlib, dis, inspect)
2. We want to use them in other parts of the standard library or its
test suite (e.g. collections, many of the unittest enhancements)
3. They solve a common problem that is otherwise prone to being
handled with incomplete solutions that lead to incorrect code (e.g.
ipaddress instead of regex based recipes for processing IP addresses)
4. It's an old module that would probably be left to PyPI these days,
but was added in earlier times when stdlib inclusion criteria were
less strict (but isn't broken so there's no harm in keeping it
around).
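
To make point 3 concrete, here is a small sketch comparing a typical
regex-based check with the ipaddress module. The pattern and the helper
names are assumptions chosen for illustration, not taken from any
particular published recipe.

```python
import ipaddress
import re

# A typical "good enough" pattern: four dot-separated runs of 1-3 digits.
NAIVE_IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def is_ipv4_naive(text):
    """Regex check; does not validate that each octet is in range 0-255."""
    return NAIVE_IPV4.match(text) is not None


def is_ipv4(text):
    """Let the ipaddress module do the parsing and range checking."""
    try:
        ipaddress.IPv4Address(text)
    except ValueError:
        return False
    return True
```

The regex version accepts "999.1.1.1", while the ipaddress version
rejects it, which is the kind of incomplete solution meant above.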

Sometimes a problem is sufficiently common, and has few enough
variations, that it's worth creating and providing a standard version.
That's the general aim of the itertools module. Other times, the core
problem is difficult enough that it's worth providing a standard
solution, even if it does mean including a vast array of configuration
options (e.g. subprocess.Popen).

However, sometimes, the correct answer to "Hey, this is a really
common pattern" is not "We should provide an API that uses that
pattern internally" but "we should document this pattern, so people
know it's a common idiom and can tailor it to their specific use case
and preferences".
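
The recursive descent case from earlier is one example of an idiom that
is easy to document and tailor. The sketch below makes one particular
set of choices (strings, bytes and bytearray are atomic, any other
iterable is descended into, reference loops are not detected); the name
flatten and its atomic parameter are hypothetical, and other choices
are equally reasonable.

```python
from collections.abc import Iterable


def flatten(obj, atomic=(str, bytes, bytearray)):
    """Yield the leaves of nested iterables.

    Instances of the `atomic` types are yielded whole instead of being
    iterated over. Reference loops are not detected.
    """
    if isinstance(obj, atomic) or not isinstance(obj, Iterable):
        yield obj
        return
    for item in obj:
        yield from flatten(item, atomic)
```

Someone who needs loop detection, or wants to treat dicts differently
(here they are iterated by key), can change a line or two rather than
learn a set of configuration options.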

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


