[stdlib-sig] Evolving the Standard Library

Wed Sep 16 15:27:23 CEST 2009

[snip]
> with shared state on module level, web applications are not.  It is true
> that Python currently has some issues with high concurrency and people
> try to fix that by forking and spawning new processes which certainly
> hides away the problem of shared state, but that does not solve it.

FWIW: Multiprocessing doesn't care about shared state; nor is it an
attempt to "get around" the shared state within the standard library.
Conflating concurrency issues with shared state within standard
library modules is not quite right. I do agree, however, that there
are some modules whose shared state is undesirable.

You do have to understand, though: while a large portion of the world
is moving onto the web, there are still many of us who simply don't
do "the online thing". We should strive to improve the web story, but
we cannot do so in a way that cripples, or makes more difficult, the
lives of people who are *not* web-heads.

> Now if we look at the standard library, we can see many modules that
> just do not work in such environments because they have some sort of
> shared state.  The most obvious ones are certainly the `locale` module
> and all the other modules that change behavior based on the locale
> settings.  Did you know that every major Python framework reimplements
> time formatting even for something as simple as HTTP headers, because
> Python does not provide a way to format the time to english strings
> reliably?  But there are certainly more modules that have this sort of
> problem.

Issues such as this are part of my motivation for starting the other thread.
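
For the HTTP header case specifically, here is a minimal sketch of a
formatter that does not consult the process-wide locale. It assumes a
reasonably modern Python where `email.utils.formatdate` accepts
`usegmt`; the helper name `http_date` is made up for illustration and
is not an existing standard library function.

```python
from email.utils import formatdate


def http_date(timestamp=None):
    """Return an HTTP-style date string for a POSIX timestamp.

    formatdate() uses hard-coded English day and month names, so the
    result does not change with locale.setlocale(), unlike
    time.strftime("%a, %d %b %Y ..."), which follows LC_TIME.
    Passing None formats the current time.
    """
    return formatdate(timestamp, usegmt=True)


if __name__ == "__main__":
    print(http_date(0))
```

This only covers the one fixed format used in HTTP headers; it is not
a general answer to locale-dependent formatting elsewhere in the
library.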

> Also we have many modules in the standard library that in my opinion
> just do not belong there.  From my point of view, stuff like XML does
> not belong into the standard library.  But it appears that not many
> people agree with me on this one.  But even if everybody would,
> backwards compatibility would still be a good reason to keep these
> modules around.

Each of us comes from a different problem domain - you might be
focused on the web, but I'm focused on daemons, tools, networking,
and glue. This difference between us exemplifies how hard it is to
find a common, objective "smell test" for what really belongs in the
standard library. Take your example - XML parsing. I would prefer One
Way To Do It in the standard library. I feel XML parsing (and JSON,
and YAML) is a critical thing to have in the standard library for a
variety of reasons.

> Besides modules that do not work in every environment or modules that
> were probably a mistake to include, we also have modules in the standard
> library with a hideous implementation or no reusability, forcing people
> to reinvent what's already there.
[snip]

And SimpleHTTPServer, and logging, and... Armin, some of us agree with
you, and again, this was part of my driving force in starting the
other thread proposing the logical breakout and subsequent cleanup.
Fred, Brett, and I have gone off to write PEPs outlining these tasks.
If you would like to contribute to those PEPs, email me off-list and I
will give you access. But you have to be nice ;)

> I wonder if the solution to this problem wouldn't be a largely improved
> packaging system and some sort of standardized reviewing process for the
> standard library.  Currently there is not even an accepted style for
> modules ending up in the Python distribution.  That, and a group of
> people, dedicated to standard library refactoring.  The majority of
> libraries in the standard library are small and easy to understand, I'm
> sure they are perfectly suited for students on projects like GSOC or
> GHOP to work on.  They could even be used as some sort of "playground"
> for new Python developers.

This was another point in the other thread; we need maintainers for
all of the modules. While there is no "guideline" per se for the code
that goes in, the process by which something gets in is outlined,
and the code is typically reviewed prior to inclusion by Python-Dev.

As for the packaging system: Tarek and Company are working on this,
and it is outside of the boundaries of the discussions on this list so
far. If you really want to help with packaging, you need to go over to
distutils-sig (and report back to us the traffic levels there ;)) or
contact Tarek directly.

> Ubuntu recently started the "100 paper cuts" project.  There people work
> on tiny little patches to improve the system, rather than to replace
> components.  Even though a large part of the standard library appears
> to be broken by design they could still be redesigned on the small
> scale, without breaking backwards compatibility.

We have over 170 patches in the tracker needing reviews. We have more
issues with patches that need docs and tests. More patches, while
welcome, still need someone to review them, apply them, and ensure
that they don't side-effect everything else, conceptually break
everything, and so on.

> Of course libraries like `locale` and `logging` are hard to change, but
> it would still be possible.  For `locale` it would probably be a useful
> idea to go into the direction of datetime, where the timezone
> information is left to a 3rd party library.  `locale` could provide some
> hooks for libraries like `babel` to fill the gap.  On the other hand
> `Cookie` would be very easy to fix by moving the parsing code into a
> separate function and refactoring the cookie objects.

And a 3rd party library adds a dependency to all the build bots,
consumers, apps, etc. out there. That dependency may not work on
Windows, OS X, or IRIX. This is partially the reason something like a
libxml dependency is right on out (sadly).

Again, agreed - but these modules need maintainers, people who care
enough about them to do the things you talk about. That's why I
started this tempest in a mailpot in the first place. It's not like I
enjoy replying to emails - I don't even get paid for it.

> We could probably also start a poll out there with well-selected
> questions of what users think about parts of the library.  And for that
> poll it would make a lot of sense to not just ask the questions and
> evaluating the results, but also track the area the user is coming from
> (small size company, open / closed source, web development etc.).
> Because we all are biased and seeing results grouped by some of these
> factoids could be enlightening.  That said, it could tell us that I'm
> completely wrong with my ideas of how the state of the standard library.

There are two things conflated here. One is "what do the users want"
and the other is "what can we maintain". They are not the same thing.
Brett already tried an informal poll:

http://sayspy.blogspot.com/2009/07/results-of-informal-poll-about-standard.html

While not entirely representative of the hundreds of companies, and
thousands of people out there using Python, it's a good place to
start. In fact, it's one of the data points I'm using in my "cleanup
PEP". Would you like to help?

> But how realistic is it to refactor the standard library?  I don't know.
> For a long time people were pretty sure Python will not get any faster
> and yet Unladen Swallow is making some really amazing progress.

Refactoring of the standard library, and its continued evolution, are
requirements for Python's survival. This is why I started the other
thread, and others contributed to it.

> And maybe we should have some elected task forces for things like the
> standard library.  Judging from the mailinglist it appears that far too
> many people are discussing *every detail* of it.  It is a good idea to
> ask as many people as possible, but I am not sure if the mailinglist is
> the way to do that.  It is currently very hard to see the direction in
> which development is heading.

Those of us who care about this are off writing PEPs. If you want to
help, you can. The discussion of every detail is a necessary "evil" -
and it comes with the territory. There is a time for discussion
though, and a time for work. David, Georg, Brett, Frank, and I are all
taking action items to go off and do, because you're right: actions
speak louder.

>
> Please think of this email just as a suggestion.  I don't have too much
> trust in myself to follow the discussions on this list calmly enough
> to become a real part of a solution, but I would love to help shifting
> the development into a better direction, no matter which one it will be.

If you can not follow this mailing list calmly to find the good
information, filter the fluff, and ultimately cherry pick and extract
the work necessary to move forward, you're going to dread the PEP
process.

Changes affect everyone; we cannot go and make them in a smoky, dimly
lit room. It runs counter to who and what we are. It's fine to be a
dictator when it's your own project (Jinja vs. Jinja2 comes to mind)
but discussion is needed, and healthy. You just need to filter the
good from the bad.

Armin, I agree with your sentiment; the feeling contained within it
is what motivated me to start the original discussion in
the *first place*. Yes, it caused a fair amount of discussion, some
good, some bad, some circular. But we also got some people working on
solid deliverables, which was the point.

If you would like to help write some PEPs, I'm open to collaborating.

Jesse