Difficulty Finding Python Developers

Paul Rubin http
Thu Apr 15 20:37:47 EDT 2004


Roy Smith <roy at panix.com> writes:
> > I think anyone using Python in a serious production project has to be
> > ready to extend or debug the library modules and/or write new C
> > extensions.
> 
> I guess it depends on what you mean by "serious production project".  
> I've played with several of the extension APIs just to get a feel for 
> what they're like, but I've never actually used any of them in anger.

Yes, the point is you never know when you're going to need to use it,
and having a deadline breathing down your neck isn't the right time to
learn about it.  Especially in a startup company, there are just too
many situations where the fate of the whole company depends on your
making something work RIGHT NOW, sometimes even with an investor or
customer looking over your shoulder.

> Of course, I've used plenty of add-on modules (DB, LDAP, SNMP, etc), but 
> I just downloaded them and started using them.  Certainly, *somebody* 
> had to muck about in the C API (or SWIG, or Boost, or whatever) to get 
> that done, but I've never felt the need.

Oh yes, that brings up another matter: there's tons of stuff that's
just plain *missing* from the library, like DB, LDAP, SNMP, web
templates, GUI components that look less crude than Tkinter,
cryptography, etc.  Sure, that stuff is around in various third party
libraries, but you have to get a sense of what kind of stuff is out
there and know where to find it, and that takes more acclimation to
Python than you can get in a weekend.  And now, instead of just
deciding that Python is stable enough for your needs, you have to make
a similar decision about each of those third party projects (see below).

> Of course, I learned C in the days when it was assumed that anybody 
> doing any "serious production project" in C would have to be able to 
> dive into assembler once in a while :-)

I think this is still true.  It's certainly happened in projects that
I've worked on recently.

> > There are just too many surprising little gaps in the
> > standard library and you run into them when you least expect it.
> 
> I'm sure there are, but can you give some examples of ones you've
> run up against?

Here are a few, besides the missing modules described earlier:

1) Python's regexp module has no way to search backwards in a string
   for a regexp, i.e. find the last occurrence of the regexp before
   location x.  This is useful for things like text editors and web
   robots (at one point I was writing a lot of web robots that wanted
   it). The underlying C library takes a direction flag, but the
   Python wrapper doesn't give any way to set the flag.  Python's string
   module has an rfind operation for substrings, but it doesn't do
   regexps.  I've had an SF bug open for this for at least a year.  I
   haven't bothered doing a patch, since I've been able to find kludgy
   workarounds for my specific requirements, but if I were trying to
   deliver a product to customers, then the kludges wouldn't be
   acceptable and I'd have to fix the library.

2) The socket module doesn't support ancillary messages for AF_UNIX
   sockets.  Ancillary messages are used to pass file descriptors
   between processes (e.g. you can have a daemon that gives
   unprivileged processes access to privileged network ports) and for
   passing credentials around (you can have a server check the login
   ID of a client process).  I opened an SF bug and was invited to
   submit a patch, which I might get around to doing sometime, but
   for now I simply dropped the feature I had in mind that needed
   it.  In a project with more urgency, I would
   again have had to stop what I was doing and code that patch.  I
   will have to do it if I ever release that particular piece of code
   to the public, since without the feature, the code can only run in
   some constrained ways that I don't want to impose on users other
   than myself.

3) There's no access to the system random number generator
   (CryptGenRandom) on Windows, needed for all kinds of security
   purposes (not just cryptography).  There are no published or
   supported third party modules for it either, AFAIK.  There's a nice
   one that's been floating around informally, but I know about that
   only by having been on the right mailing lists at the right time.
   It does look like that may make it into the library soon; however,
   in a production environment you can't rely on such luck.  When
   things need to happen, you have to be able to *make* them happen.

4) There's no built-in cryptography module, for partly technical and
   partly political reasons.  I found and downloaded a third-party AES
   module and it seemed to work, so I put it in my program.  But of
   course, any assurances you (or your manager) might feel about
   Python based on its code stability and QA process don't apply to
   third party modules.  And sure enough, the AES module had a memory
   leak that made it unusable in a long-running server.  Finding and
   fixing the leak took something like a whole day of figuring out not
   only how the C API worked but also how the SWIG wrapper worked.  I
   don't blame the module developer, since he had coded it as a
   personal project and released it for free, and he never made any of
   the kind of marketing claims for it that are sometimes made for
   Python.  But this is a case where you can deploy code into
   production and see it seem to work fine until your site gets a
   lot of traffic.  Then the site starts crashing, you have to fix it
   immediately, and there's NO way to fix it except by messing with
   the C API while you're under a lot of pressure.
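
To make item 1 concrete, here is a minimal sketch of one kludgy,
forward-scanning workaround.  It is an illustration only, not the
library fix that item asks for, and the helper name last_match_before
is made up for this example.  It relies on the pos/endpos arguments
that compiled patterns accept, scans the whole prefix from the start,
and only sees the non-overlapping matches a forward scan produces, so
it is slower and weaker than a real backward search would be.

```python
import re

def last_match_before(pattern, text, x):
    """Return the last forward-scan match of pattern within text[:x], or None.

    Hypothetical helper: scans forward from index 0 and keeps the final
    match, since the re module offers no backward search.
    """
    last = None
    # endpos=x makes the regex treat the string as if it ended at index x.
    for m in re.compile(pattern).finditer(text, 0, x):
        last = m
    return last

if __name__ == "__main__":
    m = last_match_before(r"\d+", "a1 b22 c333 d4444", 11)
    print(m.group(), m.start())
```

The cost grows with the distance from the start of the string to x,
which is exactly why this kind of thing is tolerable in a throwaway
robot and not in a text editor shipped to customers.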

There are also any number of missing things for which there's a
workaround, but you're not necessarily likely to discover that
workaround so easily.  For example, in cryptography you often have to
convert 1024 bit integers to 128-byte character strings and vice
versa.  How would you do it in Python?  It turns out there's a pretty
fast way (I'll let you figure it out for yourself) but at least in my
case, it took a fair amount of head scratching to hit on it.  I think
with just a weekend of Python exposure, it would have been quite hard
to spot something like that.
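
For readers who would rather not scratch their heads, one plausible
approach is sketched below.  Whether it is the "pretty fast way"
alluded to above is an assumption; the text does not say.  The idea is
to go through a hex string and let the binascii module do the byte
conversion in C.  The function names are made up for this example, and
the code is written for Python 3, where int.to_bytes and
int.from_bytes now do the same job directly.

```python
import binascii

def int_to_bytes(n, length=128):
    """Encode a non-negative integer as a big-endian string of `length` bytes."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    # Zero-pad to exactly two hex digits per output byte.
    hex_digits = "%0*x" % (2 * length, n)
    if len(hex_digits) > 2 * length:
        raise OverflowError("integer does not fit in %d bytes" % length)
    return binascii.unhexlify(hex_digits)

def bytes_to_int(s):
    """Decode a non-empty big-endian byte string back into an integer."""
    return int(binascii.hexlify(s), 16)

if __name__ == "__main__":
    n = (1 << 1023) + 12345          # an arbitrary 1024-bit integer
    s = int_to_bytes(n)
    print(len(s), bytes_to_int(s) == n)
```

Both directions avoid a Python-level loop over 128 bytes, which is
where the speed comes from.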

> > And the C API is quite cumbersome and not something a typical 
> > programmer can really come up to speed on in a weekend.
> 
> No debate about that, but the C API is (IMHO) a very advanced topic and 
> something that only a small fraction of Python programmers would ever 
> need to even know exists.  You certainly don't need to know about it to get 
> useful work done with a basic subset of the language.

I have no doubt that you can get useful work done with a basic subset,
but being a solid developer calls for a much higher standard than "get
useful work done".  You're trying to do things that nobody else is
doing, which pretty often means you're trying to push something to its
limits, and you can't always wait around for answers from other people.


