[Python-Dev] Import hassle

Martin Sjögren martin@strakt.com
Fri, 27 Jul 2001 10:34:13 +0200


On Thu, Jul 26, 2001 at 11:25:51AM -0400, Guido van Rossum wrote:
> > I've been writing quite a few mails lately, all concerning import
> > problems. I thought I'd write a little longer mail to explain what I'm
> > doing and what I find strange here.
>
> Martin,
>
> Why does this interest you?  This never happens in reality unless your
> memory allocator is broken, and then you have worse problems than
> "leaks".

Short answer: I want to do it Right<tm>
Long answer: I'm curious about how it works, and I found the import
statement very odd, what with importing "broken" modules and reloading
them and so on.  I could easily get Python to leak lots and lots of memory
by catching the failed import and then catching the failed reload() in an
infinite loop.
Basically, I like Python but Python could be better ;)

> Also, why are you posting to python-dev?

Good question.  I never seemed to get the answers I wanted from
python-list, so my first thought was to mail you personally seeing as "he
created the thing, surely he knows" but then I thought that it'd be a bit
rude so I thought "who else would know a lot about this?" and I came up
with the answer python-dev.  If I shouldn't have done this, I apologize,
but after fooling around with import and trying to figure out where and
what I should free when the init failed, I was mildly confuddled.

I posted this to python-dev too since I started out there, making my error
worse I guess.  Feel free to flame me :-)

[snip]

> > Even more interesting, say that I create a submodule and throw in a
> > bunch of PyCFunctions in it (I stole the code from InitModule since
> > I don't know how to fake submodules in a C module in another way, is
> > there a way?). I create the module, fail on inserting it into the
> > dictionary and DECREF it.  Now, that ought to free the darn
> > submodule, doesn't it? Anyway, I wrote a simple "mean" script to
> > test this:
> >
> > try: import Foo
> > except: import Foo
> > while 1:
> >   try: reload(Foo)
> >   except: pass
> >
> > And this leaks memory like I-don't-know-what!
> > What memory doesn't get freed?
>=20
> Memory leaks are hard to find.  I prefer to focus on memory leaks that
> occur in real situations, rather than theoretical leaks.

Agreed, though it's nice to do it Right<tm>, especially when I get asked
on the code review at work "shouldn't you free memory here?" and the only
thing I can reply is "nobody else does", and my boss says "just because
nobody else does it Right<tm>, there's no reason you shouldn't"

But what IS the Right<tm> way to do this anyway?

> > Now to my questions: What exactly SHOULD I do when loading my module fails
> > halfway through? Common sense says I should free the memory I've used and
> > the module object ought to be unusable.
>
> You should free the memory if you care.  "Disabling" the module is
> unnecessary -- in practice, the program usually quits when an import
> fails anyway.

Okay, so how about the situation where an import fails halfway through but
the things you need are initialized "before" that?  Say that you catch the
exception on import and check whether the things you need are there.  If
they are, fine.  If they aren't, fail.  I don't see this situation as
something that'd show up all the time, but it certainly is possible, isn't
it?  In that situation it would be nice if there were no memory leaks...

Then again, maybe I'm just foolish.

> > Why-oh-why can I import Foo, catch the exception, import it again and it
> > shows up in the dictionary? What's the purpose of this?
> >
> > How do I work with submodules in a C module?
> >
> > I find the import semantics really weird here, something is not quite
> > right...

> Consider two modules, A and B, where A imports B and B imports A.
> This is perfectly legal, and works fine as long as B's module
> initialization doesn't use names defined in A.
>
> In order to make this work, sys.modules['A'] is initialized to an empty
> module and filled with names during A's initialization; ditto for
> sys.modules['B'].
>
> Now suppose A triggers an exception after it has successfully loaded
> and imported B.  B already has a reference to A.  A is not completely
> initialized, but it's not empty either.  Should we delete B's
> reference to A?  No -- that's interference with B's namespace, and we
> don't know whether B might have stored references to A elsewhere, so
> we don't know if this would be effective.  Should we delete
> sys.modules['A']?  I don't think so.  If we delete sys.modules['A'],
> and later someone attempts to import A again, the following will
> happen: when A imports B, it finds sys.modules['B'], so it doesn't
> reload B; it will use the existing B.  But now B has a reference to
> the *old* A, not the new one.
>
> There are now two possibilities: either the second import of A somehow
> succeeds (this could only happen if somehow the problem that caused it
> to trigger an exception was repaired before the second attempted
> import), or the second import of A fails again.  If it succeeds, the
> situation is still broken, because B references the old, incomplete
> A.  If it fails, we may end up in an infinite loop, attempting to
> reimport A, failing, and catching the exception forever.  Neither is
> good.

Ah-hah.  Now I get it, thank you!
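
A minimal sketch of the A/B scenario, for illustration only.  It assumes a
current CPython 3 interpreter and two hypothetical modules, gr_a and gr_b,
written to a temporary directory (the names and the helper function are made
up for this example).  Note that current CPython, unlike the interpreter
discussed in this thread, removes a module from sys.modules when its import
fails.  That is exactly the "delete sys.modules['A']" option, so the stale
reference held by B becomes easy to observe:

```python
import importlib
import os
import sys
import tempfile


def demo_stale_reference():
    """Import gr_a (fails halfway), repair it, import again, inspect gr_b."""
    with tempfile.TemporaryDirectory() as tmp:
        a_path = os.path.join(tmp, "gr_a.py")
        b_path = os.path.join(tmp, "gr_b.py")
        # gr_a binds one name, imports gr_b, then fails halfway through.
        with open(a_path, "w") as f:
            f.write("early = 1\nimport gr_b\nraise RuntimeError('init failed')\n")
        # gr_b imports gr_a back and receives the partially initialized module.
        with open(b_path, "w") as f:
            f.write("import gr_a\n")

        sys.path.insert(0, tmp)
        try:
            try:
                importlib.import_module("gr_a")
            except RuntimeError:
                pass
            # Assumption: a modern CPython 3, which drops the failed module.
            failed_import_removed = "gr_a" not in sys.modules
            gr_b = sys.modules["gr_b"]   # gr_b itself imported successfully
            old_a = gr_b.gr_a            # the half-built module object

            # "Repair" gr_a and import it a second time.
            with open(a_path, "w") as f:
                f.write("early = 1\nimport gr_b\nlate = 2\n")
            importlib.invalidate_caches()
            new_a = importlib.import_module("gr_a")

            return {
                "failed_import_removed": failed_import_removed,
                "old_has_early": hasattr(old_a, "early"),
                "old_has_late": hasattr(old_a, "late"),
                "new_has_late": hasattr(new_a, "late"),
                "b_sees_new_a": gr_b.gr_a is new_a,
            }
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("gr_a", None)
            sys.modules.pop("gr_b", None)


if __name__ == "__main__":
    print(demo_stale_reference())
```

The second import succeeds, but gr_b is not executed again because it is
still in sys.modules, so gr_b.gr_a keeps pointing at the old, incomplete
module object rather than the new one.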

Martin

-- 
Martin Sjögren
  martin@strakt.com              ICQ : 41245059
  Phone: +46 (0)31 405242        Cell: +46 (0)739 169191
  GPG key: http://www.strakt.com/~martin/gpg.html