Module/package hierarchy and its separation from file structure

Peter Schuller peter.schuller at infidyne.com
Thu Jan 24 02:16:51 EST 2008


>> I do *not* want to simply break out X into org.lib.animal.x, and
>> have org.lib.animal import org.lib.animal.x.X as X.
>
> Nevertheless, that seems the best (indeed, the Pythonic) solution to
> your problem as stated. Rather than just shooting it down, we'll have
> to know more about what actual problem you're trying to solve to
> understand why this solution doesn't fit.

That is exactly what my original post was trying very hard to
explain. The problem is the discrepancy that I described between the
organization desired in terms of file system structure, and the
organization required in terms of module hierarchy. The reason it is a
problem is that, by default, the connection between file system
structure and module hierarchy in Python is (in my opinion) too
strong.

>> While this naively solves the problem of being able to refer to X as
>> org.lib.animal.X, the solution is anything but consistent because
>> the *identity* of X is still org.lib.animal.x.X.
>
> The term "identity" in Python means something separate from this
> concept; you seem to mean "the name of X".

Not necessarily. In part it is the name, in that __name__ will be
different. But to the extent that calling code can potentially import
them under different names, it is identity, because importing the same
module under two names results in two distinct module objects that
have no relation to each other. So for example, if a module has a
single global protected by a mutex, there are suddenly two copies of
it. In short: identity matters.
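
To make the point concrete, here is a minimal sketch of how one file
can end up loaded as two unrelated module objects. The names demo_pkg,
demo_state and registry are made up for the illustration, and putting
a package directory on sys.path is only one (assumed) way of getting
the double import:

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package: demo_pkg/__init__.py and demo_pkg/demo_state.py.
# (All names here are hypothetical, chosen only for the demonstration.)
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg_dir, "demo_state.py"), "w") as f:
    f.write("registry = []\n")

# Put both the parent directory and the package directory on sys.path,
# so the same file is importable under two different names.
sys.path[:0] = [root, pkg_dir]
importlib.invalidate_caches()

qualified = importlib.import_module("demo_pkg.demo_state")
bare = importlib.import_module("demo_state")

# One file on disk, but two module objects, each with its own global.
same_file = os.path.samefile(qualified.__file__, bare.__file__)
qualified.registry.append("only in the qualified copy")

print(same_file)                          # True
print(qualified is bare)                  # False
print(qualified.__name__, bare.__name__)  # demo_pkg.demo_state demo_state
print(bare.registry)                      # []
```

The append made through one name is invisible through the other, which
is the "two copies of the global" situation described above.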

>> Examples of ways this breaks things:
>> 
>>   * X().__class__.__name__ gives unexpected results.
>
> Who is expecting them otherwise, and why is that a problem?

Depends on the situation. One example is that if your policy is that
instances log using a logger named by the fully qualified name of the
class, then someone importing and using x.y.z.Class will expect to be
able to grep for x.y.z.Class in the output of the log file.
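
A rough sketch of that logging policy, using in-memory stand-ins for
the files so it runs on its own. The helper logger_for is hypothetical,
and the two ModuleType objects merely simulate org/lib/animal/x.py and
a re-exporting org/lib/animal/__init__.py:

```python
import logging
import types

# Hypothetical stand-in for the file org/lib/animal/x.py.
impl = types.ModuleType("org.lib.animal.x")
exec("class X:\n    pass\n", impl.__dict__)

# Hypothetical stand-in for org/lib/animal/__init__.py re-exporting X.
package = types.ModuleType("org.lib.animal")
package.X = impl.X


def logger_for(cls):
    """Return a logger named after the class's fully qualified name."""
    return logging.getLogger(f"{cls.__module__}.{cls.__qualname__}")


# The caller reaches the class as org.lib.animal.X ...
log = logger_for(package.X)

# ... but the logger is named after the defining module.
print(log.name)  # org.lib.animal.x.X
```

Someone grepping the log for org.lib.animal.X followed by a space or
end of line would find nothing, because the class still reports the
module it was defined in.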

>>   * Automatically generated documentation will document using the
>>   "real" package name.
>
> Here I lose all track of what problem you're trying to solve. You want
> the documentation to say exactly where the class "is" (by name), but
> you don't want the class to actually be defined at that location? I
> can't make sense of that, so probably I don't understand the
> requirement.

You are baffled that what I seem to want is for the location of the
class definition (the file on disk) to differ from the location
inferred from the module name. Well, this is *exactly* what I want
because, like I said, I do not want the strong connection between file
system structure and module hierarchy. The fact that this connection
exists is what is causing my problems.

Please note that this is not any kind of crazy-brained idea; lots of
languages have absolutely zero relationship between file location and
modules/namespaces.

I realize that technically Python does not enforce this connection
either. Like I said in the original post, I do realize that I can
override __import__ with any arbitrary function, and/or do magic in
__init__. But I did not want to resort to hacks, and would prefer
that there be some kind of well-established solution to the problem.
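
For those wondering what "magic in __init__" might look like, here is
one possible form, sketched under assumptions: the package name
demo_animal and the private files _monkey.py and _tiger.py are
invented, and rewriting __module__ is just one such hack, not an
established solution:

```python
import importlib
import os
import sys
import tempfile

# Hypothetical layout: one class per private file inside "demo_animal".
files = {
    "_monkey.py": "class Monkey:\n    pass\n",
    "_tiger.py": "class Tiger:\n    pass\n",
    "__init__.py": (
        "from ._monkey import Monkey\n"
        "from ._tiger import Tiger\n"
        "\n"
        "# Make the classes report the package, not the defining file.\n"
        "for _cls in (Monkey, Tiger):\n"
        "    _cls.__module__ = __name__\n"
    ),
}

# Write the package to a temporary directory and make it importable.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_animal")
os.mkdir(pkg_dir)
for name, source in files.items():
    with open(os.path.join(pkg_dir, name), "w") as f:
        f.write(source)

sys.path.insert(0, root)
importlib.invalidate_caches()
animal = importlib.import_module("demo_animal")

print(animal.Monkey.__module__)           # demo_animal
print(animal.Tiger().__class__.__name__)  # Tiger
```

This changes only the name the classes report. The submodules
demo_animal._monkey and demo_animal._tiger still exist and can still
be imported directly, which is part of why it feels like a hack.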

Although I was originally hesitant to use an actual example for fear
of giving the sense that I was trying to start a language war, your
answer above prompts me to do so anyway, to show in concrete terms
what I mean, for those that wonder why/how it would work.

So for example, in Ruby, there is no problem having:

File monkey.rb:

module Org
  module Lib
    module Animal
      class Monkey
        # ...
      end
    end
  end
end

File tiger.rb:

module Org
  module Lib
    module Animal
      class Tiger
        # ...
      end
    end
  end
end

This is possible because the act of addressing code to be loaded into
the interpreter is not connected to the namespace/module system, but
rather to the file system.

Some languages avoid (but do not eliminate) the problem I am having
without having this disconnect. For example, Java does have a strong
connection between file system structure and class names. However the
critical difference is that in Java, everything is modeled around
classes, and class names map directly to the file system structure. So
in Java, you would have the class

   org.lib.animal.Monkey

in

   <wherever>/org/lib/animal/Monkey.java

and

   org.lib.animal.Tiger

in

   <wherever>/org/lib/animal/Tiger.java

In other words, introducing a separate file does not introduce a new
package. This works well as long as you are fine with having
everything related to a class in the same file.

The problem is that in Python, not everything is a class, and a
file translates to a module, not a class. So you cannot split your
source across different files without introducing as many modules as
you introduce files.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller at infidyne.com>'
Key retrieval: Send an E-Mail to getpgpkey at scode.org
E-Mail: peter.schuller at infidyne.com Web: http://www.scode.org




More information about the Python-list mailing list