Best Practices for Internal Package Structure

Sven R. Kunze srkunze at mail.de
Tue Apr 5 13:38:50 EDT 2016


On 05.04.2016 03:43, Steven D'Aprano wrote:
> The purpose of packages isn't to enable Java-style "one class per file" coding,
> especially since *everything* in the package except the top level "bidict"
> module itself is private. bidict.compat and bidict.util aren't flagged as
> private, but they should be, since there's nothing in either of them that
> the user of a bidict class should care about.
>
> (utils.py does export a couple of functions, but they should be in the main
> module, or possibly made into a method of BidirectionalMapping.)
>
> Your package is currently under 500 lines. As it stands now, you could
> easily flatten it to a single module:
>
> bidict.py

I don't recommend this.

The line is blurry, but 500 lines is definitely too much. That many lines
simply won't fit on one or two generous screens anymore (which is
basically our guideline). The intention is to always have a bit more than
a full screen of code per file (no wasted pixels) while still benefiting
from switching to another file (and seeing a full page of other code there).
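
To make the guideline concrete, here is a rough sketch of how one could list
the modules that exceed such a budget. The threshold of 120 lines and the
helper names (MAX_LINES, count_lines, oversized_modules) are assumptions made
up for illustration, not something we actually run:

```python
from pathlib import Path

# Hypothetical budget: roughly two generous screens of code.
MAX_LINES = 120


def count_lines(path):
    """Return the number of lines in a text file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def oversized_modules(root, max_lines=MAX_LINES):
    """Return (path, line_count) pairs for .py files above the budget.

    The result is sorted with the longest file first.
    """
    results = []
    for path in Path(root).rglob("*.py"):
        n = count_lines(path)
        if n > max_lines:
            results.append((path, n))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Calling oversized_modules("path/to/package") then gives the candidates for
splitting; where exactly to cut is still a judgment call.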

That said, having looked at your package's code, it's quite well
structured, and you almost always have more than one name defined in
each submodule. So, it's fine. _frozen and _loose are a bit empty, but
well, let's not stretch the rules too far here.

I remember that some years ago we had files that regularly hit 3000 or
4000 lines of code. We systematically split those up, refactored them,
and took our time to name the resulting modules appropriately. Basically,
we went from:

base.py << trashcan for whatever somebody might need

to

base.py << really the base
domain_specific1.py  << something you can remember
domain_specific2.py  << ...
domain_specific3.py
domain_specific4.py


> Unless you are getting some concrete benefit from a package structure, you
> shouldn't use a package just for the sake of it.

I agree.

> Even if the code doubles
> in size, to 1000 lines, that's still *far* below the point at which I
> believe a single module becomes unwieldy just from size. At nearly 6500
> lines, the decimal.py module is, in my opinion, *almost* at the point where
> just size alone suggests splitting the file into submodules. Your module is
> nowhere near that point.

I disagree completely. After reading his package, the structure really 
helped me. So, I see a benefit.


I agree with Steven that hiding where a name comes from is a bit
problematic. Additionally, as we use PyCharm internally, 1) we don't see
imports regularly, 2) we don't create/optimize them manually anymore, and
3) we just don't care if an import is too long. So, it's fine for us, and
as PyCharm tries not to be overly clever when it comes to detecting names,
we like the direct way.
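
A small sketch of what "hiding where a name comes from" means in practice.
The package demo_pkg, its submodule _impl and the class Mapping2 are all
made-up names; the snippet builds a throwaway package in a temporary
directory only so that it can be run as-is:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a tiny throwaway package on disk. All names here are hypothetical.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()
(pkg / "_impl.py").write_text("class Mapping2:\n    pass\n", encoding="utf-8")
# The package's __init__ re-exports the name from the private submodule.
(pkg / "__init__.py").write_text("from ._impl import Mapping2\n", encoding="utf-8")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Short import: convenient, but the defining module is not visible here.
from demo_pkg import Mapping2 as short
# Direct import: longer, but it says where the name actually lives.
from demo_pkg._impl import Mapping2 as direct

# Both imports refer to the very same class object.
assert short is direct
print(short.__module__)  # the defining module: demo_pkg._impl
```

The short form is what you would want for users of the package; the direct
form is what an IDE will typically generate and what we prefer internally.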

In the case of our PyPI module, usability is really important for newbies
and for people not using sophisticated IDEs. So, making it really easy for
them is a must. :)

Best,
Sven


