A design problem I met again and again.

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Thu Apr 2 10:30:55 EDT 2009


On Thu, 02 Apr 2009 07:45:46 -0400, andrew cooke wrote:

> Lawrence D'Oliveiro wrote:
>>> What is the average size of source files in your project?   If it's
>>> far lower than 15,000,  don't you feel it's a little unbalanced?
>>
>> Why?
> 
> one reason is that it becomes inefficient to find code.  if you
> structure code as a set of nested packages, then a module, and finally
> classes and methods, then you have a tree structure.  and if you divide
> the structure along semantic lines then you can efficiently descend the
> tree to find what you want.  if you choose the division carefully you
> can get a balanced tree, giving O(log(n)) access time.  in contrast a
> single file means a linear scan, O(n).

What's n supposed to be? The number of lines in a file? No, I don't think 
so -- you said it yourself: "if you divide the structure along semantic 
lines then you can efficiently descend the tree to find what you want". 
Not "arbitrarily divide the files after n lines". If one semantic 
division requires 15,000 lines, and another semantic division requires 15 
lines, then the most efficient way to divide the code base is 15,000 
lines in one module and 15 lines in another.
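
To put rough numbers on the O(log(n)) versus O(n) claim, here is a minimal 
sketch. It assumes n counts the semantic units (functions, classes) you are 
searching among, not lines, and that every level of the package tree has 
the same fan-out. The names tree_steps and scan_steps are made up for 
illustration, and the model is deliberately crude.

```python
def tree_steps(n_units, fanout):
    """Levels to descend in a balanced tree with `fanout` children per node
    before a single leaf among n_units is reached (hypothetical model)."""
    steps = 0
    reach = 1  # number of leaves distinguishable after `steps` levels
    while reach < n_units:
        reach *= fanout
        steps += 1
    return steps


def scan_steps(n_units):
    """Average number of units inspected by a linear scan of one file,
    assuming the target is equally likely to be anywhere."""
    return (n_units + 1) / 2


# 1000 units, 10 children per package/module level
print(tree_steps(1000, 10), scan_steps(1000))
```

Each tree step still means choosing among up to `fanout` siblings, so the 
comparison only favours the tree when the division follows semantic lines 
and the choice at each level is obvious.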

Admittedly, I'd expect that any Python module with 15,000 lines 
(approximately 900KB in size) could do with some serious refactoring into 
modules and packages, but hypothetically it could genuinely make up a 
single logical, semantic whole. That's "only" four and a half times 
larger than decimal.py.
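
If you want to check figures like these against your own code base, 
something along these lines would do. It is only a sketch: source_stats is 
a hypothetical helper name, and it counts raw lines (blank lines and 
comments included) rather than anything cleverer.

```python
import os


def source_stats(path):
    """Return (line_count, size_in_bytes) for the source file at `path`."""
    # Read in binary mode so the count does not depend on the file's encoding.
    with open(path, "rb") as f:
        line_count = sum(1 for _ in f)
    return line_count, os.path.getsize(path)
```

For example, source_stats(decimal.__file__) after "import decimal" reports 
on whatever file that attribute points to. Depending on the Python version 
that may be a compiled file or a thin wrapper rather than the full source, 
so check the path before trusting the numbers.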

I can't imagine what sort of code would need to be that large without 
being divided into modules, but it could be possible.



-- 
Steven


