[Tutor] Python structure advice ?

Dave S pythontut at pusspaws.net
Fri Dec 17 21:29:00 CET 2004


Sorry for the delay, real world work took me away ...

>>everything was global, ....how you guys handle a modern structured
>>language
>
>Don't worry, this is one of the hardest bad habits to break.
>You are not alone. The easiest way is to just pass the data
>from function to function in the function parameters. It's not
>at all unusual for functions to have lots of parameters; "global"
>programmers tend to panic when they have more than a couple,
>  
>
yep !

>but it's not at all bad to have 5 or 6 - more than that gets
>unwieldy I admit, and is usually time to start thinking about
>classes and objects.
>
>  
>
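The advice above - pass data explicitly through parameters rather than via globals - might look like this minimal sketch (the sales-total names are invented for illustration):

```python
# Global style: hidden coupling, hard to test.
#
#   total = 0
#   def add_sale():
#       global total
#       total += price   # where does price come from?

# Parameter style: data flows in through arguments and out through returns.
def add_sale(total, price):
    return total + price

total = 0
for price in (10, 25, 5):
    total = add_sale(total, price)
print(total)  # 40
```

Because nothing is hidden in module state, each call can be tested in isolation.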
>>I have ended up with my application in several separate directories.
>
>Separate modules is good. Separate directories for anything
>other than big programs (say 20 or more files?) is more hassle
>than it's worth. The files are better kept in a single directory
>IMHO. The exception being modules designed for reuse...
>It just makes life simpler!
>  
>
I've tried to be hyper-organized and added my dirs in
/usr/lib/python2.3/site-packages/mypath.pth

/home/dave/mygg/gg1.3/live_datad
/home/dave/mygg/gg1.3/logger
/home/dave/mygg/gg1.3/utils
/home/dave/mygg/gg1.3/datacore
/home/dave/mygg/gg1.3
/home/dave/mygg/gg1.3/configs

This works OK but I sometimes have to search around a bit to find where 
the modules are.

Probably part of the problem is that I tend to write lots of small modules,
debug them & then import them into one controlling script. It works OK
but I start to drown in files, e.g. my live_datad contains ...

exact_sleep.py    exact_sleep.pyc    garbage_collect.py  garbage_collect.pyc
gg ftsed.e3p      gg ftsed.e3s       html_strip.py       html_strip.pyc
live_datad.py     valid_day.py       valid_day.pyc

When I get more experienced I will try & write fewer, bigger modules :-)
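One possible alternative to listing every directory in a .pth file is to make the project a package: put an `__init__.py` in each directory, and then only the parent directory needs to be on sys.path. A runnable sketch, with the package and module names invented for illustration (note a package name can't contain a dot, so gg1.3 itself would need renaming):

```python
import os
import sys
import tempfile

# Build a tiny throwaway package mirroring the layout discussed above.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "gg13")          # hypothetical package name
sub = os.path.join(pkg, "utils")
os.makedirs(sub)
open(os.path.join(pkg, "__init__.py"), "w").close()
open(os.path.join(sub, "__init__.py"), "w").close()
with open(os.path.join(sub, "helpers.py"), "w") as f:
    f.write("def double(x):\n    return x * 2\n")

# Only the package *root* goes on the path - one entry,
# instead of one .pth line per subdirectory.
sys.path.insert(0, root)
from gg13.utils.helpers import double
print(double(21))  # 42
```

With this layout the dotted import path also tells you exactly which directory each module lives in, which helps with the "searching around" problem.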

>
>>My problem is that pretty much all the modules need to fix where
>>they are when they exit and pick up from that point later on,
>
>There are two "classic" approaches to this kind of problem:
>
>1) batch oriented - each step of the process produces its own
>output file or data structure and this gets picked up by the
>next stage. This usually involves processing data in chunks
>- writing the first dump after every 10th set of input, say.
>This is a very efficient way of processing large chunks of
>data and avoids any problems of synchronisation, since the
>output chunks form the self-contained input to the next step.
>And the input stage can run ahead of the processing, or the
>processing ahead of the input. This is classic mainframe
>strategy, ideal for big volumes. BUT it introduces delays
>in the end-to-end process time; it's not instant.
>  
>
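A toy sketch of the batch approach described above: each chunk of input becomes its own self-contained output file for the next stage to pick up (the file names and the "doubling" processing are stand-ins):

```python
import json
import os
import tempfile

def batch_stage(records, chunk_size=10, outdir=None):
    """Write one output file per chunk of input; each file is the
    self-contained input to the next stage in the pipeline."""
    outdir = outdir or tempfile.mkdtemp()
    paths = []
    for i in range(0, len(records), chunk_size):
        # Stand-in for the real per-record processing.
        chunk = [r * 2 for r in records[i:i + chunk_size]]
        path = os.path.join(outdir, "chunk_%03d.json" % (i // chunk_size))
        with open(path, "w") as f:
            json.dump(chunk, f)
        paths.append(path)
    return paths

paths = batch_stage(list(range(25)))
print(len(paths))  # 25 records in chunks of 10 -> 3 files
```

Because every chunk file is complete in itself, the next stage can start on chunk 0 while this stage is still writing chunk 2.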
I see your point - like a static chain, one link calling the next & passing
data. The problem is that the links of the chain will need to
remember their previous state when called again, so their output is a
function of previous data + fresh data. I guess their state could be
written to a file, then re-read.
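Writing state to a file and re-reading it could look like this sketch using pickle (a running-total stage; the state file name is arbitrary):

```python
import os
import pickle
import tempfile

STATE_FILE = os.path.join(tempfile.gettempdir(), "stage_state.pkl")

def process(fresh, state_file=STATE_FILE):
    """Output = previous state + fresh data; the state survives between
    calls (or between runs) because it is pickled to disk."""
    if os.path.exists(state_file):
        with open(state_file, "rb") as f:
            total = pickle.load(f)
    else:
        total = 0  # first ever call: no saved state yet
    total += sum(fresh)
    with open(state_file, "wb") as f:
        pickle.dump(total, f)
    return total
```

Each call re-reads where the previous call left off, so the stage picks up from that point even across separate runs of the program.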

>2) Real-time serial processing - typically constructs a
>processing chain in a single process. Has a separate thread
>reading the input data
>
Got that working in live_datad ...

>and kicks off a separate processing
>thread (or process) for each bit of data received. Each
>thread then processes the data to completion and writes
>the output.
>
OK

> A third process or thread then assembles the
>outputs into a single report.
>
>  
>
Interesting ...

>This produces results quickly but can overload the computer
>if data starts to arrive so fast that the threads start to
>back up on each other. Also error handling is harder, since
>with the batch approach data errors can be fixed in the
>intermediate files, but with this an error anywhere means
>the whole data processing chain is broken, with no way
>to fix it other than resubmitting the initial data.
>
>  
>
An interesting idea - I had not thought of this approach as an option,
even with its stated drawbacks. It's given me an idea for some scripting
I have to do later on ...
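A minimal sketch of that reader-thread/worker-thread arrangement, using the standard threading and queue modules (squaring stands in for the real processing):

```python
import queue
import threading

def reader(q, items):
    """Input thread: feed data into the queue, then a sentinel."""
    for item in items:
        q.put(item)
    q.put(None)  # None signals "no more data"

def worker(q, results):
    """Processing thread: take each item to completion."""
    while True:
        item = q.get()
        if item is None:
            break
        results.append(item * item)  # stand-in for real processing

q = queue.Queue()
results = []
threads = [threading.Thread(target=reader, args=(q, range(5))),
           threading.Thread(target=worker, args=(q, results))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # [0, 1, 4, 9, 16]
```

The queue is the safety valve: if input arrives faster than the worker can process it, items back up in the queue (a bounded `queue.Queue(maxsize=...)` would make the reader block instead of letting it grow without limit).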

>>With my code now running to a few hundred lines
>>(Don't laugh this is BIG for me :-D )
>
>It's big for me in Python too; I've only written one program with
>more than a thousand lines of Python, whereas I've written
>many C/C++ programs in excess of 10,000 lines
>

Boy am I glad I chose to learn Python rather than C++ - I'd probably still
be at 'hello world' ;-)

>and worked
>on several of more than a million lines. But few if any
>Python programs get to those sizes.
>
>HTH,
>
>Alan G
>Author of the Learn to Program web tutor
>http://www.freenetpages.co.uk/hp/alan.gauld
>



More information about the Tutor mailing list