[IronPython] Kamaelia ETL - was ([Kamaelia-list] Kamaelia and IronPython)

Matt Clinton mclinton at procard.com
Thu Jul 12 18:15:33 CEST 2007


Michael Sparks said: 

>>Googling for that acronym, do you mean ETL in a data processing
>>context, a la http://en.wikipedia.org/wiki/Extract,_transform,_load ?

Exactly - filling a DB from various sources, QAing the values, etc.

>>>...
>>I don't see any reason why Kamaelia couldn't be used in that way - 
>>indeed it sounds like a natural fit. 

Lovely.  
Thank you - it seemed so: I wanted to hear if anyone else had thought of
it (and where those thoughts went).

My current context is throughput for encrypting financial data:
avoiding cleartext data at rest, with minimal speed impact, while on the
order of a million records are processed daily.  

Calling the encryption module as part of a stream seems more elegant
than running it as a distinct step.  Parallelism is good (fast).
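
To make the "encryption as a stage in the stream" idea concrete, here is
a rough sketch using plain Python generators rather than Kamaelia
components. Every name in it is hypothetical, and placeholder_encrypt is
only a reversible stand-in (base64), not real encryption; a real cipher
module would be passed in its place. Generators run the stages one
record at a time in a single thread, so this shows the streaming shape
only, not the parallelism.

```python
import base64


def extract(rows):
    """Source stage: yield records one at a time (here, from a list)."""
    for row in rows:
        yield row


def encrypt_stage(records, encrypt):
    """Transform stage: apply `encrypt` to each record as it streams past."""
    for record in records:
        yield encrypt(record.encode("utf-8"))


def load(records, sink):
    """Sink stage: append each record to `sink` and return the count."""
    count = 0
    for record in records:
        sink.append(record)
        count += 1
    return count


def placeholder_encrypt(data):
    # ASSUMPTION: stand-in only. base64 is NOT encryption; it is used here
    # so the sketch runs with the standard library alone.
    return base64.b64encode(data)


# Hypothetical sample records; only the transformed form reaches the sink.
rows = ["acct=1;amt=10.00", "acct=2;amt=25.50"]
stored = []
loaded = load(encrypt_stage(extract(rows), placeholder_encrypt), stored)
```

Because each record is transformed on its way to the sink, no
intermediate cleartext copy of the whole batch has to be written out
between steps.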

The rest of the solution architecture and the team's lack of Python
familiarity will likely prevent using Kamaelia directly for now (though
I'll try), but that's what was in my head prompting the question.

I often end up implementing value-add data analysis, and the slow
performance of running each full piece as a discrete step often turns
what could otherwise be a feed into a daily-load batch process.  
Your techniques could make changing that elegant - there'd be a lot of
retooling to move existing products over on the front end, 
but this would be a sweet way to design the back end of news-service
style data-slinging for new products.  

Thanks!
-- Matt



