[Numpy-discussion] data transit

Mon Dec 10 09:44:04 EST 2007

Hello there,

Indeed, the tasks you described correspond to what I'm seeking to
implement. The thing is, for the sake of encapsulation (and laziness
in the programming sense), I'm keeping responsibilities well defined
in several objects. I guess this type of coding is pretty much
ordinary for an OO person - it's just me having trouble with the
philosophy.

So, upon much thought, it breaks down like this:
- Crunchers: lowest-level objects that encapsulate what I do with
NumPy/SciPy functions on NumPy objects. Say, get data from arguments,
unbias the data, zero-stuff, FFT the set, etc. They are meant to be
written as needed.

- DataContainers: abstraction layer to data sources (DB, files, etc)
and to other data objects still in memory. Data returned by Crunchers
is stored inside - in practice, piped here by an Analysis object. So
far, I see no need for nesting DCs inside other DCs.

- Analysis: these are the glue between Crunchers, DataContainers and
the user (batch, GUI, CLI). An Analysis is instantiated by the user
and directs data flow both into DCs and out of them. Each Analysis
has one and only one 'results' attribute, which points to some place
within a DataContainer, but I imagine an Analysis built by chaining
several Analyses - just call Analysis.result() to access the data at
a given stage of processing.
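
A minimal sketch of how the three roles above could fit together. All
class names, method names and the dict-backed storage are hypothetical,
chosen only for illustration, and "zero-stuff" is read here as
zero-padding before the FFT, which is an assumption:

```python
import numpy as np


class UnbiasCruncher:
    """Cruncher: removes the mean (bias) from a 1-D signal."""

    def __call__(self, data):
        data = np.asarray(data, dtype=float)
        return data - data.mean()


class FFTCruncher:
    """Cruncher: zero-pads to n_points samples and takes the FFT."""

    def __init__(self, n_points):
        self.n_points = n_points

    def __call__(self, data):
        # np.fft.fft pads the input with zeros when n exceeds its length.
        return np.fft.fft(data, n=self.n_points)


class DataContainer:
    """Keyed in-memory store; a real one might also front DBs or files."""

    def __init__(self):
        self._store = {}

    def put(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store[key]


class Analysis:
    """Glue: fetches input, runs one Cruncher, pipes the output into a DC."""

    def __init__(self, cruncher, container, source, key):
        self.cruncher = cruncher
        self.container = container
        self.source = source   # raw array-like, or an upstream Analysis
        self.results = key     # where in the container the output lives

    def result(self):
        # Chained case: ask the upstream Analysis for its output first.
        if isinstance(self.source, Analysis):
            data = self.source.result()
        else:
            data = self.source
        self.container.put(self.results, self.cruncher(data))
        return self.container.get(self.results)


# Two chained stages sharing one container.
dc = DataContainer()
unbiased = Analysis(UnbiasCruncher(), dc, [1.0, 2.0, 3.0, 4.0], "unbiased")
spectrum = Analysis(FFTCruncher(8), dc, unbiased, "spectrum")
out = spectrum.result()
```

Calling result() on the last stage pulls data through the whole chain,
and each intermediate output stays available in the container under its
own key, so any stage can be inspected afterwards.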

Well, so it is. Hopefully this setup will lend a good degree of
flexibility to my application, which I will need: the crunchers are
hard to develop, since I haven't seen the data yet.

Nadav: I had looked into PyTables while investigating the low-level
interfaces that would have to be supported. It sits at a much lower
level than what I was looking for - my DataContainers do build on
other classes which are responsible for talking to DBs, files and the
like - but it is the design of the containers themselves that's hard
to conceive!

Cheers,

Renato

On Dec 7, 2007 6:48 PM, Alan Isaac <aisaac at american.edu> wrote:
> It sounds like you want a new class X that does three things:
> knows where the data is and how to access it,
> knows how the data are to be processed and can do this when asked,
> is able to provide a "results" object when requested.
> The results object can store who made it and with what
> processing module, thus indirectly linking to the data and techniques
> (which the X instance may or may not store, as is convenient).
>
> fwiw,
> Alan Isaac