2.6, 3.0, and truly independent interpreters

Michael Sparks ms at cerenity.org
Tue Oct 28 05:34:31 EDT 2008


Glenn Linderman wrote:

> so a 3rd party library might be called to decompress the stream into a
> set of independently allocated chunks, each containing one frame (each
> possibly consisting of several allocations of memory for associated
> metadata) that is independent of other frames

We use a combination of a dictionary plus raw RGB data for this purpose. Using
a dictionary works out pretty nicely for the metadata, and one key simply
holds the frame data as a binary blob.
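
Roughly speaking - simplifying, and making up the exact key names purely for
illustration - a frame ends up looking something like this:

width, height = 720, 576
frame = {
    "rgb"        : b"\x00" * (width * height * 3),  # decoded pixels, one binary blob
    "size"       : (width, height),
    "pixformat"  : "RGB_interleaved",
    "frame_rate" : 25,
}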

http://www.kamaelia.org/Components/pydoc/Kamaelia.Codec.YUV4MPEG gives some
idea of the structure and usage. The example given there is this:

Pipeline( RateControlledFileReader("video.dirac",readmode="bytes", ...),
          DiracDecoder(),
          FrameToYUV4MPEG(),
          SimpleFileWriter("output.yuv4mpeg")
        ).run()

Now all of those components are generator components.
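
For anyone not familiar with what "generator component" means here, the
following is a heavily stripped-down sketch of the idea - it is not the real
Axon/Kamaelia API, just plain generators plus a toy round-robin scheduler:

def producer(outbox):
    for i in range(5):
        outbox.append("chunk %d" % i)   # pretend this is decoded frame data
        yield                           # hand control back to the scheduler
    outbox.append(None)                 # sentinel: no more data

def consumer(inbox):
    while True:
        while inbox:
            item = inbox.pop(0)
            if item is None:
                return                  # producer finished, so shut down
            print("got %s" % item)
        yield

def run(*components):
    # Toy scheduler: advance each generator in turn until all have finished.
    gens = list(components)
    while gens:
        for g in list(gens):
            try:
                next(g)
            except StopIteration:
                gens.remove(g)

link = []                               # shared "mailbox" between the two
run(producer(link), consumer(link))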

That's useful since:
   a) we can structure the code to show what it does more clearly, and it
      still runs efficiently inside a single process
   b) we can change this over to using multiple processes trivially:

ProcessPipeline(
          RateControlledFileReader("video.dirac",readmode="bytes", ...),
          DiracDecoder(),
          FrameToYUV4MPEG(),
          SimpleFileWriter("output.yuv4mpeg")
).run()

This version uses multiple processes (under the hood it uses Paul Boddie's
pprocess library, since this support predates the multiprocessing module in
Python).

The big issue with *this* version, however, is that because pprocess (and
friends) pickle data to send it across OS pipes, the data throughput here
would be lousy. Specifically, in this example, if we could change it such
that the high level API was this:

ProcessPipeline(
          RateControlledFileReader("video.dirac",readmode="bytes", ...),
          DiracDecoder(),
          FrameToYUV4MPEG(),
          SimpleFileWriter("output.yuv4mpeg"),
          use_shared_memory_IPC = True,
).run()

That would be pretty useful, for some hopefully obvious reasons. I suppose
ideally we'd use shared_memory_IPC for everything and simply go back to
this:

ProcessPipeline(
          RateControlledFileReader("video.dirac",readmode="bytes", ...),
          DiracDecoder(),
          FrameToYUV4MPEG(),
          SimpleFileWriter("output.yuv4mpeg")
).run()

But essentially, for us, this is an optimisation problem, not a "how do I
even begin to use this" problem. Since it is an optimisation problem, it
also strikes me as reasonable to special-case and specialise such links
until you get an approach that works well for general purpose data.
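
To give a flavour of what shared memory IPC means in practice (a sketch only,
Unix specific, and nothing to do with how ProcessPipeline/pprocess currently
work): the decoded frame bytes live in an anonymous shared mmap, so the bulk
data never has to be pickled and copied across a pipe. A real system would
still need a small synchronisation message per frame; here the parent just
waits for the child:

import mmap
import os

FRAME_SIZE = 720 * 576 * 3              # one RGB frame, for illustration
shared = mmap.mmap(-1, FRAME_SIZE)      # anonymous mapping, shared across fork()

pid = os.fork()
if pid == 0:
    # Child ("decoder"): write the frame straight into shared memory.
    shared.seek(0)
    shared.write(b"\x80" * FRAME_SIZE)  # stand-in for real decoded pixels
    os._exit(0)
else:
    os.waitpid(pid, 0)
    # Parent ("writer"): read the frame back - no pickling, and no copying
    # the payload over a pipe.
    shared.seek(0)
    frame = shared.read(FRAME_SIZE)
    assert len(frame) == FRAME_SIZE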

In theory, poshmodule.sourceforge.net, with a bit of TLC, would be a good
candidate, or at least a good starting point, for that optimisation work
(since it does work on Linux, contrary to a reply in the thread - I've not
tested it under Windows :).

If someone's interested in building that, redoing our MiniAxon tutorial
using processes & shared memory IPC rather than generators would be a
relatively gentle/structured approach to dealing with this:

   * http://www.kamaelia.org/MiniAxon/

The reason I suggest that is that any time we think about fiddling with, or
creating, a new optimisation or concurrency approach, we tend to build a
MiniAxon prototype to flesh out the various issues involved.


Michael
--
http://www.kamaelia.org/Home



