About alternatives to Matlab

Filip Wasilewski filipwasilewski at gmail.com
Thu Dec 14 18:10:39 EST 2006


Jon Harrop wrote:
> Filip Wasilewski wrote:
> > Jon Harrop wrote:
> >> Filip Wasilewski wrote:
> >> > Jon, both Python and Matlab implementations discussed here use the
> >> > lifting scheme, while yours is a classic convolution based approach.
> >>
> >> I've done both in OCaml. The results are basically the same.
> >
> > Have you tried taking advantage of the 50% reduction of arithmetic
> > operations for the lifting scheme?
>
> Yes. The time taken is dominated by memory accesses. The amount of
> arithmetic is almost irrelevant.

I can say from my experience that this depends a lot on the input
size and the CPU cache size. For arrays of about 2^18 elements or
less and a 2MB cache (2^18 doubles is exactly 2MB, so the data still
fits in the cache), the speedup for the straightforward C lifting
implementation, with all operations done in-place, is around 50%. For
bigger arrays, traversing memory several times becomes too expensive,
at least until the input is so big that the memory gets swapped and
the in-place method becomes faster again. I suppose that such
behavior should be quite similar for other low-level implementations
as well.

Actually, I have started to wonder whether I could automate the
conversion of multi-loop lifting schemes [1] into a one-pass loop.
This could be done by delaying the computation of each "coming next"
step by just a few elements, until the values from the preceding step
are ready to use, roughly as in the sketch below. This should reduce
the random memory access effect in several cases and make the timings
quite stable.

[1] http://en.wikipedia.org/wiki/Lifting_scheme
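
To make the idea concrete, here is a minimal Python sketch of mine
(not code from any of the benchmarks discussed here) that fuses the
three passes of the D4 lifting steps from [1] into a single loop.
Each pass runs one element behind the previous one, so its inputs are
already computed; border handling is left out to keep it short:

from math import sqrt

def d4_lifting_fused(x):
    # x: even-length sequence of floats; returns (approx, detail)
    n = len(x) // 2
    s = [0.0] * n
    d = [0.0] * n
    k_s = (sqrt(3) - 1.0) / sqrt(2.0)   # final approximation scaling
    k_d = (sqrt(3) + 1.0) / sqrt(2.0)   # final detail scaling
    for i in range(n):
        # pass 1: s1[i] = x[2i] + sqrt(3)*x[2i+1]
        s[i] = x[2*i] + sqrt(3) * x[2*i + 1]
        if i >= 1:
            # pass 2, one step behind: d1[i] uses s1[i] and s1[i-1]
            d[i] = (x[2*i + 1] - sqrt(3) / 4.0 * s[i]
                    - (sqrt(3) - 2.0) / 4.0 * s[i - 1])
            # pass 3, one more step behind: s2[i-1] = s1[i-1] - d1[i],
            # so the final scaling can be applied immediately
            s[i - 1] = (s[i - 1] - d[i]) * k_s
            d[i] *= k_d
    # d[0] and s[n-1] still need (e.g. periodic) border treatment;
    # that requires saving s1[0] before it gets overwritten above
    return s, d

With periodic borders the two remaining coefficients can be fixed up
after the loop, so the whole transform stays a single pass.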

> >> It makes sense because they solve the same problem. When you're comparing
> >> non-trivial problems between disparate languages you are not likely to
> >> use the same algorithms. Using built-in hash functions is an obvious
> >> example.
> >
> > I don't even ask if you have checked the output result coefficients
> > because I'm pretty sure you have not.
>
> I checked the coefficients against other DWTs from the web.
>
> > They solve similar problem but in
> > different ways and give a bit different results, for example the
> > layout of coefficients in the memory is different as well as border
> > distortion handling. Please don't tell me it's better to do something
> > else instead of presenting a relevant solution.
>
> I used a more efficient approach because I could. Using a compiled language
> gave me that choice. We should compare both approaches in both languages.

I am getting a bit tired of this discussion. I am not arguing about
which approach is faster, because that very much depends on the use
case. I am just asking you to do a fair comparison using the original
algorithm presented here, instead of backing your arguments with a
not-so-relevant variant that happens to be convenient for you.

Every language (or at least the vast majority of them) that has loops
and basic arithmetic gives you that choice, compiled or not. These
are pretty simple algorithms when limited to the single D4 case, but
D4 is not the only wavelet transform known to mankind, and there are
situations where you do not have such a choice, because some problems
just can't be expressed using the classic approach. If OCaml is as
superior as you claim, please show me how to do the lifting scheme
(preferably in the form of
http://groups.google.com/group/comp.lang.python/msg/a71a5bf00a88ad57
with the features described in my previous post), so that I can
actually learn something new and maybe even start to consider it as a
possible alternative to the C-ish language family.

As far as the DWT alone is concerned, well, if I were to choose the
most efficient approach (in terms of both CPU time and development
time) for Python, I would use a ready-made extension instead of
reinventing the wheel. Let me think, that should go something like
this:

import pywt
# multilevel DWT: Daubechies-2 wavelet, periodization border mode;
# returns a list of coefficient arrays [cA_n, cD_n, ..., cD_2, cD_1]
coeffs = pywt.wavedec(x, 'db2', 'per')

> >> No. The effects you are talking about are swamped by the interpreted vs
> >> compiled effect.
> >
> > Have I already mentioned it was observed with low-level C
> > implementation on x86? I would really appreciate you took some time on
> > exploring the unfortunate topic of this benchmark before posting
> > another comment like that.
>
> We are talking about this Python implementation, not some "low-level C
> implementation on x86". With Python, it is the overhead of the interpreter
> that makes it slow and encourages you to obfuscate your code in order to
> make it run in a reasonable amount of time.

I was talking about why you should not limit your benchmarks to one
single case, and I gave an example of how the choice of parameters
can influence the results. It has nothing to do with the interpreted
vs. compiled harangue. Would you argue that some O(n^2) algorithm is
always better than its O(n) alternative just because it has lower
call overhead and performs better under some limited circumstances?

> > What I would really like to see is a lifting scheme
> > implementation that can take 1- to 10-dimensional arrays
>
> I believe that is just a case of abstracting get/set.
>
> > (single and double precision floats,
>
> That is also a case of abstracting get/set.
>
> > convert on the fly from integer types when necessary
>
> That is dynamic typing.
>
> > and handle strided, non-contiguous arrays),
>
> That is also an abstraction of get/set.

Oh my, do all of these abstractions have to be implemented from
scratch every single time?

> > axis and maximum decomposition level as input
>
> I don't know what that is.

When doing 1D operations on a multidimensional array it is often
convenient to be able to specify along which axis these operations
should be performed, for example:

>>> import numpy
>>> x = numpy.ones((2,4))
>>> x.sum(axis=0)
array([ 2.,  2.,  2.,  2.])
>>> x.sum(axis=1)
array([ 4.,  4.])

The maximum level denotes how many decomposition steps should be
performed for a multilevel transform, just like described at
http://www.mathworks.com/access/helpdesk/help/toolbox/wavelet/wavedec.html
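
In terms of single-level steps, a multilevel decomposition is roughly
the loop below (a sketch only; x stands for the input array and level
for the requested decomposition depth):

import pywt
a = x                      # approximation from the previous level
details = []
for i in range(level):
    # one decomposition step: split the current approximation
    # into a coarser approximation and a detail part
    a, d = pywt.dwt(a, 'db2', 'per')
    details.append(d)
# same coefficient layout as wavedec: [cA_n, cD_n, ..., cD_1]
coeffs = [a] + details[::-1]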


> > and return list of views (or arrays if necessary)
>
> Probably just a closure.
>
> > with transform coefficients for every level of
> > decomposition. Can do?
>
> Sure. Sounds like a lot of work though.

See? It is certainly a lot of work to create a general solution in C.
It is much easier to focus on the actual problem when using Python
(with NumPy or another package from the wide range of available
extensions) and to spare yourself most of the implementation details
while still maintaining reasonable performance. That is one of the
most important reasons why I prefer a high-level language for most of
my tasks.

I also try to avoid premature optimizations that would spare some CPU
cycles, only to find a moment later that the performance gain has
been completely swallowed by the I/O overhead of some remote data
access point. For the remaining intensive number crunching, if no
specialized extension is available yet, I still prefer the
Python/Pyrex/C (or similar) combo, which gives a solution that can be
easily integrated with the other parts of a bigger system (networking,
databases, reports, web, etc. - wow, how accessible the transition
from C++ and Java a few years ago made all of these) or simply used
in the interactive mode.

> This would be best done in F#
> because you can make your own IList derivatives with get_Item and set_Item
> that provide views but look like arrays.
>
> > Are you telling I have to define a different operator for every
> > arithmetic operation for every two types?
>
> In OCaml, yes. Not in F#.
>
> > That definitely does not look
> > like a quick, flexible and clear solution to start with. Any working
> > example I could type into OCamlWinPlus of the following would be very
> > appreciated. I'm fairly new to this and would really like to see how to
> > handle differently shaped arrays and data types in the interactive mode
> > without much hassle:
> >
> > from numpy import ones, arange, reshape, int32
> > a = ones((2,3,4,5))
> > b = ones(a.shape, dtype=int32)
> > c = ones((3,4,5))
> > d = 2*a + 2.5*(3*b + 3.3)
> > d[0] = d[0] + 5
> > d[1] *= c * (2+3*1.2)
> > d[:,2,...] = reshape(arange(d[:,2,...].size), d[:,2,...].shape)
> > print d[...,1]
>
> I can't get that to work in Python. What do I need to do?

Type it exactly as presented above, starting from the very first
line. The `from module import ...` statement imports the chosen
identifiers into the current namespace, so you don't have to prefix
them with the module name.
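
For example, both forms below work; the second is what the snippet
above relies on:

>>> import numpy
>>> numpy.ones(3)          # qualified name
array([ 1.,  1.,  1.])
>>> from numpy import ones
>>> ones(3)                # bare name, no prefix needed
array([ 1.,  1.,  1.])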

> >>> import numpy
> >>> a = ones((2,3,4,5))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> NameError: name 'ones' is not defined
>
> > The ray tracer example is really nice, however the sudoku solver is not
> > so impressive when compared to the K thingy ;-) -
> > http://markbyers.com/moinmoin/moin.cgi/ShortestSudokuSolver.
>
> Well, OCaml does ok on the sudoku but K does awfully on the ray tracer.

That's pretty natural. Different languages are designed to solve
different tasks. I hope your point is not to convince people to use
OCaml to solve (or complicate) every possible problem.

> > Where can I find
> > documentation or tutorial on using Bigarrays?
>
> http://caml.inria.fr/pub/docs/manual-ocaml/libref/Bigarray.html

Yes, I found this before, but it is not a very helpful starting
point. Any doc with basic use cases and examples would be much
better. Think of it as a possible obstacle for people trying to learn
the language.

> You probably want to use F# though.

On the other hand, I wonder how well this integrates with other .NET
languages like C# or IronPython.

cheers,
fw



