Review: Canny

Mon Apr 4 22:56:48 EDT 2011

Hey Dan,

I just had a look at this code, and there are some things I would do before
pulling in the changes:

   - Document which dtype is expected of the input image. It appears to work
   for floats, int, uint8 etc, but each gives a different output result. Since
   the tests are using floats, I assume a floating point image is expected.
   This should be documented in the docstring along with the expected range of
   the image. i.e. is it a 0.0 - 1.0 float image or a 0.0 - 255.0 float image.
   Also, the docs should state it only works on 2D grayscale images, so the
   user needs to convert their color image beforehand.
   - Document the valid ranges of values for the low threshold, and high
   threshold.
   - There is a lot of use of np.logical_* functions. These are around 20%
   slower than numpy's overloaded bitwise & and | operators. It seems these
   logical_* functions are being applied to boolean images so the two
   operations (logical and bitwise) are equivalent. I would use the faster of
   the two. It's faster and easier to read.
   - ndimage has a sobel function, was there a particular reason you chose
   to do a 2d convolution instead? 2 separable 1d convolutions as done by
   ndimage.sobel should be faster than a 2d convolution (especially since
   ndimage does optimizing cache manipulations under the covers).
   - Does the mask operate over any generic area, or is the mask expected to
   be rectangular? If it's a rectangular mask, is there really a need for it
   when the user could just pass in a slice to their image instead.
   - A minor pedantic gripe; I like to have spaces after commas in array
   indices and tuples. i.e. this: (1, 2), arr[:, 5:6], instead of: (1,2),
   arr[:,5:6]. A space immediately after a comma is recommended in PEP8.
   - Variable naming. Lots of single and two letter variables like 'c', 'c1,
   'c2', 'w', 'm', 'cc', etc... Let's give these descriptive names.
   - For computing magntitude, use np.hypot(isobel, jsobel), instead of the
   manual computation you use. np.hypot is 2x faster on a 1000x1000 image since
   it doesn't create any temporaries.
   - inline the smoothing the function, unless you're using it elsewhere.

Lastly, thanks for working and spending time on this code!

Cheers!

Chris

2011/4/2 Dan Farmer <dfarmernv at gmail.com>

> I've pushed the suggested changes :
>
> https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-canny
>
> Thanks,
> Dan
>
> 2011/3/31 Dan Farmer <dfarmernv at gmail.com>:
> > I will try to address your comments tonight and also see about
> > removing smooth.py and just incorporating the one function we're using
> > into canny.py for now as Thouis suggested. Thanks to both of you for
> > looking at it. I'll message back when I've pushed the changes.
> >
> > -Dan
> >
> > 2011/3/31 Stéfan van der Walt <stefan at sun.ac.za>:
> >> Hi Dan
> >>
> >> On Thu, Mar 31, 2011 at 1:02 AM, Dan Farmer <dfarmernv at gmail.com>
> wrote:
> >>>
> https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-canny
> >>
> >> I read through your patch and made some preliminary comments.
> >>
> >>> Mostly just trying to follow procedure. I already mentioned my
> >>> concerns in the previous thread. I made one stab at introduced a
> >>> "None" default for the mask, but I got hung up and reverted it. The
> >>> default I was going to propose was np.ones(img.shape,bool) (and after
> >>> the fact I even noticed that's how it is used in one of the unit
> >>> tests). But I started thinking that that could be quite wasteful of
> >>> memory if you were working with large images (on my test use case with
> >>> ~512x512 images it's about 300 KB for the "fake" mask).
> >>
> >> It seems as though this specific implementation of the algorithms
> >> relies on creating the mask, so I don't think you can get away from
> >> it.  The typical way to do it would be:
> >>
> >> def canny(..., mask=None, ...):
> >>    if mask is None:
> >>        mask = np.ones(x.shape, dtype=bool)
> >>
> >>> The problem I had was that if I don't allocate the emask array I get
> >>> run-time errors starting at line 129 (in the diff of canny.py) because
> >>> the arrays all have different lengths if they aren't logical_and'd
> >>> with emask above.
> >>
> >> Yes, I think the only way to avoid allocating the mask explicitly is
> >> to rewrite the algorithm in Cython, where you can modify behaviour
> >> inside the for-loop.
> >>
> >> Regards
> >> Stéfan
> >>
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scikit-image/attachments/20110404/8f830ef9/attachment.html>