[scikit-image] Memory consumption of measure.label (compared to matlab)

Martin Fleck martin.fleck at uni-konstanz.de
Thu Jul 13 10:03:31 EDT 2017


Hi again,

attached is a file "matlab_memory_info", together with the same
"skiamge_memory_profiler.out" that I showed before.
In the matlab_memory_info file, I added to every Matlab call the
equivalent call that I make in skimage.

I don't think it will be needed - the attached files should be enough -
but if someone wants to see the full memory report of matlab, you can
download it here:
https://drive.google.com/open?id=0BzmlODsuIIz0dVRsMk9sT3RuU0E
(it's an html file)

Cheers,
Martin


On 07/13/2017 02:09 PM, Martin Fleck wrote:
>
> Hi again,
>
> here you can download a minimal example:
>
> https://drive.google.com/open?id=0BzmlODsuIIz0elpIcU1kdmpNTlE
> (download button is the arrow on the top right)
>
> In order to run it and get the memory_profiler output you have to
> install memory_profiler
> e.g. with
>
> pip3 install memory_profiler
>
> and run the file with
>
> python3 -m memory_profiler minimal_test.py
>
> If you just want to run the example without memory profiling and
> installing memory_profiler, you have to comment out or remove line 8
> "@profile"
>
> Cheers,
> Martin
>
>
>
> On 07/13/2017 01:21 PM, Martin Fleck wrote:
>>
>> Hi Juan, hi Greg,
>>
>> quoting Greg:
>> > I think the main reason for the increased memory usage is that the
>> output type of the label function is int64 while your input is most
>> likely uint8.
>>
>> Indeed, this could be the complete problem already! For the analysis
>> I use a binary image - so only one bit per pixel.
>>
>> Greg: Regarding your PR and my analysis: My analysis using a 1.2GB
>> file stops due to memory problems already in
>> skimage.morphology.remove_small_objects() even if the major memory
>> blowup happens with skimage.morphology.label().
>> So there are problems at multiple steps that hopefully can be improved.
>>
>> Quoting Juan:
>> > For example, what are the data types of the outputs in Matlab?
>>
>> the first steps of my analysis are to convert the 8 bit input image
>> to a meaningful binary image. The whole analysis is done on binary
>> images. So all inputs and outputs in Matlab are of Matlab Class
>> "logical".
>>
>> I will provide you with a minimal example script and data for the
>> skimage case.
>> I will try to create equivalent memory information in Matlab.
>>
>> I'll post both here as soon as I'm done with that.
>>
>> Thanks so far!
>>
>> Martin
>>
>> On 07/13/2017 03:05 AM, Juan Nunez-Iglesias wrote:
>>> Hi Martin,
>>>
>>> No one on this list wants to push you to more Matlab usage, believe
>>> me. ;)
>>>
>>> Do you think you could provide a script and sample data that we can
>>> use for troubleshooting? As Greg pointed out, the optimization
>>> approach *might* have to be data-type dependent. We could, for
>>> example, provide a dtype= keyword argument that would force the
>>> output to be of a particular, more memory-efficient type, if you
>>> know in advance how many objects you expect.
>>>
>>> If you can provide something similar to a memory profile, and
>>> diagnostic information, for your equivalent Matlab script, that
>>> would be really useful, so we know what we are aiming for. For
>>> example, what are the data types of the outputs in Matlab?
>>>
>>> Juan.
>>>
>>> On 13 Jul 2017, 9:59 AM +1000, Gregory Lee <grlee77 at gmail.com>, wrote:
>>>> Hi Martin,
>>>>
>>>>     My problem: my analysis uses much more memory than I expect.
>>>>     I attached output from the memory_profiler package, with which
>>>>     I tried
>>>>     to keep track of the memory consumption of my analysis.
>>>>     You can see that for an ~8MiB file that I used for testing,
>>>>     skimage.measure.label needs to use 56MiB of memory, which
>>>>     surprised me.
>>>>
>>>>
>>>> I haven't looked at it in much detail, but I did find what appear
>>>> to be some unnecessary copies in the top-level Cython routine
>>>> called by skimage.morphology.label.  I opened a PR to try and avoid
>>>> this here:
>>>> https://github.com/scikit-image/scikit-image/pull/2701
>>>>
>>>> However, I think that PR is going to give a minor performance
>>>> improvement, but not help with memory use much if at all.  I think
>>>> the main reason for the increased memory usage is that the output
>>>> type of the label function is int64 while your input is most likely
>>>> uint8.  This means that the labels array requires 8 times the
>>>> memory usage of the uint8 input.  I don't think there is much way
>>>> around that without making a version of the routines that allows
>>>> specifying a smaller integer dtype.
>>>>
>>>> - Greg
>>>> _______________________________________________
>>>> scikit-image mailing list
>>>> scikit-image at python.org
>>>> https://mail.python.org/mailman/listinfo/scikit-image

-------------- next part --------------
Line #    Mem usage    Increment   Line Contents
================================================
   154  102.523 MiB    0.000 MiB   @profile
   155                             def run():
   156  110.500 MiB    7.977 MiB       image = data.imread(image_filename)
   157  110.500 MiB    0.000 MiB       image = color.rgb2gray(image)
   158                                 
   159                                 # make binary image with threshold using Li's method
   160  117.328 MiB    6.828 MiB       bwSource = image<filters.threshold_minimum(image)
   161                                 
   162                                 # Filter bwSource image -> remove small specks
   163  131.715 MiB   14.387 MiB       bwFiltered = morphology.remove_small_objects(bwSource, min_size=minSingleDislocationArea, connectivity=2)
   164                                 
   165                                 # analyze regions:
   166  187.551 MiB   55.836 MiB       label_img = label(bwFiltered)
   167  187.809 MiB    0.258 MiB       regions = regionprops(label_img)
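
As a rough cross-check of the profile above against Greg's int64 explanation, the sketch below compares the storage of a boolean mask with that of an int64 label array and scales the profiled mask increment accordingly. It assumes NumPy is available; the image shape and the helper `shrink_labels` are hypothetical and are not part of the thread or of skimage.

```python
import numpy as np

MIB = 1024 ** 2

# Hypothetical image shape: the real dimensions are not stated in the thread.
shape = (2000, 3500)

mask = np.zeros(shape, dtype=bool)          # NumPy stores one byte per boolean pixel
labels64 = np.zeros(shape, dtype=np.int64)  # dtype Greg reports for the label() output

# Bytes per label pixel divided by bytes per mask pixel.
ratio = labels64.nbytes / mask.nbytes

# Scale the profiled increment of the boolean mask (profile line 160) by that
# ratio and compare with the profiled increment of label() (profile line 166).
mask_increment_mib = 6.828
label_increment_mib = 55.836
predicted_mib = mask_increment_mib * ratio


def shrink_labels(labels):
    """Hypothetical helper: cast labels to the smallest unsigned dtype that fits."""
    small = np.min_scalar_type(int(labels.max()))
    return labels.astype(small)


print(f"mask: {mask.nbytes / MIB:.1f} MiB, int64 labels: {labels64.nbytes / MIB:.1f} MiB")
print(f"predicted label() increment: {predicted_mib:.1f} MiB, "
      f"profiled: {label_increment_mib} MiB")
```

Scaling the 6.828 MiB mask increment by 8 gives about 54.6 MiB, which is close to the 55.836 MiB increment reported for label(). Casting afterwards only reduces what is kept around, since the int64 array still exists while the cast runs, so it does not lower the peak.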

-------------- next part --------------
FunctionName EquivalentCallInSkimage                             Calls  TotalTime  SelfTime*  AllocatedMemory  FreedMemory  SelfMemory    PeakMemory
imread       skimage.data.imread()                               1      0.178 s    0.003 s    8770.72 Kb       612.52 Kb    42.67 Kb      7598.20 Kb
imbinarize   IMAGE<THRESHOLD_VALUE                               1      0.035 s    0.001 s    6705.84 Kb       58.42 Kb     14.33 Kb      6472.17 Kb
bwareaopen   skimage.morphology.remove_small_objects()           1      0.720 s    0.032 s    17420.83 Kb      7763.23 Kb   5011.38 Kb    7147.20 Kb
regionprops  skimage.measure.regionprops(skimage.measure.label)  1      19.697 s   0.018 s    93376.56 Kb      96408.75 Kb  -11470.08 Kb  1602.05 Kb
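
To set these figures next to the MiB values in the memory_profiler output, the Matlab numbers can be converted to the same unit. This is a minimal sketch, assuming that "Kb" in the Matlab report means 1024 bytes (an assumption that the thread does not confirm). The values are copied from the table above, and the names `matlab_kb` and `kb_to_mib` are illustrative only.

```python
# Assumption: one "Kb" in the Matlab report is 1024 bytes, so 1024 Kb make one MiB.
KB_PER_MIB = 1024.0

# (AllocatedMemory, PeakMemory) in Kb, copied from the Matlab table.
matlab_kb = {
    "imread": (8770.72, 7598.20),
    "imbinarize": (6705.84, 6472.17),
    "bwareaopen": (17420.83, 7147.20),
    "regionprops": (93376.56, 1602.05),
}


def kb_to_mib(kb):
    """Convert a value in Kb to MiB under the 1024-byte assumption above."""
    return kb / KB_PER_MIB


# Same table, expressed in MiB.
mib = {
    name: (kb_to_mib(allocated), kb_to_mib(peak))
    for name, (allocated, peak) in matlab_kb.items()
}

for name, (allocated, peak) in mib.items():
    print(f"{name:12s} allocated {allocated:7.2f} MiB   peak {peak:6.2f} MiB")
```

The conversion only puts both reports on one scale. The Matlab columns and the memory_profiler "Increment" column are measured differently, so the rows are not directly interchangeable.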

