[Image-SIG] Filtering out all but black pixels for OCR

Mike Meisner mikem at blazenetme.net
Wed Jul 2 16:58:21 CEST 2008


Thank you all.

This is very helpful.

Mike
----- Original Message ----- 
From: "Ned Batchelder" <ned at nedbatchelder.com>
To: "Karsten Hiddemann" <karsten.hiddemann at mathematik.uni-dortmund.de>
Cc: "Mike Meisner" <mikem at blazenetme.net>; <image-sig at python.org>
Sent: Wednesday, July 02, 2008 7:43 AM
Subject: Re: [Image-SIG] Filtering out all but black pixels for OCR


> If your image is single-channel (mode "L"), then you can use the eval 
> function:
> 
>    from PIL import Image
>
>    img = Image.open("onechannel.png")
>    # at each pixel: zero (black) stays 0, anything else becomes 255 (white)
>    better = Image.eval(img, lambda p: 255 * int(p != 0))
>    better.save("bilevel.png")
> 
> --Ned.
> http://nedbatchelder.com
> 
> Karsten Hiddemann wrote:
>> Mike Meisner schrieb:
>>> I'd like to use PIL to prep an image file to improve OCR quality.
>>>  
>>> Specifically, I need to filter out all but black pixels from the 
>>> image (i.e., convert all non-black pixels to white while retaining 
>>> the black pixels).
>>
>> You could do something like the following:
>>
>> from PIL import Image
>>
>> img = Image.open("sample.png")
>> (xdim, ydim) = img.size
>> # this assumes an RGB image with no alpha channel
>> black = (0, 0, 0)
>> white = (255, 255, 255)
>>
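>> # load() returns a pixel-access object in PIL 1.1.6 and later; current
>> # Pillow has no Image.VERSION, so default to the load() path there
>> if getattr(Image, "VERSION", "1.1.6") >= "1.1.6":
>>     data = img.load()
>>     for y in range(ydim):  # every row, including row 0
>>         for x in range(xdim):
>>             if data[x, y] != black:
>>                 data[x, y] = white
>> else:
>>     # getdata() is read-only, so build a new pixel list and write it back
>>     pixels = [p if p == black else white for p in img.getdata()]
>>     img.putdata(pixels)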
>>
>> img.save("sample-filtered.png")
>> _______________________________________________
>> Image-SIG maillist  -  Image-SIG at python.org
>> http://mail.python.org/mailman/listinfo/image-sig
>>
>>
>>
> 
> -- 
> Ned Batchelder, http://nedbatchelder.com
> 
> 
> 
>
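
The two quoted approaches can be combined for RGB input without a per-pixel
Python loop. What follows is only a sketch, not code from the thread: the
function name keep_only_black and its threshold parameter are hypothetical,
and it assumes a pixel counts as "black" when every channel is at or below
threshold (0 reproduces the exact-black test used above).

```python
from PIL import Image, ImageChops


def keep_only_black(img, threshold=0):
    """Return a mode "L" image: black pixels stay 0, all others become 255.

    `threshold` is a hypothetical tolerance; a pixel is treated as black
    when each of its R, G and B values is <= threshold.
    """
    rgb = img.convert("RGB")  # drops any alpha channel, expands palettes
    # Per channel: 0 where the value is dark enough, 255 otherwise.
    masks = [band.point(lambda p: 0 if p <= threshold else 255)
             for band in rgb.split()]
    # A pixel stays black only if all three channel masks are 0, so take
    # the per-pixel maximum of the three masks.
    return ImageChops.lighter(ImageChops.lighter(masks[0], masks[1]), masks[2])


if __name__ == "__main__":
    # Tiny in-memory demo so no input file is needed.
    demo = Image.new("RGB", (3, 1), (255, 255, 255))
    demo.putpixel((0, 0), (0, 0, 0))      # pure black
    demo.putpixel((1, 0), (0, 0, 255))    # pure blue
    print(list(keep_only_black(demo).getdata()))
```

A small nonzero threshold may help with scans where black ink is not exactly
(0, 0, 0), but a suitable value depends on the source image.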
