Rendering text question (context is MSWin UI Automation)

Chris Mellon arkanes at gmail.com
Tue Jan 23 10:24:22 EST 2007


On 1/23/07, Boris Borcic <bborcic at gmail.com> wrote:
> Hello,
>
> I am trying to use UI Automation to drive an MS Windows app (with pywinauto).
>
> I need to scrape the app's window contents and use some form of OCR to get at
> the texts (pywinauto can't get at them).
>
> As an alternative to integrating an OCR engine, and since I know the fonts and
> sizes used to write on the app's windows, I reasoned that I could base a simple
> text recognition module on the capability to drive MSWindows text rendering - eg
> to generate pixmaps of texts I expect to find in the driven app's windows, exact
> to the pixel.
>
> The advantage of that approach would be exactitude and self-containment.
>
> I've verified manually inside an Idle window, that indeed I could produce
> pixmaps of expected app texts, exact to the pixel (with Tkinter+screen capture
> at least).
>
> I could use help to turn this into a programmable capability, ie : A simple -
> with Tkinter or otherwise - way to wrap access to the MS Windows UI text
> rendering engine, as a function that would return a picture of rendered text,
> given a string, a font, a size and colors ?
>
> And ideally, without interfering with screen contents ?
>
> Thanks in advance for any guidance,
>
> Boris Borcic

There are actually several different text rendering methods (and 2 or
more totally different engines) and they will give different results,
so if you want a fully generic solution that could be quite difficult.
However, it sounds like this is for a specific purpose.

Using the pywin32 modules to directly access the appropriate Windows
API calls will be the most accurate. It will be fairly complicated and
you'll need to know the Win32 API to do it. You could also use
wxPython, which uses what will probably be the right API and will take
less code than Win32 will. I'd suggest this if you aren't familiar
with the Win32 API.

PyQt uses its own text rendering engine, as far as I know, so it is
less likely to generate correct bitmaps. I'm not sure at what level
Tkinter's text drawing is done.

Using either win32 or wxPython you will be able to produce bitmaps
directly, without needing to create a visible window.


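Some quick & dirty wxPython code:

import wx

def getTextBitmap(text, font, fgcolor, bgcolor):
    # A wx.App must already exist before any DC or bitmap is created.
    dc = wx.MemoryDC()
    # Select a throwaway bitmap so the DC is valid for measuring text.
    dc.SelectObject(wx.EmptyBitmap(1, 1))
    dc.SetFont(font)
    width, height = dc.GetTextExtent(text)
    # wx.EmptyBitmap is the classic name; newer wxPython releases
    # spell this wx.Bitmap(width, height).
    bmp = wx.EmptyBitmap(width, height)
    dc.SelectObject(bmp)
    # Fill the whole bitmap with the background color, then draw the text.
    dc.SetBackground(wx.Brush(bgcolor))
    dc.Clear()
    dc.SetTextBackground(bgcolor)
    dc.SetTextForeground(fgcolor)
    dc.DrawText(text, 0, 0)
    # Deselect the bitmap so it can be used outside the DC.
    dc.SelectObject(wx.NullBitmap)
    return bmp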


Raw win32 code will look similar but will be much more verbose.
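
Once you have the rendered text as pixel data, the recognition step you
describe comes down to an exact sub-image search in the captured window.
Here is a minimal sketch of that search in plain Python. It assumes both
images have already been converted to lists of rows, each row a list of
pixel values; that representation and the name find_subimage are made up
for illustration, not part of wxPython or pywinauto.

```python
def find_subimage(screen, glyphs):
    """Return (x, y) of the first exact match of glyphs in screen, or None.

    Both arguments are lists of rows, each row a list of pixel values
    (plain ints here; (r, g, b) tuples would work the same way).
    This layout is an assumption of the sketch.
    """
    if not glyphs or not glyphs[0]:
        return None
    gh, gw = len(glyphs), len(glyphs[0])
    sh = len(screen)
    sw = len(screen[0]) if screen else 0
    # Try every top-left position where the rendered text could fit.
    for y in range(sh - gh + 1):
        for x in range(sw - gw + 1):
            # Compare row by row; any differing pixel rejects the position.
            if all(screen[y + dy][x:x + gw] == glyphs[dy] for dy in range(gh)):
                return (x, y)
    return None


# Tiny made-up example: a 2x2 "text" image inside a 5x4 "screenshot".
screen = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 4, 0, 0],
    [0, 0, 0, 0, 0],
]
glyphs = [[1, 2], [3, 4]]
print(find_subimage(screen, glyphs))  # (1, 1)
```

This is a brute-force comparison, so it only works if the rendering
really is exact to the pixel, as you found in your manual test. For large
captures you would probably want to restrict the search to the region
where the text is expected.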


