[Tutor] OT: How to automate user interactions with GUI elements of closed-source programs?

boB Stepp robertvstepp at gmail.com
Thu Dec 24 22:46:02 EST 2015


On Thu, Dec 24, 2015 at 5:38 PM, Alan Gauld <alan.gauld at btinternet.com> wrote:


> To quote my recent book:

[...]

> This is a frustrating technique that is very error prone and also very
> vulnerable to changes in the application being controlled—for example,
> if an upgrade changes the screen layout, your code will likely break.
> Because of the difficulty of writing the code, as well as the fragility
> of the solution, you should avoid this unless every other possibility
> has failed.

After reading the docs on PyAutoGUI 0.9.31, I've been playing around
with it and investigating some of these "fragilities".  If I decide to
go this route it appears that I need to have the main window I'd be
working with maximized to full screen.  Otherwise, any coordinates to
position the mouse cursor could very well be incorrect.  It would be
better to use any and all keyboard shortcuts available in the
application I'm working with, as these should have reproducible
behavior.  Trying to use the package's image location functions for
"commonly" named buttons (such as "OK" or "Cancel") could easily be
iffy since it searches the entire monitor screen, which might trigger
the wrong button.  However, PyAutoGUI does have a region-defining tool
that can limit the area of screen real estate being searched.  Since I
don't have a dual-monitor setup at home, I cannot test what would
happen in that situation.  I notice that multiple-monitor support is
on the package author's TODO list.  Etc.  Definitely not software
nirvana here!
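
A minimal sketch of the region idea, assuming the window's position and
size are already known from somewhere else (button_strip_region,
click_button_in_region and the 25% strip are hypothetical choices for
illustration; locateCenterOnScreen and its region argument are from the
PyAutoGUI docs).  It restricts the image search to the bottom strip of a
dialog, where "OK" and "Cancel" buttons usually sit:

```python
def button_strip_region(win_left, win_top, win_width, win_height,
                        strip_fraction=0.25):
    """Return (left, top, width, height) for the bottom strip of a window.

    The tuple has the shape PyAutoGUI expects for its region argument.
    strip_fraction is an assumed value; tune it per application.
    """
    strip_height = int(win_height * strip_fraction)
    strip_top = win_top + win_height - strip_height
    return (win_left, strip_top, win_width, strip_height)


def click_button_in_region(image_path, region):
    """Click the button shown in image_path if it is found inside region.

    Returns True on a click, False if the image was not located.
    """
    import pyautogui  # imported here so the helper above needs no display

    try:
        center = pyautogui.locateCenterOnScreen(image_path, region=region)
    except Exception:
        # Depending on the release, a failed search may raise instead of
        # returning None; treat both cases as "not found".
        center = None
    if center is None:
        return False
    pyautogui.click(center[0], center[1])
    return True
```

Something like click_button_in_region("ok.png",
button_strip_region(100, 200, 400, 300)) would then only look at the
lower quarter of that one window rather than the whole screen.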

I've been trying to think of what things I can abstract out of the
possible different environments.  The only thing that has occurred to
me so far is to do my mapping of a given software application window
in terms of relative coordinates.  This way I can detect the current
monitor viewing size and then compute the needed absolute mouse
coordinates.
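
Roughly what I have in mind, as an untested-in-anger sketch (to_absolute
and the BUTTONS table are made-up names and values, and it assumes the
application window is maximized so that screen size and window size
coincide):

```python
def to_absolute(rel_x, rel_y, screen_width, screen_height):
    """Convert relative coordinates (0.0 to 1.0) to absolute pixels."""
    if not (0.0 <= rel_x <= 1.0 and 0.0 <= rel_y <= 1.0):
        raise ValueError("relative coordinates must lie between 0.0 and 1.0")
    # Clamp so that 1.0 maps to the last valid pixel, not one past it.
    x = min(int(rel_x * screen_width), screen_width - 1)
    y = min(int(rel_y * screen_height), screen_height - 1)
    return x, y


# Hypothetical map of one application window, recorded once as fractions
# of the screen rather than as pixels.
BUTTONS = {
    "ok": (0.25, 0.75),
    "cancel": (0.5, 0.75),
}
```

With PyAutoGUI the screen size would come from pyautogui.size(), and the
result of to_absolute(*BUTTONS["ok"], width, height) could be passed to
pyautogui.moveTo() or pyautogui.click().  The relative map stays the
same from one monitor to the next; only the detected size changes.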

> The other things to check are (On Windows):
>
> 1) File export/import (CSV, JSON or XML maybe? or a Windows app
>    such as Word/Excel?)

On one of the software packages I work with every day, there are
limited text file exports.  Unfortunately, they don't contain all of
the information I need, though they do contain most of it.

> 2) COM object model access via PyWin32

I used to have a very little bit of COM knowledge, but that has long
since been forgotten.  I don't know now what possibilities that might
open up for me in accessing one of these commercial applications.

> 3) A C DLL exposed via ctypes

Oscar very briefly mentioned taking this type of approach once upon a
time.  This is also something I currently know little about and don't
know what possibilities it might give me.

> 4) A Web front end or web service

This is a total no go for the software I use at work.

> If you really must go down the screen scraping robotics route and
> you are thinking of upgrading to Windows 10 make sure you do that
> first or you will almost certainly have to rewrite from scratch
> (and after most every other upgrade of OS or app thereafter).
> And don't even think of changing your system fonts - ever!
>

I am at the mercy of my IS department re possible upgrades and they
often don't give notice of coming upgrades!

Thanks, Alan!

Merry Christmas!
boB

