Help me pick an API design (OO vs functional)

Michael Herrmann michael.herrmann at getautoma.com
Mon Mar 25 15:29:23 EDT 2013


Hello everyone, 

My name is Michael, and I'm the lead developer of Automa, a Python GUI automation library for Windows: http://www.getautoma.com. We want to add some features to our library but are unsure how best to expose them via our API. It would be extremely helpful if you could let us know which API design feels "right" to you.

Our API already offers very simple commands for automating the GUI of a Windows computer. For example:

	from automa.api import *
	start("Notepad")
	write("Hello World!")
	press(CTRL + 's')
	write("test.txt", into="File name")
	click("Save")
	click("Close")

When you execute this script, Automa starts Notepad and simulates keystrokes, mouse movements and clicks to perform the required commands. At the moment, each action is performed in the currently active window.

We do not (yet) have a way to switch explicitly to a specific window. Such a feature would, for instance, make it possible to open two Notepad windows using the start(...) command and copy text between them.

One API design would be to have our start(...) function return a "Window" (say) object, whose methods allow you to do the same operations as the global functions write(...), press(...), click(...) etc., but in the respective window. In this design, the example of operating two Notepad windows could be written as

	notepad_1 = start("Notepad")
	notepad_2 = start("Notepad")
	notepad_1.write("Hello World!")
	notepad_1.press(CTRL + 'a', CTRL + 'c')
	notepad_2.press(CTRL + 'v')

The problem with this design is that it effectively duplicates our API: We want to keep our "global" functions because they are so easy to read. If we add methods to a new "Window" class that do more or less the same thing, we feel that we are violating Python's principle that "There should be one - and preferably only one - obvious way to do it."

An alternative design would be to make window switching an explicit action. One way of doing this would be to add a new global function, say "switch_to" or "activate", that takes a single parameter identifying the window to switch to. We could still have start(...) return a Window object that could then be passed to this function:

	notepad_1 = start("Notepad")
	notepad_2 = start("Notepad")
	switch_to(notepad_1)
	write("Hello World!")
	press(CTRL + 'a', CTRL + 'c')
	switch_to(notepad_2)
	press(CTRL + 'v')

Maybe our Window objects could also be used as context managers:

	notepad_1 = start("Notepad")
	notepad_2 = start("Notepad")
	with notepad_1:
		write("Hello World!")
		press(CTRL + 'a', CTRL + 'c')
	with notepad_2:
		press(CTRL + 'v')

As a final idea, switching could also be done as a method of the Window class:

	notepad_1 = start("Notepad")
	notepad_2 = start("Notepad")
	notepad_1.activate()
	write("Hello World!")
	press(CTRL + 'a', CTRL + 'c')
	notepad_2.activate()
	press(CTRL + 'v')

It would be extremely helpful for us if you could let us know which way of using the API you would prefer. If you opt for an explicit version, what would you call the respective method? "activate" / "switch_to" / "focus" or something else?

Thank you so much!

Best wishes,
Michael


