VERY basic browser question

cribeiro at mail.inet.com.br cribeiro at mail.inet.com.br
Sun Feb 23 13:58:34 EST 2003


Grant,

Humm... now I see. You're not trying to 'just open a browser and do some
stuff'. You're trying to automate stuff, which is something a little
bit harder.

There are a few ways you can do it, using _completely_ different
approaches to the problem:

1) The 'blind remote control' approach: programs such as VNC and
PCAnywhere allow one to remotely control another program. They do it by
sending messages on behalf of the actual devices. In other words: they
'fake' the same messages that are sent when the user presses a key or
moves the mouse. As simple as that.

In Windows, there is a primitive to stuff things into the keyboard
buffer that can be used for this purpose. I honestly don't remember the
name. I used it a few years ago to do something similar: I had to
convince Vantive (a very popular high-end CRM system) to open up with
some particular data displayed, and Vantive didn't support automation
(at that time; I don't know about the situation now).

For you to do what you want, proceed like this: define the sequence of
keys that you want to send to the operating system; take care to
determine if some pause is needed; when running your program, just stuff
the keys in the keyboard buffer (respecting any pause if needed).
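
The steps above can be sketched roughly as follows. This is only an
illustration of the idea, not a working keyboard injector: send_key is a
hypothetical callable that you would have to supply yourself (on Windows
it would probably wrap a Win32 call such as keybd_event or SendInput),
and the key names and pauses in the script are made up.

```python
import time

def replay_keys(steps, send_key, sleep=time.sleep):
    """Send each key in order, pausing where the script asks for it.

    steps    -- list of (key, pause_seconds) pairs
    send_key -- hypothetical callable that injects one key into the OS
    sleep    -- function used to wait; replaceable for testing
    """
    for key, pause in steps:
        send_key(key)      # fire and forget: the target gives no feedback
        if pause > 0:
            sleep(pause)   # give the application time to react

# Made-up script: type two characters, press Enter, wait, then press Tab.
script = [("h", 0), ("i", 0), ("ENTER", 0.5), ("TAB", 0)]

if __name__ == "__main__":
    # Stand-in sender that only prints what it would have sent.
    replay_keys(script, send_key=lambda key: print("sending", key))
```

Note that nothing in the loop checks whether the previous key had the
intended effect, which is exactly the weakness described next.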

This approach is called 'blind' because you have no way to know if your
automation was successful or not. For example, if something fails in the
middle of the process you may end up sending the wrong keys, which can
even cause problems... but for simple automation tasks, this approach
has the advantage of being relatively simple, because you don't need to
study the automation interface provided by the programs being
controlled.

2) The 'automation interface' approach. Most programs today have
automation interfaces that can be used to remotely control the program in
much more advanced ways than possible using the technique described above.
On the other hand, the use of the automation interface requires much more
knowledge about the internal structure, as exposed in the automation API,
of the application being controlled.

Most web browsers today do have automation interfaces. However, I don't
know how difficult it is to solve your particular problem using this
approach. You say that you have to fill in a form; to do it with
automation, the 'form' object would have to be exposed, and I don't know
if that is possible (my opinion: it should be possible, but it's probably
not easy).

In short, this is a very elegant and extensible approach, but I think that
it probably requires a lot of knowledge of the automation APIs. I may be
wrong, though.

3) The 'smart browser' approach.

Another approach is to implement your own automated browser object. This
is relatively easy in Python thanks to urllib and other advanced
libraries. The basic idea is as follows:

- using urllib, open the page you want to access;
- when you get the answer (in plain HTML, or even XML), parse it, and look
for the form you want to fill in. The form contains all information
needed: the name of the fields, and also the method to be used (GET or
POST).
- build your own HTTP request to send the data as if it were filled in by
the user. There are two ways to send data (GET and POST); it's simply a
matter of building the correct request and sending it over HTTP.
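
As a rough sketch of the parsing step, assuming a current Python 3
(where the HTML parser lives in html.parser): the FormFinder class and
the sample page below are made up for illustration, and the page text
stands in for whatever you would actually read back from the server.

```python
from html.parser import HTMLParser

class FormFinder(HTMLParser):
    """Collect the action, method and field names of each form on a page."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None   # the form we are currently inside, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action", ""),
                "method": (attrs.get("method") or "get").upper(),
                "fields": {},
            }
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea") and self._current is not None:
            name = attrs.get("name")
            if name:
                # Keep any default value the page supplies (hidden fields etc.)
                self._current["fields"][name] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None

# Made-up page, standing in for the HTML returned by the server.
page = """
<html><body>
<form action="/search" method="post">
  <input type="hidden" name="session" value="abc123">
  <input type="text" name="query">
  <input type="submit" value="Go">
</form>
</body></html>
"""

finder = FormFinder()
finder.feed(page)
form = finder.forms[0]
print(form["method"], form["action"], sorted(form["fields"]))
```

Real pages can be messier than this (unclosed tags, forms built by
scripts), so treat it as a starting point only.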

This is not as hard as it may seem, especially if you already know the
field names in advance. Then you can just write the request and send it.
In some cases you can even skip the first step and send in the filled
form directly; of course, this depends on the application being
controlled.
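
When the field names are known in advance, building the request could
look something like this. It is a minimal sketch assuming a current
Python 3, where the relevant pieces of urllib live in urllib.parse and
urllib.request; the build_form_request helper, the URL and the field
names are all hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_form_request(url, method, fields):
    """Return a urllib Request that submits `fields` like a filled-in form."""
    encoded = urlencode(fields)
    if method.upper() == "POST":
        # POST: the encoded fields travel in the request body.
        return Request(
            url,
            data=encoded.encode("ascii"),
            headers={"Content-Type": "application/x-www-form-urlencoded"},
        )
    # GET: the encoded fields are appended to the URL as a query string.
    return Request(url + "?" + encoded)

# Hypothetical URL and field names, for illustration only.
req = build_form_request(
    "http://www.example.com/search",
    "POST",
    {"session": "abc123", "query": "python list"},
)
print(req.get_method(), req.full_url, req.data)
```

Nothing is sent over the network here; passing the request to
urllib.request.urlopen() would actually submit it.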

I hope my explanation will be helpful. Please forgive any spelling errors
(or even a few factual ones) - it's Sunday, you know, and it's possible
that a few neurons are still sleeping while I write this.

Best regards,


Carlos Ribeiro
cribeiro at mail.inet.com.br







