Wednesday, February 20, 2008

zope.testbrowser: Automating the Web

Speaking of little-known Python modules, if you haven't used zope.testbrowser, please raise your hand. Okay, all of you who raised your hand and have ever wanted to automate web forms, check a web site's functionality, or perform screen-scraping tasks, pull up a chair. This won't take long, because zope.testbrowser's API rocks so hard and is so easy to use.

I first started using zope.testbrowser sometime before consulting for Zenoss. However, when I implemented Zenoss' "Synthetic Transactions" (I believe this was originally part of their open source offering, but has since been moved into the Enterprise Edition), I used it again, and extensively. It provided me with a wonderful Python API (a mechanize wrapper and then some), one so intuitive that I could use it without thinking and without even having to read the docs. This solution proved to be friendly to programmers, but we were really targeting non-programming network administrators and IT managers with the Zenoss plugin, so I completed the plugin using Twill and TestGen4Web (whose Python runner I rewrote and to which I added support for zope.testbrowser).

I used it again at Zenoss later when building the community software, enabling users to set their Mailman preferences from their Plone settings. Most recently, I've been using it to publish news articles to various sites from a single file used by a script.

Here are a couple of standard use cases with sample code:

Visit a page and follow some links
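Here's a minimal sketch of this first step, not the exact session from the original post. The helper names (open_and_find_link, make_browser) are made up for illustration; open() and getLink() are the zope.testbrowser calls doing the work.

```python
def make_browser():
    """Create a zope.testbrowser Browser that talks to real web servers."""
    # Imported lazily so the helper below can also be tried with a stand-in object.
    from zope.testbrowser.browser import Browser
    return Browser()


def open_and_find_link(browser, url, link_text):
    """Open `url` and return the first link whose text contains `link_text`."""
    browser.open(url)
    # getLink matches on the visible link text; it raises an error if
    # no link on the page matches.
    return browser.getLink(link_text)
```

Assuming a page at http://www.google.com/ with a link labelled "About" (both are just examples), open_and_find_link(make_browser(), "http://www.google.com/", "About") returns a link object whose url and text attributes you can inspect.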


That gives us the actual link object... but what if we want to follow that link?
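Following the link is one more call on the link object. A small sketch, assuming `browser` is a zope.testbrowser Browser that already has a page open; follow_link is a hypothetical helper name.

```python
def follow_link(browser, link_text):
    """Click the first link matching `link_text` and return the resulting URL."""
    # click() on a link object loads the link's target in the same browser.
    browser.getLink(link_text).click()
    return browser.url
```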

Simple enough :-)

Check for content

Let's continue on the Google link trail, and check for content:
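A content check can be as small as a substring test against the page source. This is a sketch with a made-up helper name; `browser.contents` holds the body of the page the browser currently has open.

```python
def page_contains(browser, phrase):
    """Return True if `phrase` occurs in the raw HTML of the current page."""
    # contents is the page source as returned by the server, tags and all.
    return phrase in browser.contents
```

Keep in mind this looks at raw HTML, so a phrase split across tags won't match.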


Sign into a site

Signing into a site requires filling in form information and submitting that data.
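A hedged sketch of what such a login might look like. Only the loginpage_email name comes from this post; the loginpage_password name, the "Sign In" button label, and the sign_in helper itself are assumptions made for illustration.

```python
def sign_in(browser, login_url, email, password):
    """Fill in the login form at `login_url`, submit it, and return the new URL."""
    browser.open(login_url)
    # Two controls on the page share this name; index=0 picks the first one.
    browser.getControl(name="loginpage_email", index=0).value = email
    # Assumed field name: check the page's HTML for the real one.
    browser.getControl(name="loginpage_password").value = password
    # Assumed button label; the first positional argument of getControl
    # is the label, and clicking a submit control submits its form.
    browser.getControl("Sign In").click()
    return browser.url
```

If the password field or button turns out to be duplicated as well, the same index argument can be used to disambiguate it.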


A couple things to note here:
  1. you need to look at the HTML so that you know what the form elements are named;
  2. there are two form elements on the page that are named loginpage_email, and we want the first one; and
  3. I omitted the part of the code where I pulled my credentials from the file system.

Submit form data

Now that we're logged in, let's change some outdated information in my profile (my old blog's link has been there for too long):
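Roughly, changing a profile field means opening the form, assigning to the control's value, and clicking the submit button. In this sketch every name passed in (the profile URL, the field name, the button label) is a placeholder you'd read off the real page, and update_profile_field is a hypothetical helper.

```python
def update_profile_field(browser, profile_url, field_name, new_value, submit_label):
    """Set one form field on the profile page and submit; return the old value."""
    browser.open(profile_url)
    control = browser.getControl(name=field_name)
    old_value = control.value
    # Assigning to .value fills in the form field.
    control.value = new_value
    # Clicking the submit control sends the form.
    browser.getControl(submit_label).click()
    return old_value
```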


Now that we saved it, let's check our results:
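One way to check the result is to load the form again and read the field back. A sketch with the same caveat as before: the URL and field name are placeholders, and field_value is a hypothetical helper.

```python
def field_value(browser, url, field_name):
    """Open `url` afresh and return the current value of the named form field."""
    browser.open(url)
    return browser.getControl(name=field_name).value
```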

Excellent; just what we expected to see.

These are all really simple examples, but they should be helpful in providing some insight and inspiring you to use zope.testbrowser :-) Working with radio buttons is a little more complex in that you have to get the containing object first and use getControl on that object in order to get the selection you want. However, if you've worked with HTML more than any sane person should have to (I think that exact amount is "any"), this will all be quite expected.
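To make the radio button case concrete, here's a small sketch. It assumes `browser` is a zope.testbrowser Browser with the form's page open; the group name and option value are placeholders, and choose_radio is a hypothetical helper.

```python
def choose_radio(browser, group_name, option_value):
    """Select one radio button in a group and return the group's value."""
    # First get the containing object: the whole radio group, looked up by name.
    group = browser.getControl(name=group_name)
    # Then call getControl on that object to get the one button we want.
    option = group.getControl(value=option_value)
    option.selected = True
    return group.value
```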

So, who's up for making an async version to work with Twisted? ;-)

A tangential caveat: I made a comment above about the zope.testbrowser API rocking, and that needs some clarification. As far as I am concerned, there are two major ways in which APIs can be graded:
  1. how easy is the API to use, how well is it documented, how much of the problem domain does it cover? (developer as user)
  2. how elegantly/efficiently was the solution implemented? (developer as contributor)

zope.testbrowser rocks from the perspective of a coder using the API. I've never looked at how it is implemented, because I haven't needed to. It does everything I require without me having to dig around in its guts. As such, I can make no assertions as to the quality of its implementation. However, its packages, modules, classes and methods are all named just as I would expect them to be named and do what I would expect them to do. That may not sound like a big deal, but it really is.



1 comment:

Anonymous said...

About your first note, you can always do:

In [10]: f= b.getForm ()

In [11]: mf= f.mech_form

In [14]: mf.controls
Out[14]:
[<ClientForm.HiddenControl instance at 0x9cd834c>,
<ClientForm.TextControl instance at 0x9cd840c>,
<ClientForm.PasswordControl instance at 0x9cd846c>,
<ClientForm.CheckboxControl instance at 0x9cd84ac>,
<ClientForm.SubmitControl instance at 0x9cd850c>,
<ClientForm.SelectControl instance at 0x9cd854c>]

In [15]: [ c.name for c in mf.controls ]
Out[15]: ['loginOp', 'username', 'password', 'zrememberme', None, 'client']

Unluckily in my case (a Zimbra login page) the submit button does not have a valid name, but I can select it by label:

In [26]: s= mf.controls[4]

In [27]: s.attrs
Out[27]: {'class': 'zLoginButton', 'type': 'submit', 'value': 'Log In'}

In [28]: f.getControl (label='Log In')
Out[28]: <SubmitControl name=None type='submit'>

In [29]: f.getControl (label='Log In').click ()