Android: Open-Source Scripting for Testing and Automation

Source Code Accompanies This Article. Download It Now.

Jul01: Programmer's Toolchest

Cameron works for PhaseIt, a consulting firm. Larry is a contracting programmer and writer with more than 20 years of experience in software development. They can be contacted at [email protected] and [email protected], respectively.

Android is a tool for recording and playing back scripts of X11 events. Created by Compaq developers who were testing GUI-based programs in various languages, Android watches you interact with a program, transcribing a script that replicates everything you do. It even records your corrections and hesitations. You can also direct it to take snapshots of a window for later comparison. In short, it is a personal assistant that's smart enough to "do what you mean" when drudge work is waiting.

You can play back such a script to replicate your previous session. Android also knows how to take snapshots of a new version of your program and compare them with those saved from your "baseline" scripting run. It issues warnings when any of the windows have changed. As a regression test tool, Android is invaluable — and it's good for more than just testing.

Of course, a number of other test-automation tools do a similar job. However, most others are limited to driving programs coded with just one particular widget set. Android works at the level of the X protocol using the XTEST extension. This extension lets Android create synthetic events indistinguishable from those generated by live users, so Android can do anything live users can — and Android does it with any widget set, or even no widget set.

For instance, one of us (Larry) was once faced with the onerous task of transferring 40-some files from a Windows system to a PalmPilot. The PalmPilot software, sadly, imports memo files only one at a time. Larry fired up Android, had it start vncclient, and went through the motions of importing the first file. Then he wrote a loop around the relevant code, adding one additional cursor down to select the next file for insertion. Then he left for lunch. The files were all properly downloaded when he returned. Android is the first practical tool we know to script such "quick-n-dirty" hacks for GUI programs.

Android is an extension of John Ousterhout's Tcl language. It is written in, and runs with, Expect, Don Libes's Tcl extension (http://starbase.neosoft.com/~claird/comp.lang.tcl/expect.html) for controlling command-line-driven programs. However, the scripts Android generates run properly with wish, the standard Tcl/Tk shell, and they dynamically load Android's library for the XTEST extension, so they need no special environment and only one sharable library. Full source code and documentation for Android are available at http://server.open-hardware.org/download/contrib/Compaq/ and http://www.smith-house.org/open.html. The tool is provided under the standard GNU General Public License (GPL), with other licenses available by contract.

A First Example

Suppose, for instance, you are working on xcalc and you're happy with the primary keyboard window, and want to protect it and the logic that drives it while you tinker with other areas. We'll create a simple script that we run periodically to verify the main window works properly.

To create a test script, you only need to bring up Android in record mode: $ android -record xcalc.tst. Android is indifferent to the .tst extension. We make it a habit to use this extension, though, as a reminder that the content is a test script.

Android comes up in its Edit/Control window (Figure 1), which shows the skeleton of the script we're developing. The first line invokes wish from the shell. The following two lines are housekeeping: They close the default "wish" window, which isn't needed, and initialize the count for the number of comparison failures. The next line loads Android's Tcl extension library; this lets the script communicate with your X server using the XTEST protocol. The cursor is located at the blank line just before the code for the final report. This is where your real script goes.

To start your test program, use "Test/Start Program...". The first thing you see is a wait dialog announcing "starting monitor." Android uses the Xscope utility to watch your actions on the keyboard and mouse, and this is what is being started.

Xscope is a virtual X server. When run, it connects to the local X server and provides a new display — usually :1 — that X clients can connect to. Xscope merely routes events sent to it to the real X server it is connected to, but in the process it spools text descriptions of each event to stdout.

Android's big value-add is the collection of regular expressions it uses to parse useful data out of Xscope's rather voluminous output. This information is distilled down into the list of send_xevents commands that can exactly replicate your session.

Xscope is also the source of one of Android's limitations — it can only track things that are using Xscope. Your window manager, already started and talking directly to the X server, doesn't do this, so Android has problems tracking anything not started from Android itself.

Once Android properly launches Xscope, it presents you with Figure 2. At this point you may specify the command line that starts your program. You should always remember to specify a -geometry option, or Android will add one for you. One drawback of working at such a low level is that Android works entirely with screen — not window — coordinates. If the window is not placed in the same place on the screen each time Android or one of its test scripts is run, the results probably won't match, generating bogus regression failures. Fortunately, these are easy to detect, since it is rare that a given test run will report 100 percent failures in the real world.
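
Because Android records absolute screen coordinates, it can help to keep the -geometry offset in one place and derive click positions from it. The following is a minimal sketch in plain Tcl, not part of Android: the helper name screen_coords is our own invention, it assumes a geometry that ends in positive +X+Y offsets, and window-manager decorations may shift the real client area by a few pixels.

```tcl
# Hypothetical helper (not part of Android): convert window-relative
# coordinates into the screen coordinates that send_xevents works with.
proc screen_coords {geometry wx wy} {
    # Accept "+X+Y" or "WxH+X+Y". Negative offsets (measured from the
    # right or bottom edge) are deliberately not handled in this sketch.
    if {![regexp {\+([0-9]+)\+([0-9]+)$} $geometry -> xoff yoff]} {
        error "expected a geometry ending in +X+Y, got \"$geometry\""
    }
    # scan keeps Tcl from treating a leading zero as an octal prefix
    scan $xoff %d xoff
    scan $yoff %d yoff
    return [list [expr {$xoff + $wx}] [expr {$yoff + $wy}]]
}

# A point 196 pixels right of and 205 pixels below a window at +400+100
puts [screen_coords +400+100 196 205]   ;# prints: 596 305
```

If the geometry ever has to change, only the offset passed to the helper changes; the window-relative numbers stay the same.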

Sadly, Android cannot tell if you move a window — the mouse actions that move a window are routed to the window manager rather than Xscope. Choose your geometry carefully because you'll have to live with it.

For the example test, enter: /usr/X11R6/bin/xcalc -geometry +400+100. Immediately, the line exec /usr/X11R6/bin/xcalc -geometry +400+100 & appears in the script and xcalc's own window opens on the screen.

Remember that Android uses a monitor (Xscope) to watch you interact with the test program. Shouldn't there be a "-display" flag also? No. This line is for playback, and no monitor is needed then. Rest assured that the version of xcalc you are looking at was started with Xscope.

You can do whatever you like with the program at this point, but Android takes no action. Android's design default is not to record, to give you an opportunity to start and configure other programs that might be necessary for your test — servers, for example. Once you're ready to record, choose Test/Track Events.

Once you do this, you will see send_xevents commands appear in the Edit window as you perform xcalc actions. When you type "1+2=" using the mouse, the script window looks like Figure 3.

While each send_xevents command takes an arbitrary number of subcommands, Android usually generates them in triplets of wait, @, and an input event. The wait subcommand simply pauses for a time measured in milliseconds. @ moves the pointer to the specified screen coordinates. btnup/btndn are common input events; the number following these tells which mouse button has traveled up or down.
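
To make the triplet pattern concrete, here is a small sketch that builds the subcommands for one mouse click. The helper click_triplets and the pause lengths are our own assumptions, and we assume wait takes a single millisecond count; only the subcommand names (wait, @, btndn, btnup) come from Android. The events are sent only when Android's library has been loaded, so the sketch also runs as a dry run in a plain Tcl shell.

```tcl
# Hypothetical helper: build the wait / @ / event triplets for one click.
# Assumption: "wait" takes one argument, a time in milliseconds.
proc click_triplets {x y button {pause 200}} {
    return [list \
        wait $pause @$x,$y btndn $button \
        wait 50     @$x,$y btnup $button]
}

set steps [click_triplets 596 305 1]
puts $steps

# Send the events only if the send_xevents command is available
if {[llength [info commands send_xevents]]} {
    eval [linsert $steps 0 send_xevents]
}
```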

Suppose you're concerned about a particular segment of source code exercised by a particular esoteric calculation. You want to confirm that the calculation survives changes you're making to xcalc. That's the function of Test/Take Snapshot. This selection brings up another dialog that names the test, then offers a crosshair cursor so the Android recorder can choose a specific window for the snapshot. The new code (see Listing One) appears in the Script window. This is one complete test, as far as Android is concerned. Upon playback, Android takes a new snapshot and compares it to the one it just took. If they match, you're gold: The program still does what you expect it to for that input. If it changed, you get a failure.
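
When a script contains many snapshots, the comparison step can be factored into a procedure. This is a sketch modeled on the code Android generates in Listing One, not something Android writes itself; the name compare_snapshot and its arguments are hypothetical. It relies only on the standard cmp utility and on the failures counter that the script skeleton initializes.

```tcl
# Hypothetical helper modeled on Listing One: compare a fresh window dump
# with the baseline and report the result in the same style.
proc compare_snapshot {testnum label current baseline} {
    global failures
    # cmp exits nonzero when the files differ, which makes exec raise an
    # error; catch turns that into a nonzero result code.
    if {[catch {exec cmp $current $baseline} err]} {
        puts "Test $testnum: failed    $label ($err)"
        incr failures
        return 0
    }
    puts "Test $testnum: succeeded $label"
    return 1
}

set failures 0
# Usage, once snapshot.tmp has been written by xwd:
#   compare_snapshot 0 "First Test" snapshot.tmp xcalc.tst.ss.0
```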

Suppose you've finished the test sequence and it's time to tidy up by exiting the program and saving the script. Unfortunately, xcalc has no exit. If you click on the window decoration to close it, you bypass the monitor. This kind of gotcha is one reason Android simply extends Tcl — you have the Tcl interpreter at your fingertips. It also illustrates the value of having Android write the script right in front of you. You can click in the Script window, add exec killall xcalc, and the script is done. Save it and quit in the usual way.
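
One caveat with killall is that it ends every running xcalc, not just the one the script started. Since Tcl's exec returns the process IDs of anything it starts in the background, a script can remember them and stop only its own program. The procedure names below are hypothetical; this is a sketch of the idea rather than code Android generates.

```tcl
# Start a program in the background and return its process ID(s).
# exec with a trailing & returns the IDs of the processes it started.
proc start_program {args} {
    return [eval [linsert $args 0 exec] &]
}

# Stop only the processes we started; ignore any that have already exited.
proc stop_program {pids} {
    foreach pid $pids {
        catch {exec kill $pid}
    }
}

# In a test script this could replace "exec killall xcalc":
#   set pids [start_program /usr/X11R6/bin/xcalc -geometry +400+100]
#   ... recorded send_xevents commands ...
#   stop_program $pids
```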

Now run the script with ./xcalc.tst. The xcalc window pops up and the mouse moves itself about magically as the various commands are executed. When the snapshot is taken, you see a report line in the executing window telling you "Test 0: succeeded." After this, xcalc disappears.

Do not use the keyboard or mouse while the script is running. Android sends events to your X11 server. If you mess with the mouse or keyboard, your events will be mixed in with those that Android is producing. The result will likely not be what you intended.

Scripting Options

That's the Android story. The documentation file android.html details a few more bells and whistles. One enabled by default is "compress motion events," under the Options menu. This means that multiple motion events — @here, @there, @someplace else, and so on — are collapsed down to @(wherever the next significant event takes place). If you disable this option, Android mimics your every move, every hesitation, as you move about and type into the test program. A drawing package is one example of a program that needs this exact mimicry and should not "compress motion events."
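
The effect of motion compression can be illustrated on a list of subcommands. The sketch below is our own approximation, not Android's implementation: it assumes every subcommand other than @ takes exactly one argument, keeps only the last position before each input event, and adds up the intervening waits.

```tcl
# Hypothetical illustration of "compress motion events": collapse runs of
# @x,y motions (and the waits between them) down to the final position
# before the next significant event.
proc compress_motion {subcommands} {
    set out {}
    set pending_wait 0
    set pending_pos ""
    set n [llength $subcommands]
    for {set i 0} {$i < $n} {incr i} {
        set tok [lindex $subcommands $i]
        if {[string match @* $tok]} {
            # A later motion replaces any earlier one
            set pending_pos $tok
        } elseif {$tok eq "wait"} {
            # Accumulate the pause; its argument is the next element
            incr pending_wait [lindex $subcommands [incr i]]
        } else {
            # A significant event: flush what was pending, then the event
            if {$pending_wait > 0} { lappend out wait $pending_wait }
            if {$pending_pos ne ""} { lappend out $pending_pos }
            lappend out $tok [lindex $subcommands [incr i]]
            set pending_wait 0
            set pending_pos ""
        }
    }
    # Keep any trailing pause or motion
    if {$pending_wait > 0} { lappend out wait $pending_wait }
    if {$pending_pos ne ""} { lappend out $pending_pos }
    return $out
}

puts [compress_motion {wait 10 @1,1 wait 20 @2,2 wait 5 @3,3 btndn 1 wait 30 @4,4 @5,5 btnup 1}]
# prints: wait 35 @3,3 btndn 1 wait 30 @5,5 btnup 1
```

For a drawing program the intermediate positions are the whole point, which is why that kind of test needs the uncompressed list.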

"Auto Save" saves the script you are working on when you close or exit the program. Otherwise, you are prompted if you try to exit without saving.

The options "Compress Time in Playback" and "Real Time in Playback" set or clear the compress_time flag. The former sets an upper limit on the wait command, so large pauses (like a break for coffee) won't be seen. The "Real Time..." selection resets this variable, with the consequence that Android's delays are just as long as yours.

Writing Scripts by Hand

Android's ability to record user keyboard and mouse actions is its claim to fame. There are times, though, when you'll want to write scripts by hand. Well-designed scripts can be far more maintainable than those that Android typically generates.

Define, for instance, the button in Example 1(a). To invoke it, you need only Example 1(b). Android has all the facilities of Tcl as a scripting language, so you can "prettify" this further, as in Example 1(c). Human script authors use techniques like this to abstract out meaning in a script where Android proper would see and record only specific mouse motions.
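
As an independent illustration of this kind of abstraction (it is not the article's Example 1), the sketch below names a few xcalc buttons and clicks them by name. The coordinates are invented placeholders; real values depend on the -geometry you chose and would be copied from a recorded script. The array button_at and the procedure press are hypothetical names.

```tcl
# Placeholder screen coordinates for a few xcalc buttons (assumed values).
array set button_at {
    1     {420 310}
    2     {455 310}
    plus  {560 310}
    equal {560 345}
}

# Return the send_xevents subcommands that click one named button.
proc press {name} {
    global button_at
    if {![info exists button_at($name)]} {
        error "unknown button \"$name\""
    }
    foreach {x y} $button_at($name) break
    return [list @$x,$y click 1]
}

# "1+2=" expressed by meaning rather than by raw coordinates
set steps {}
foreach name {1 plus 2 equal} {
    set steps [concat $steps [press $name]]
}
puts $steps

# Send the events only if Android's library has been loaded
if {[llength [info commands send_xevents]]} {
    eval [linsert $steps 0 send_xevents]
}
```

If the calculator layout changes, only the coordinate table needs updating; the test logic that says "press 1, plus, 2, equal" stays readable and intact.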

Keep in mind that send_xevents can manage arbitrary subcommands; it is not limited to triplets. You can create long lists of commands or break them up as you see fit. Example 2(a) has precisely the effect of Example 2(b). This particular sequence can't be decomposed further; it doesn't make sense to request send_xevents click; send_xevents 1.

Android recognizes several commands beyond those it uses for its own recordings. The aforementioned click command is one of them; it sends btndn/btnup events. Other timesavers for people writing or editing scripts include keydn/keyup commands, which send keysyms by name such as keydn Insert; keyup Insert. The keysyms supported on your system are usually in the /usr/include/X11/keysymdef.h file. Android uses keysym names in the form without the XK_ prefix.

keydn/keyup pairs can become tiresome. Android simply abbreviates these with key [keysym name], which sends the keydn/keyup events for you, in an example like key Insert.
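
The two forms combine naturally when a modifier must stay down while another key is struck. A minimal sketch, assuming key accepts any keysym name as described above; the helper chord is our own, while Control_L and c are standard X keysym names.

```tcl
# Hypothetical helper: hold a modifier, strike a key, release the modifier.
proc chord {modifier keysym} {
    return [list keydn $modifier key $keysym keyup $modifier]
}

puts [chord Control_L c]   ;# prints: keydn Control_L key c keyup Control_L

# In an Android script the list would be passed to send_xevents, e.g.:
#   eval [linsert [chord Control_L c] 0 send_xevents]
```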

You can send events to any X server with Android's display command. For instance, send_xevents display :0.1 click 1 sends a mouse click of button 1 to your :0.1 display. You can freely intermix display commands in send_xevents lists and control any number of displays.

Android also provides a handy shorthand for typing strings, the type command: send_xevents type {Now is the time for all good men...}, which saves a lot of keydn/keyup events and is much easier to edit and maintain. type converts each character in the given string (which should be delimited by "" or {}) to its equivalent keycode and generates key press/release events for each in sequence. Strings may consist of upper- and lowercase letters, and the basic set of punctuation. Esoteric punctuation, special characters, and so on are not supported; these must be accessed by keysym. Android does recognize \n, \t and \\ sequences. type {ls\n} in an xterm window will execute an ls command, for example.
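
One Tcl quoting detail is worth spelling out when the text to type comes from a variable. Inside braces, \n reaches Android as the two characters backslash and n, which is the form the article shows; inside double quotes Tcl itself would turn it into a newline character first, and we have not verified how type handles that. The sketch below, with the hypothetical helper type_line, always appends the two-character form.

```tcl
# Hypothetical helper: build a "type" subcommand that ends with the
# literal two-character sequence backslash-n, as in type {ls\n}.
proc type_line {text} {
    # "\\n" inside double quotes yields a backslash followed by n
    return [list type "$text\\n"]
}

set cmd [type_line ls]
puts [string length [lindex $cmd 1]]   ;# prints: 4

# In an Android script, with an xterm holding the keyboard focus:
#   eval [linsert [type_line ls] 0 send_xevents]
```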

This ability to work at such a low level makes Android very handy for testing programs in various languages. It doesn't care about your locale, keyboard, or input method. All it cares about is grabbing your event stream and recording it, and comparing X window dumps. It's internationalized by nature.

The Android extension adds a few other useful commands to Tcl besides send_xevents. Each of these is a command at the Tcl level, not a subcommand of send_xevents.

The dispinfo command provides information about the state of the keyboard and mouse at the present instant. dispinfo modkeys delivers a list of modifier keys presently in force. If a user holds down the Shift and Ctrl keys at the same time, dispinfo modkeys returns {Shift Control}.

Find out where the mouse is using dispinfo mouse x and dispinfo mouse y. These commands could theoretically return inconsistent pairs of x and y if the mouse were to move in between calls; that is, it could give the 20 of 20,100 and the 130 of 30,130, providing the fictitious coordinate 20,130. To avoid this problem, dispinfo mouse caches coordinates and returns the x or y associated with the cache. You can refresh the cache with the current mouse coordinates using the fresh specifier: dispinfo fresh mouse x gives you the latest x-coordinate, and dispinfo mouse y the corresponding y-coordinate.
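
Put together, a consistent coordinate pair takes one fresh query followed by one cached query. The wrapper mouse_position is a hypothetical name; the dispinfo calls are exactly the two forms described above. The call is made only when Android's library has been loaded.

```tcl
# Hypothetical wrapper: return a consistent {x y} pair for the pointer.
proc mouse_position {} {
    set x [dispinfo fresh mouse x]   ;# refresh the cache, then read x
    set y [dispinfo mouse y]         ;# read y from the same cached sample
    return [list $x $y]
}

# Query the pointer only if the dispinfo command is available
if {[llength [info commands dispinfo]]} {
    puts "mouse at [mouse_position]"
}
```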

dispinfo buttons returns a list of all the mouse buttons that are pressed at the current moment. Query the state of any given mouse button using dispinfo button n; this returns a 1 if button n is pressed, and 0 if it is not, where n is the number of the mouse button you are interested in.

A final use of dispinfo is to find out what extensions you have installed in your X server. A simple call of dispinfo installed returns a blank-delimited list of all X11 extensions installed on the currently selected display.

Calling dispinfo installed with a list of extension names returns 1 if all the named extensions are present, and 0 if they are not all present; see Example 3.
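
A script can use this check to fail early with a clear message. The sketch below is not the article's Example 3. It assumes the extension names are passed to dispinfo installed as a single Tcl list argument, which the text above suggests but does not spell out; require_extensions is a hypothetical name, and XTEST is the extension Android itself depends on.

```tcl
# Hypothetical guard: stop with a readable error unless the X server
# provides every named extension.
# Assumption: "dispinfo installed" takes the names as one list argument.
proc require_extensions {names} {
    if {![dispinfo installed $names]} {
        error "X server is missing at least one of: [join $names {, }]"
    }
}

# Run the check only if Android's dispinfo command is available
if {[llength [info commands dispinfo]]} {
    require_extensions {XTEST}
}
```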


Android was written as a testing tool. It's flexible enough to lend itself to other uses, though. It makes a good marriage, for example, with the widely used open-source VNC "remote control" utility. Android can act through VNC (virtual network computing; see http://starbase.neosoft.com/~claird/comp.windows.misc/VNC.html) to drive even the GUI-based programs of a Windows system. This is especially helpful for remote administration — you can write a script that changes a parameter on your UNIX systems and restarts your daemons, and then have it turn around and use Android to perform the same manipulation on Windows, up to and including starting the reboot. This is an exceptionally handy way of restoring the command-line power of yesteryear to modern GUI interfaces.


Listing One

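# xwd waits for a click to choose the window to dump, so run it in the background
exec /usr/X11R6/bin/xwd -silent -out snapshot.tmp &
after 3000
# Click inside the xcalc window so xwd dumps it
send_xevents @596,305 click 1
after 3000
# cmp exits nonzero when the dumps differ, which makes exec raise an error
set err 0
set result [ catch {
     exec cmp snapshot.tmp xcalc.tst.ss.0
} err ]
if { $result == 0 } {
  puts "Test 0: succeeded First Test"
} else {
  puts "Test 0: failed    First Test ($err)"
  incr failures
}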
