Dr. Dobb's is part of the Informa Tech Division of Informa PLC


Selenium: Cross-browser Website Testing

Alexander (Sasha) Sirotkin works on the LTE (Long Term Evolution) Project at Comsys Mobile.

A Web developer's life has never been easy. Browser market fragmentation, poor standards compliance, and the sheer number of web programming languages and tools all add up to challenges when you are trying to satisfy user demands and keep up with ever-changing web technologies and trends. And with what appears to be a second browser war underway, the situation is probably not going to get better any time soon. As if this weren't enough, the advent of cellular broadband access has added a handful of new browsers to the collection you have to test against, sometimes with limited functionality and non-standard screen resolutions. In short, these days you can no longer afford to test your site with just your favorite browser.

Enter Selenium Remote Control (RC), a tool that lets you programmatically simulate user behavior: launch a browser, open a URL, type some text, click a button, wait for the website's response, and check the browser state. Although implemented in Java and JavaScript, Selenium RC supports a variety of languages for test programming, including Java, C#, Perl, Python, and PHP. And last but not least, Selenium is a free open source project with a large and active community of developers and website testers.

Figure 1 shows a typical Selenium test system in which the Selenium client library communicates with the Selenium server via TCP/IP.

Figure 1: Selenium Remote Control Architecture

A single client can drive a number of browsers via one or more Selenium servers. The number of servers a single client can work with is limited by hardware resources and Selenium scalability issues; as a rule of thumb, expect up to five. Since the Selenium client and the server communicate via TCP/IP, they can run on different machines with potentially different operating systems -- Selenium runs on Windows, Linux, Mac OS, and probably pretty much every OS that has Java support. The most important part of Selenium is the Selenium Core, which is implemented in JavaScript and runs inside the browser.
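To make the client/server exchange a little more concrete, here is a minimal sketch of how a client could encode one command as an HTTP request to the Selenium server. The `/selenium-server/driver/` path and the `cmd`, `1`, `2` parameter names reflect my understanding of the Selenium RC wire protocol and should be treated as assumptions; the class and method names are hypothetical and are not part of the Selenium client library.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the URL a client might request for one command.
// The path and parameter names are assumptions about the Selenium RC protocol.
final class SeleniumCommandUrl {

    static String build(String host, int port, String command, String... args) {
        StringBuilder url = new StringBuilder();
        url.append("http://").append(host).append(':').append(port)
           .append("/selenium-server/driver/?cmd=").append(encode(command));
        // Command arguments are numbered from 1 in the query string.
        for (int i = 0; i < args.length; i++) {
            url.append('&').append(i + 1).append('=').append(encode(args[i]));
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] unused) {
        System.out.println(build("localhost", 4444, "type", "q", "42"));
    }
}
```

Because each command is just a request over TCP/IP, nothing in this scheme ties the client to the machine or operating system on which the server and the browser run.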

Now that you have some understanding of what Selenium is and how it works, it is time to write some code. The following Selenium RC example makes a simple Internet search with Google. It is implemented in Java (as are all the other examples in this article):

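    Selenium selenium = new DefaultSelenium("localhost", 4444, "*iexplore", "http://www.google.com");
    selenium.start();            // launch the browser and begin a session
    selenium.open("/");          // path is relative to the base URL above
    selenium.type("q", "42");    // "q" is the name of the search input field
    selenium.click("btnG");      // "btnG" is an assumed name for the search button
    // Do some stuff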

The code is rather self-explanatory, so I will only describe the important arguments of each command. This example assumes that both the client and the server (the Selenium server, not the web server) run on the same machine, i.e. localhost, and that the server is configured to listen on port 4444. The *iexplore parameter tells Selenium to use a certain Internet Explorer profile (more on this later). The type and click commands accept an HTML element name or id as their first argument, which you, as the developer of the site under test, should know.

The "BrowserString" parameter (*iexplore) is very important and, unfortunately, poorly documented. If you only have to test with Firefox and Internet Explorer, you can simply use *firefox and *iexplore, respectively, and forget about this. However, if you need to test with additional browsers, read on.

This parameter specifies not only the browser Selenium will work with, but also the method Selenium uses to control the browser and the mode of communication between the Selenium server and the Selenium Core running inside the browser. And yes, there are multiple modes for some browsers; not all modes are implemented for every browser, and some Selenium commands will not work with certain modes. To add to the confusion, the default modes sometimes change between Selenium versions; currently (Selenium 1.0) *firefox is an alias for *chrome and *iexplore for *iehta. Both the *chrome and *iehta browser profiles implement a "native" approach in which Selenium uses whatever method is best for each browser to "inject" the Selenium Core JavaScript code and to control the browser, as opposed to a "proxy injection mode", which is generic and, at least in theory, should work with all browsers.
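The alias relationship described above can be pictured as a simple lookup table. The sketch below is purely illustrative: the class and method names are hypothetical, the table is not taken from Selenium's source, and it only encodes the two Selenium 1.0 aliases mentioned in this article.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of browser-string aliasing as described for
// Selenium 1.0 in this article; not Selenium's own implementation.
final class BrowserStrings {

    private static final Map<String, String> ALIASES = new HashMap<>();

    static {
        ALIASES.put("*firefox", "*chrome");   // native Firefox mode
        ALIASES.put("*iexplore", "*iehta");   // native Internet Explorer mode
    }

    // Returns the mode an alias stands for, or the string itself if it is not an alias.
    static String resolve(String browserString) {
        return ALIASES.getOrDefault(browserString, browserString);
    }

    public static void main(String[] unused) {
        System.out.println(resolve("*firefox"));
        System.out.println(resolve("*iexplore"));
    }
}
```

Since the defaults may change between versions, a test suite that depends on a particular mode is probably safer naming that mode explicitly rather than relying on the alias.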

When Proxy Injection (PI) mode is enabled, Selenium RC, which has a built-in HTTP proxy server, configures the custom profile of the browser under test to work with this local proxy. Into every HTTP response returned by this proxy server, it "injects" the Selenium Core JavaScript code, placing it in the <head> HTML element. If you take a look at your website's content (using the "view page source" option of your browser), you will see, in addition to your own code, the following line along with other scripts that comprise the Selenium Core.

<script type="text/javascript" src="/selenium-server/core/scripts/selenium-browserbot.js"></script>
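The injection step can be sketched in a few lines. This is not Selenium's actual implementation, only a minimal illustration of the idea under simplifying assumptions: the response body is available as a string, and the script tag is placed immediately after the opening <head> tag. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of what an injecting proxy does to an HTML response.
final class HeadInjector {

    // Matches <head> or <head attr="...">, but not <header>.
    private static final Pattern HEAD_OPEN =
            Pattern.compile("<head(\\s[^>]*)?>", Pattern.CASE_INSENSITIVE);

    static String inject(String html, String scriptSrc) {
        String tag = "<script type=\"text/javascript\" src=\"" + scriptSrc + "\"></script>";
        Matcher m = HEAD_OPEN.matcher(html);
        if (!m.find()) {
            return html; // no <head> element: leave the response untouched
        }
        // Insert the script tag right after the opening <head> tag.
        return html.substring(0, m.end()) + tag + html.substring(m.end());
    }

    public static void main(String[] unused) {
        String page = "<html><head><title>t</title></head><body></body></html>";
        System.out.println(inject(page, "/selenium-server/core/scripts/selenium-browserbot.js"));
    }
}
```

Because the rewriting happens in the proxy rather than in the browser, the same approach applies to any browser that can be pointed at an HTTP proxy.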

By design, the Proxy Injection mode works with all browsers, even those that were not tested with Selenium, as long as the browser supports JavaScript and an HTTP proxy. Not only that, but it also lets you circumvent the "same origin policy", i.e. test different sites on different domains during the same Selenium session, for browsers for which the "native" mode is not complete; Opera, for instance.

Same Origin Policy

The same origin policy is a security concept of most modern browsers that defines a clear separation of scripts and content originating from different sites. In a nutshell, it prevents JavaScript code downloaded from one domain from interacting with scripts and content from a different domain. In particular, it prevents JavaScript from navigating your browser to a different domain.
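Browsers generally treat two URLs as having the same origin when their scheme, host, and port all match. The sketch below illustrates that comparison; it is a simplification that assumes absolute http or https URLs, and the class and method names are hypothetical.

```java
import java.net.URI;

// Hypothetical illustration of the same-origin comparison (scheme, host, port).
// Assumes absolute http/https URLs; other schemes are not handled.
final class SameOrigin {

    static boolean sameOrigin(String first, String second) {
        URI a = URI.create(first);
        URI b = URI.create(second);
        return a.getScheme().equalsIgnoreCase(b.getScheme())
                && a.getHost().equalsIgnoreCase(b.getHost())
                && effectivePort(a) == effectivePort(b);
    }

    // Falls back to the scheme's default port when none is given explicitly.
    private static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    public static void main(String[] unused) {
        System.out.println(sameOrigin("http://www.example.com/a", "http://www.example.com:80/b"));
        System.out.println(sameOrigin("http://www.example.com/", "http://other.example.com/"));
    }
}
```

A test that starts on one site and follows a link to another crosses exactly this boundary, which is why the policy matters for a JavaScript-based tool such as the Selenium Core.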


At this point you may think that this is the mode you should use for your tests. Unfortunately, during Selenium's development people discovered that some important functionality is very hard to implement in this mode, and gradually most developers switched to the so-called "native" mode, even though it requires a separate implementation for every browser. As a result, the PI mode is poorly maintained and quite buggy. For instance, during my experiments I was surprised to discover that the PI mode does not correctly handle encodings other than Unicode!

If you decide to try the PI mode anyway, you may think that the way to enable it is to use the Proxy Injection mode profiles *pifirefox and *piiexplore for Firefox and IE, respectively. In practice, it has to be enabled in the server, and the profile argument actually has no effect -- Selenium switches from *firefox to *pifirefox and vice versa, depending on the server configuration. Did I already say that Selenium mode configuration is confusing? The fact that the PI mode switch is a server run-time argument introduces another significant limitation -- a Selenium server working in the PI mode can only handle one session at a time.

To summarize: as a rule of thumb, try to avoid the Proxy Injection mode and use the *firefox and *iexplore modes instead. However, if you have to test with a less popular browser for which there is no good native mode implementation, and the same origin policy is an issue, PI mode may be your only option. In that case, be prepared for some Selenium functions simply not to work.
