Java@Work | Parlez-Vous Java? (Web Techniques, Sep 2000)


Parlez-Vous Java?

By Al Williams

Nothing has made the world smaller than the Web. Today more than ever, you're likely to develop programs that will serve people in countries around the world. However, Esperanto notwithstanding, there's no international language, and some percentage of your users won't be native English speakers. In fact, many may not speak or read English at all. To make sure these visitors can use your programs with ease, you need to take internationalization issues into account.

Fortunately, if you're writing your program in Java, there are a number of internationalization features available. For example, if you're accustomed to creating programs with user interfaces displayed in English, or in any single language, you'll find that Java has several built-in classes for supporting multiple languages.

With Honors

Let's say you had an applet that displayed some text that happens to read "Web Techniques Honors...."

Of course, in the United Kingdom, "honors" would be spelled "honours." And in other languages, the entire phrase might be different. AltaVista's Babelfish service tells me that in French, the phrase should be "d'honneurs de Web Techniques." So how can you ensure that the text appears in the correct translation to the person viewing it?

One way to approach this would be to store the various phrases in a file (perhaps a property file), deduce which language the user has set on his or her computer, and then read in the correct strings accordingly. This can be tedious to program and you might also have to duplicate phrases for similar languages (such as U.S. English vs. U.K. English) in your file.

Java provides a better way: it can select the correct language's strings (and other objects, as well) automatically. The mechanism is a resource bundle. A resource bundle stores strings and other objects for a particular language, as shown in Listing 1. Java applets can automatically detect a user's language settings and select the correct resource bundle for that particular language. For example, applets on a Canadian Web site might have an English bundle and a French bundle. Any text in the applet would appear in the user's preferred language. You could localize many strings, including label text (as I've done in Listing 1), menu text, or error messages. (See "ASCII, Unicode, and UTF-8" to learn how Java stores characters.)

Locales

Before you can put resource bundles to use, it's important to understand locales. A locale (represented by the java.util.Locale class) indicates a language, a country, and a dialect or variant. For example, your user's language might be English. Knowing the user's particular country could help specify the difference between American English and British English. But isn't English always the same, you ask? No way.

First, Britain uses a different currency symbol and different date formats from those used in the U.S. Besides that, some words are spelled differently (honor and honour, for example). Even words for common objects are different (gasoline and petrol; kerosene and paraffin). If you're still not convinced, consider this British headline: "British Left Waffles on Falklands." To make matters more complex, some countries have further subdivisions or variants of a single language, called dialects. Java's Locale class also accounts for such dialects.

Beyond words, languages can be different in other seemingly innocuous, but important, ways. For example, in Spanish (I'm told) ch is treated as one letter when it's alphabetized. Thai and Lao vowels have peculiar sorting rules, as well. The java.text.Collator class is sensitive to these rules. You can read more than you ever wanted to know about international collation at Unicode's site (see "Online").
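To make the collation point concrete, here is a minimal sketch that sorts the same words twice: once with plain String ordering and once with a locale-aware java.text.Collator. The class name CollationDemo and the sample words are made up for illustration.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

class CollationDemo {
    // Sorts a copy of the words using the collation rules of the given locale.
    static String[] sortedFor(Locale locale, String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy, Collator.getInstance(locale));
        return copy;
    }

    // Sorts a copy of the words by raw character value, ignoring any locale.
    static String[] sortedByCharValue(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {"banana", "Cherry", "apple"};
        // Raw ordering puts every uppercase letter ahead of every lowercase one.
        System.out.println(Arrays.toString(sortedByCharValue(words)));
        // The collator orders the words the way a U.S. reader would expect.
        System.out.println(Arrays.toString(sortedFor(Locale.US, words)));
    }
}
```

Passing a different locale to sortedFor picks up that locale's rules, provided the Java runtime ships collation data for it.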

Most modern operating systems also support the idea of locale. In fact, your Java programs can query a site visitor's operating system for the default locale. All you have to do is have your program call the static function java.util.Locale.getDefault, which returns a locale object. You can use member functions of the locale object to get the specific country and language information. If you want to change the current locale, you can use the setDefault function.
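As a rough illustration of those calls, the sketch below reads the default locale, prints its language and country codes, and then swaps the default temporarily. The class name LocaleDemo and the helper codes are assumptions for this example; only getDefault, setDefault, getLanguage, and getCountry come from java.util.Locale.

```java
import java.util.Locale;

class LocaleDemo {
    // Joins the language and country codes, for example "en_GB".
    static String codes(Locale locale) {
        return locale.getLanguage() + "_" + locale.getCountry();
    }

    public static void main(String[] args) {
        Locale original = Locale.getDefault();
        System.out.println("Default locale: " + codes(original)
                + " (" + original.getDisplayName() + ")");

        // An ordinary program may change the default; an unsigned applet may not.
        Locale.setDefault(Locale.UK);
        System.out.println("After setDefault: " + codes(Locale.getDefault()));

        // Restore the original setting so later code is unaffected.
        Locale.setDefault(original);
    }
}
```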

Applets and Locales

Of course, applets—at least, applets that don't have special security permissions—can't change the locale. They can still read it, however. If you really need to change the locale from an applet, you'll have to create a secure applet. Unfortunately, the process for this is different depending on whether the applet will run in Internet Explorer or Netscape.

This makes it a bit harder to test applets that rely on locale information. You'll need to change the default locale of your entire operating system to see how that affects the applet. In Windows you have to use the Control Panel's Regional Settings option to change the operating system's locale information. Of course, a normal user's information will already be set correctly. Only a developer will want to change the locale setting often. Windows will ask you to restart, but I have found you can ignore the restart prompt and simply restart the Web browser.

Resource Bundles

Many operating systems support the notion of resources. Resources are files that contain strings and other items you might want to change depending on the default language or other constraints. For example, a Windows resource file contains the text it will display in menus, dialog boxes, and other messages. Java also handles the idea of a resource, although it does so in a different way from most other systems. For example, with Windows, resources appear in an executable file. Not surprisingly, Java takes an object-oriented approach: A resource is an object, nominally derived from the java.util.ResourceBundle class.

Because ResourceBundle is an abstract class, the class you derive from it must provide the code for the handleGetObject and getKeys methods. The handleGetObject method receives a key and returns an object that corresponds to that key. The getKeys method returns an enumeration for all the keys. Of course, you could easily use a hash table to implement the resource bundle.
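A hash table version might look something like the following sketch. The class name HashResourceBundle and the keys it stores are hypothetical; the two overridden methods are the ones ResourceBundle requires.

```java
import java.util.Enumeration;
import java.util.Hashtable;
import java.util.ResourceBundle;

class HashResourceBundle extends ResourceBundle {
    private final Hashtable<String, Object> table = new Hashtable<>();

    HashResourceBundle() {
        // A bundle can hold any object, not just strings.
        table.put("msg1", "Web Techniques Honors");
        table.put("maxItems", Integer.valueOf(10));
    }

    // Returns the object for the key, or null so that ResourceBundle
    // can go on to consult a parent bundle if one is set.
    @Override
    protected Object handleGetObject(String key) {
        return table.get(key);
    }

    // Returns the keys stored in this bundle. This simplified version
    // does not merge in keys from a parent bundle.
    @Override
    public Enumeration<String> getKeys() {
        return table.keys();
    }
}
```

Callers never use handleGetObject directly. They call getString or getObject, which ResourceBundle implements on top of it.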

How does this help with internationalization? The trick is the static member, getBundle. This function takes a base name for the resource bundle. Using the base name, the function searches for an appropriate class that will handle the specific locale in use. It first considers all three parts of the locale: the language, the country, and the dialect (or variant) if there is one. The function joins these parts to the base name and searches for a class or property file with the resulting name. If that fails, it tries the base name with just the language and the country, and then the base name with just the language.

For example, suppose the base name is REZ (you could use any name you wish), the current language is en (English) and the country is US (the United States). The getBundle function tries to find a resource bundle named REZ_en_US that tells the program which formats and text to use. If the function can't find an object named REZ_en_US, it looks for REZ_en. Finally, if it can't find that either, it simply uses the REZ object. Although this example doesn't use a dialect, keep in mind that the search can also take into account a dialect or variant.
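The search order can be sketched as a small helper that lists the names to try, most specific first. This is a simplified model written for illustration, not the code inside getBundle, and the class name BundleNames is an assumption. The real method does more work; for instance, it also tries the default locale before settling on the bare base name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class BundleNames {
    // Lists candidate bundle names for a locale, from most to least specific.
    static List<String> candidates(String baseName, Locale locale) {
        String language = locale.getLanguage();
        String country = locale.getCountry();
        String variant = locale.getVariant();
        boolean hasLanguage = !language.isEmpty();
        boolean hasCountry = hasLanguage && !country.isEmpty();
        boolean hasVariant = hasCountry && !variant.isEmpty();

        List<String> names = new ArrayList<>();
        if (hasVariant) {
            names.add(baseName + "_" + language + "_" + country + "_" + variant);
        }
        if (hasCountry) {
            names.add(baseName + "_" + language + "_" + country);
        }
        if (hasLanguage) {
            names.add(baseName + "_" + language);
        }
        names.add(baseName); // the default bundle is always the last resort
        return names;
    }

    public static void main(String[] args) {
        System.out.println(candidates("REZ", Locale.US));
        System.out.println(candidates("REZ", Locale.FRENCH));
    }
}
```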

Using this technique you can load a resource bundle and use it to supply objects (often in the form of text strings) that your program uses. The program will load the strings you've set for the language in use at the time. So instead of having to hard-code things like menu text, you can load the string from a resource bundle. Because of the way bundles work, the strings you load will be different depending on the language in use. Of course, you still have to create the localized bundles—Java isn't smart enough to translate your text automatically (at least, not yet).

It's repetitive to keep creating subclasses of ResourceBundle and writing more or less the same code to use a hash table. That's why Java provides the ListResourceBundle class. You can subclass this object and provide a single function named getContents. This function returns a two-dimensional array of keys and objects.

The ListResourceBundle class provides its own getKeys and handleGetObject functions that work with the array you provide. This greatly simplifies making resource bundles. But this still seems like a lot of work if you're only providing strings.
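Here is a minimal sketch of such a subclass. The class name HonorsBundle_fr and the "columns" key are invented for this example; the French phrase is the Babelfish translation quoted earlier.

```java
import java.util.ListResourceBundle;

class HonorsBundle_fr extends ListResourceBundle {
    // Each row pairs a key with its localized object.
    private static final Object[][] CONTENTS = {
        {"msg1", "d'honneurs de Web Techniques"},
        {"columns", Integer.valueOf(2)}, // values need not be strings
    };

    // The only method a ListResourceBundle subclass has to supply.
    @Override
    protected Object[][] getContents() {
        return CONTENTS;
    }
}
```

For getBundle to find a class like this by name, it would also need to be public and sit on the class path under the bundle's base name plus the locale suffix.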

Property Files

Because the most common way to use resource bundles is to store localized strings (as opposed to storing other types of objects), Java provides an even easier way to use them in this case. You can simply place a file in the Java class path with the name of the resource bundle and a .properties extension.

If the getBundle method finds a file with a matching name, it will automatically instantiate an object using the data from the property file (in conjunction with the PropertyResourceBundle object). For example, suppose getBundle doesn't find a REZ_en_US class. However, there's a file named REZ_en_US.properties. The getBundle function will create and return a bundle object for REZ_en_US built from the file's contents. The file must have the same format as any other property file (see Example 1).
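To show the property file format without touching the class path, the sketch below feeds hypothetical file contents to PropertyResourceBundle from a string. The class name PropertyBundleDemo and the sample text are assumptions. In a real program you would save the text as something like REZ_en_GB.properties and let getBundle load it.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class PropertyBundleDemo {
    // Hypothetical file contents: a comment line, then key=value pairs.
    static final String UK_PROPERTIES =
            "# British English strings\n"
          + "msg1=Web Techniques Honours\n";

    // Builds a bundle from property-file text held in memory.
    static ResourceBundle load(String text) {
        try {
            return new PropertyResourceBundle(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ResourceBundle bundle = load(UK_PROPERTIES);
        System.out.println(bundle.getString("msg1"));
    }
}
```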

Property files are convenient because you can simply give the file to a translator who knows nothing about Java or programming. Once the translation is complete, you simply put the file in your class path and you're ready to go. You don't have to worry about reintegrating the translations into a Java source file and rebuilding.

Resource Bundle Searching

Earlier you saw that when getBundle executes, it loads the most appropriate resource bundle depending on the current language in use. As mentioned, if it can't find a more specific resource bundle, the function falls back to the class or property file that has just the base name. You should always supply this default bundle, typically written in the author's native language; if getBundle finds no bundle at all, it throws a MissingResourceException. For example, I would not have to provide a specific object or property file for U.S. English, because my default bundle covers it. In locales that I haven't planned for, the program will at least work (albeit with English text).

The search process doesn't stop when the getBundle function finds a bundle. Instead, the function keeps searching. Each less-specific bundle becomes a pseudo parent of the more specific bundle. If a resource isn't found in a bundle, Java searches parent bundles automatically. This lets you put most of your Spanish resources in one file, but have specific customizations for each country in which Spanish is spoken. Note, though, that although you might consider Bundle_es the parent of Bundle_es_MX, neither class extends the other. Both extend ResourceBundle (by way of ListResourceBundle, for example), so they aren't parent and child objects in the Java object hierarchy.
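The parent lookup can be imitated by hand, which makes the behavior easier to see. In the sketch below, the ChainedBundle class, the keys, and the sample words are all hypothetical, and the chain is wired with setParent explicitly, a job getBundle normally does for you.

```java
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

// A list bundle whose parent is supplied by the caller.
class ChainedBundle extends ListResourceBundle {
    private final Object[][] contents;

    ChainedBundle(ResourceBundle parent, Object[][] contents) {
        this.contents = contents;
        if (parent != null) {
            setParent(parent); // lookups that miss here continue in the parent
        }
    }

    @Override
    protected Object[][] getContents() {
        return contents;
    }
}

class ParentChainDemo {
    // Builds a default bundle, a Spanish bundle, and a Mexican Spanish bundle.
    static ResourceBundle mexicanSpanish() {
        ResourceBundle base = new ChainedBundle(null, new Object[][] {
            {"greeting", "Hello"}, {"product", "Web Techniques"}});
        ResourceBundle es = new ChainedBundle(base, new Object[][] {
            {"greeting", "Hola"}, {"computer", "ordenador"}});
        return new ChainedBundle(es, new Object[][] {
            {"computer", "computadora"}});
    }

    public static void main(String[] args) {
        ResourceBundle bundle = mexicanSpanish();
        System.out.println(bundle.getString("computer")); // country-specific entry
        System.out.println(bundle.getString("greeting")); // found in the Spanish parent
        System.out.println(bundle.getString("product"));  // found in the default bundle
    }
}
```

All three objects are instances of the same class, which underlines the point that the parent relationship lives in the lookup chain, not in the class hierarchy.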

Honors Revisited

How can you use locales in practice? If you're using only strings, you might as well stick with property files. However, if you want to store objects other than strings, making use of the ListResourceBundle object in your program is a good bet. I'll show you how to do both in Listing 1, but usually you'd use just one or the other.

Because it's such an inconvenience to change your country settings to test an applet, I also included a small test main function. This lets you run the class as an applet or an ordinary Java program. As with an ordinary Java program, you can change the locale easily. This makes it easy to test the program before you try it as an applet.
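One way to set up that kind of test entry point is sketched below. It is not the code from Listing 1; the class name LocaleArgs and the convention of passing a language code and a country code on the command line are assumptions made for this example.

```java
import java.util.Locale;

class LocaleArgs {
    // Builds a locale from optional command-line arguments: language, then country.
    // With no arguments, the current default locale is returned unchanged.
    static Locale fromArgs(String[] args) {
        if (args.length == 0) {
            return Locale.getDefault();
        }
        String country = args.length > 1 ? args[1] : "";
        return new Locale.Builder()
                .setLanguage(args[0])
                .setRegion(country)
                .build();
    }

    public static void main(String[] args) {
        // Running "java LocaleArgs fr CA" would switch the default to Canadian French.
        Locale.setDefault(fromArgs(args));
        System.out.println("Testing with locale " + Locale.getDefault());
        // Code that loads resource bundles would follow here.
    }
}
```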

The default resource bundle is American English in the IntlRes.properties file (see Example 1). For the United Kingdom, I created a similar properties file named IntlRes_en_GB.properties. You can create as many property files as necessary, translating the msg1 string as appropriate. I also placed msg2 only in the default file. That means that msg2 is always the same because, regardless of language, Java will always search this file if it can't find a more specific replacement for the string.

For French, I decided to use ListResourceBundle directly. The resulting class appears in Listing 2. The array never changes, so you might as well make it static and final. Again, because this array contains only strings, you might as well use a property file. However, if you want to store other object types, you can easily do so by placing the objects in the array—something you can't do with a property file.

Why Bother?

It's easy to take a limited world view and assume that if a program works in your native language, that's good enough. But to remain competitive, smart companies are making their sites and applications friendly to the world.

In truth, following good programming practices will make many internationalization (I18N) issues easy to deal with. Java's support for Unicode and resource bundles makes it easier to localize strings than it is in many other languages. However, you still need to pay close attention to international language issues early in the design phase. For an interesting case study, check out IBM's implementation of JavaScript (see "Online"). Sun also has a free tool (JILT) that can simplify some I18N tasks. You'll find a review of JILT, and a similar commercial product named Multilizer, in this month's programming product review.

One day soon, programs will be internationalized in the way that many are now "Web enabled." Luckily, Java has many features to help you accomplish this task without waiting for that day.

(Get the source code for this article here.)


Al is the author of many popular programming books, including Active Server Pages Solutions. You'll find Al around the country teaching Web development courses in conjunction with Wintellect (www.wintellect.com). He's on the Web at www.al-williams.com.

