Automatically Generating Java Documentation

Because Java doesn't require a header file, documentation is really important in a Java environment. Gary examines some of the documentation issues related to Java programming and discusses how to automatically generate documentation for Java classes using javadoc.


July 01, 1996
URL:http://www.drdobbs.com/windows/automatically-generating-java-documentat/184409917


javadoc and the doc comment

Gary Aitken

Gary has been technical lead and chief architect for a large commercial UNIX-based C++ toolkit for the past seven years. He is now an engineer at Integrated Computer Solutions, where he is working on Java-based technology. He can be reached at [email protected].


In my article "Moving from C++ to Java" (DDJ, March 1996), I pointed out that everything about a Java class is contained in a single file; there are no separate header files as in C and C++. Because the code that implements a method must appear with the method definition, it is difficult to look at any Java source and get an impression of how a class should be used. If you have only a binary Java class library, there is no associated source code to look at. The information conveyed by a traditional C/C++ header file is encoded in the compiled Java binary, and although the compiler can find it, that doesn't do you much good. Because you can no longer get a concise summary of the interface to a class, you have to scan the entire source file to find each little piece. Consequently, documentation is of paramount importance in the Java environment. In this article, I'll examine some of the documentation issues related to Java programming and discuss how to automatically generate documentation for Java classes.

You can think of code as many different things:

Based on these categories, how you think of code determines what your documentation will explain:

Java introduces an additional kind of comment, known as a "doc comment," to help automate the process of producing programmer documentation and to ensure that the documentation matches the actual code. The standard Java environment includes a tool named "javadoc" that processes Java source files and produces a set of HTML documentation files (Web pages) from the doc comments. The doc comment, in combination with javadoc, addresses the documentation issues involved in exporting classes and using methods for each class. If you provide the proper comments in the code, at least some of the documentation can be produced automatically, in a form suitable for online use. Doc comments are not required by the Java environment, but if you are going to produce classes that others can easily reuse, you should get into the habit of including appropriate doc comments in every Java class you write. You should also update the doc comments whenever you change the signature of a method.

Javadoc produces HTML documentation in a particular format. Other processors will probably appear that produce output in different forms, all from the same embedded doc comments. (In fact, javadoc currently can also produce FrameMaker .MIF format, an undocumented capability.)

For each public class or interface (.java file), javadoc produces HTML pages with the following components, in the order listed:

    1. Header.
    2. Class hierarchy diagram.
    3. Class description.
    4. Index of public data fields, sorted alphabetically, with static and instance variables intermixed.
    5. Index of constructors.
    6. Index of public methods, sorted alphabetically, with static and normal methods intermixed.
    7. Description of public variables, in the order they appear in the source file.
    8. Description of public constructors, in the order they appear in the source file.
    9. Description of public methods, in the order they appear in the source file.

Javadoc will include an entry for all public classes, variables, and methods, whether or not they have a doc comment associated with them. The descriptions (of public variables, constructors, and methods) appear in the same order in which the items appear in the source file, on the assumption that you have tried to group related functions together.

Machine-generated documentation is going to become commonplace as a result of programs like javadoc. This documentation is a reflection of the quality of your professional work, and since no one is going to edit the automatically produced documentation, it behooves you to write the best doc comments you can. Besides, it makes good sense: adequate documentation means coworkers won't waste time deciphering code they aren't particularly interested in, simply to learn how to use it.

Doc Comment Syntax

The doc comment uses the form /** ... */. Whether the comment is used by javadoc to produce documentation is context sensitive. Doc comments are only significant to javadoc in the following circumstances:

A single doc comment should be used for each situation; if multiple doc comments are encountered preceding a particular definition, only the last one is used.

Within any particular line of a doc comment, leading space and a sequence of one or more consecutive asterisk characters are ignored.

Depending on the context, a doc comment may contain additional keywords, or tags, that flag components to be specially treated. All tags start with @. To be recognized, tags are supposed to be at the beginning of the line on which they appear. Any leading spaces followed by asterisks are ignored, so the sequence " * @sometag" is recognized as a tag. Note that horizontal tab characters are not allowed in this leading sequence.
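The stripping rule can be sketched in a few lines of Java. This is only an illustration of the rule as described here, not javadoc's actual implementation; the names DocLineStripper, stripLeader, and isTagLine are hypothetical, and skipping the spaces that follow the asterisks is an assumption inferred from the " * @sometag" case.

```java
/** Hypothetical helper illustrating the leading-asterisk rule; not javadoc's own code. */
class DocLineStripper {

    /**
     * Removes leading spaces, then a run of asterisks, then (by assumption)
     * the spaces that separate the asterisks from the text.
     * Tabs are deliberately not skipped, matching the note in the text.
     */
    static String stripLeader(String line) {
        int i = 0;
        int n = line.length();
        while (i < n && line.charAt(i) == ' ') i++;   // leading spaces only
        while (i < n && line.charAt(i) == '*') i++;   // one or more asterisks
        while (i < n && line.charAt(i) == ' ') i++;   // spaces before the text
        return line.substring(i);
    }

    /** A line counts as a tag line if, after stripping, it begins with '@'. */
    static boolean isTagLine(String line) {
        return stripLeader(line).startsWith("@");
    }

    public static void main(String[] args) {
        System.out.println(isTagLine("   * @sometag some text")); // prints true
        System.out.println(isTagLine("\t* @sometag some text"));  // prints false (tab in leader)
    }
}
```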

Table 1 lists the special tags and the circumstances in which they may appear. Lines containing unknown tags are included as normal text in the preceding item; case is important.

Each doc comment is composed of the following:

Summary line. The first "nonempty" line of a doc comment has particular significance. It is used as a summary, and is inserted as a one-line synopsis in the appropriate index (components 4-6 of the generated HTML). "Nonempty," in this case, means a line containing nonwhitespace other than "/**" (the first line of the doc comment, which may also contain text) or a series of spaces followed by a series of one or more asterisks.
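The "nonempty" test is easier to see in code. The following sketch picks the summary line out of a raw doc comment string according to the description above; it is an approximation for illustration, not javadoc's own logic, and the class and method names are hypothetical. It assumes the string begins at the "/**" opener.

```java
/** Hypothetical helper that picks the summary line out of a raw doc comment. */
class SummaryLineFinder {

    /** Returns the first nonempty line of the comment, or "" if there is none. */
    static String summaryLine(String docComment) {
        String[] lines = docComment.split("\n");
        for (int n = 0; n < lines.length; n++) {
            String line = lines[n];
            if (n == 0 && line.startsWith("/**")) {
                line = line.substring(3);                         // text may follow the opener
            }
            if (line.trim().endsWith("*/")) {
                line = line.substring(0, line.lastIndexOf("*/")); // drop the closer
            }
            int i = 0;
            while (i < line.length() && line.charAt(i) == ' ') i++;  // leading spaces
            while (i < line.length() && line.charAt(i) == '*') i++;  // leading asterisks
            String text = line.substring(i).trim();
            if (text.length() > 0) {
                return text;                                      // first line with real content
            }
        }
        return "";
    }

    public static void main(String[] args) {
        String doc = "/**\n"
                   + " *\n"
                   + " * Locate a complete user description by user name.\n"
                   + " * More detail follows the summary.\n"
                   + " * @return Complete description of a user.\n"
                   + " */";
        // prints: Locate a complete user description by user name.
        System.out.println(summaryLine(doc));
    }
}
```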

Normal text lines. Lines containing normal text follow the summary line. These lines are considered part of the detailed textual explanation. They are treated as HTML, so line breaks, tabs, and extra spaces do not normally have any significance. They may contain normal HTML tags, which allow special formatting for things like code examples. They should not, however, include HTML tags used for header information, since javadoc inserts its own header information throughout the document.

Tagged lines. These lines should always appear after the detail lines. Lines containing tags are not sorted by tag, so you should keep all tags of the same type together: all @param lines together, all @exception lines together, and so on. The arbitrary-text portion of a tagged line may extend over multiple lines. Any lines following the last tagged line will take on the formatting style of the last tagged line and will appear to be part of its documentation. What this means, for practical purposes, is that your doc comments should look like Figure 1(a) or 1(b), and order is important.
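As a concrete instance of that ordering (summary line, detail text, then tags grouped by type), here is a sketch of a documented class and method. The names Burrow, countGophers, and GopherException are hypothetical and exist only for illustration; the tag forms follow Table 1.

```java
/** Hypothetical exception used only for this illustration. */
class GopherException extends Exception {
    GopherException(String message) {
        super(message);
    }
}

/**
 * Tracks the gophers living in one burrow.
 * The detail text is treated as HTML, so <code>inline code</code> and
 * <p>
 * paragraph breaks may be used. Heading tags are avoided here.
 *
 * @see GopherException
 * @version 1.0
 * @author Your Name
 */
class Burrow {
    private final int population;

    /**
     * Create a burrow with a fixed population.
     * @param population Number of gophers in residence
     */
    Burrow(int population) {
        this.population = population;
    }

    /**
     * Count the gophers available, up to a requested limit.
     * Detail lines come after the summary and before any tags.
     *
     * @param n Number of items needed
     * @return Number of gophers
     * @exception GopherException When n is negative
     */
    public int countGophers(int n) throws GopherException {
        if (n < 0) {
            throw new GopherException("n must not be negative");
        }
        return Math.min(n, population);
    }
}
```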

Generating HTML

The documentation available for the javadoc tool is somewhat sparse. Javadoc is invoked by entering javadoc [options] packageName1 packageName2... classPath1.java classPath2.java.... Allowable options are as follows:

Javadoc generates output with relative pathnames for the hypertext links. In theory, this means the destination documentation directory could be anywhere. However, because all of your Java classes will have links to the base Java language documentation, and these links also are relative, the destination directory must be the same as the directory that contains the API docs for the standard Java distribution.

If you invoke javadoc and specify path names to classes, the generated HTML files will contain correct pages for the individual classes, but any index and package pages will be incomplete. For this reason, I suggest you always generate the documentation for a complete system using the package arguments instead of individual .java class paths. Note the difference in the way the two types of arguments are entered: individual classes are specified using a normal file name (for example, AllPkgs\PkgA\MyClass.java), while packages are specified using package names, such as AllPkgs.PkgA.

To illustrate, assume you have a directory structure like Figure 2. Your Java code is in two packages, AllPkgs.PkgA and AllPkgs.PkgB. The JDK 1.0 distribution is under JDK_1_0, with the API documentation in the API subdirectory. If you position yourself at java\MyProgs (UNIX users switch all occurrences of \ in pathnames to /), you would generate the complete documentation for all of your packages by invoking javadoc as: javadoc -version -authors -d java\JDK_1_0\api AllPkgs.PkgA AllPkgs.PkgB. This would produce the files in Figure 3. Unfortunately, you can't just point javadoc at the top-level directory (AllPkgs); it does not recurse.
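The naming pattern visible in Figure 3, and the package-name versus file-name distinction, can be summarized in a short sketch. The helper below is hypothetical (JavadocNames and its methods are invented for illustration) and simply reproduces the patterns shown in this article for the JDK 1.0 tool; it does not invoke javadoc.

```java
/** Hypothetical helper reproducing the file-naming patterns shown in Figure 3. */
class JavadocNames {

    /** Directory holding a package's sources, relative to the source root. */
    static String packageDir(String pkg, char separator) {
        return pkg.replace('.', separator);
    }

    /** Name of the HTML page for one class, e.g. AllPkgs.PkgA.C1.html. */
    static String classPage(String pkg, String className) {
        return pkg + "." + className + ".html";
    }

    /** Name of the HTML index page for one package. */
    static String packagePage(String pkg) {
        return "Package-" + pkg + ".html";
    }

    public static void main(String[] args) {
        System.out.println(packageDir("AllPkgs.PkgA", '\\'));   // prints AllPkgs\PkgA
        System.out.println(classPage("AllPkgs.PkgA", "C1"));    // prints AllPkgs.PkgA.C1.html
        System.out.println(packagePage("AllPkgs.PkgA"));        // prints Package-AllPkgs.PkgA.html
    }
}
```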

The generated HTML will contain references to images in a subdirectory named "images." The images already exist in the "images" subdirectory for the standard Java API documentation from the distribution. If you install the HTML in the same directory as the standard Java distribution, everything will work fine. Otherwise, you will need to either copy the subdirectory containing the images to the directory where you install your documentation, or make a symbolic link to the images subdirectory (UNIX users). Note again, however, that if you do not install in the same directory as the Java API documentation, the links from your documentation to base language pages will be incorrect.

Make sure all classes compile properly before attempting to generate any documentation. Javadoc spits out all sorts of errors when working with incorrect code.

The @see tag produces references to the indicated classes and functions. Unfortunately, if the reference is to a class or method that is not part of the class in which the @see occurs, javadoc produces HTML that displays only the final class or method name instead of the fully qualified name. This can be confusing if there are similarly named methods in different classes. The reference is correct, and you can decipher it by examining the tag if the HTML browser displays it, but it is not obvious when reading the text.

The @version and @author tags cause entries to appear in the generated HTML files only if the undocumented command line options -version and -authors are used.

If you use some form of source control, you will probably want to include the source control information on the @version line.
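For example, assuming RCS or CVS is the source control system (the article does not name one), the keyword $Revision$ is expanded on checkout, so the @version line stays current without manual editing. The class NameFormatter and the revision number shown are placeholders for illustration.

```java
/**
 * Formats user names for display.
 *
 * @version $Revision: 1.4 $
 * @author Your Name
 */
class NameFormatter {

    /**
     * Join the parts of a name in "last, first" order.
     *
     * @param last Last name
     * @param first First name
     * @return The combined display name
     */
    static String display(String last, String first) {
        return last + ", " + first;
    }
}
```

As noted above, the @version entry appears in the generated HTML only when javadoc is run with the -version option.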

Some (Hopefully Productive) Design Criticism

Unfortunately, doc comments are only relevant immediately preceding a class, variable, or method definition; and the javadoc tool only deals with these comments. It is well known that documentation removed from the immediate source line to which it applies is more prone to error than documentation directly associated with the code. When the source line changes, the related comment line is frequently overlooked. It would have been much better if the doc comment had been allowed in the source line of interest, as well as before it. Consider the method definition and associated doc comment in Figure 4.

During the development cycle, arguments, return values, and method names change. This virtually guarantees that some of the documentation will be wrong because it is not located on the same physical line as the quantity to which it refers. The number of incorrect comments will increase over time in most situations. With the current doc comments, there is no guarantee that the parameter name listed in the doc comment matches the name used in the function prototype. "Out of sight, out of mind" applies to these situations, and the doc comment will rapidly be scrolled off the page as the programmer concentrates on the task at hand: the code below the definition.

If doc comments were allowed in the definition itself, this problem could be avoided. As Figure 5 shows, the doc comment for each quantity of interest is on the same line as the item of interest; as a result, a programmer is much more likely to see it and update it. The hypothetical @method tag could be automatically replaced by the method name. Perhaps a future revision will allow this. It requires slightly more intelligent parsing of the source, but certainly nothing particularly difficult, and it would increase the reliability of the automatically generated documentation. An obvious enhancement would be to allow //* to introduce a single-line doc comment.

Other Problems with Doc Comments

A related problem comes from the current lack of good supporting development editors. The most likely source for the original text of a doc comment is an existing doc comment. Perhaps 99 percent of all comments of this type are generated by cutting and pasting the text from an existing method and then editing it. Unfortunately, there are too many situations where the text gets cut and pasted, but not edited. Smart editors could automatically insert appropriate doc comments when a new public function is introduced and could make sure each element is at least modified (or flagged) before the file is saved. They could also force a check when the type of a parameter, exception, or return value is modified or a new one is added.

Nonpublic Classes in a Source File

While a Java source file is limited to containing only one public class definition, it may contain many private class definitions that are not exported. These private class definitions may also contain doc comments. When javadoc is invoked and passed the name of a Java source file, it creates a separate HTML documentation file for each class defined in the source file, regardless of whether the classes are public or private. When invoked with a package name, javadoc does not generate the HTML files for private classes.
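A source file of this kind might look like the following sketch. At the top level, the nonpublic classes referred to above are classes declared without the public modifier. Roster and RosterEntry are hypothetical names used only for illustration.

```java
// In a real source file named Roster.java this class would be declared
// "public"; the modifier is omitted so the sketch compiles under any file name.
/**
 * A list of member names.
 * This is the class the file exports.
 */
class Roster {
    private final RosterEntry[] entries;

    /**
     * Build a roster from an array of names.
     * @param names Member names, in display order
     */
    Roster(String[] names) {
        entries = new RosterEntry[names.length];
        for (int i = 0; i < names.length; i++) {
            entries[i] = new RosterEntry(names[i]);
        }
    }

    /**
     * Report how many members are listed.
     * @return Number of entries in the roster
     */
    int size() {
        return entries.length;
    }
}

/**
 * One row of the roster.
 * Not exported from the package, but still carries a doc comment.
 */
class RosterEntry {
    final String name;

    /**
     * Create an entry.
     * @param name Member name
     */
    RosterEntry(String name) {
        this.name = name;
    }
}
```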

Javap: A Last Resort and Sanity Check

What happens if you get a binary distribution of a set of Java classes and the documentation doesn't match the binary? Or if there is no documentation at all? The standard Java distribution includes a disassembler called "javap" to address these problems. When invoked with the name of a class, javap searches the CLASSPATH for a corresponding compiled class definition and prints out the class definition showing a partial signature of all public methods and all fields. Unfortunately, this is seldom useful, since the formal parameter names are missing, and only their types remain. You can find out the method names and their parameter and return value types, but you have no argument names to give you any hint of what the parameters mean. The exceptions are not listed either, although this is not a big problem because the compiler will tell you if you forget about one of them.
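The loss of parameter names can also be seen from within Java. The sketch below does not use javap; it uses the reflection API (java.lang.reflect, which arrived in JDK releases after the 1.0 version discussed here) to print roughly the same kind of information: return type, method name, and parameter types. The Directory class and its find method are hypothetical.

```java
import java.lang.reflect.Method;

/** Hypothetical class whose compiled signature is inspected below. */
class Directory {
    /** The parameter names here do not appear in the printed signature. */
    public String find(String last, String first, int maxResults) {
        return last + ", " + first;
    }
}

/** Prints a javap-like signature: types and method name, no parameter names. */
class SignatureDump {

    static String describe(Method m) {
        StringBuilder sb = new StringBuilder();
        sb.append(m.getReturnType().getName()).append(' ');
        sb.append(m.getName()).append('(');
        Class<?>[] types = m.getParameterTypes();
        for (int i = 0; i < types.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(types[i].getName());   // only the type is used here
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) throws NoSuchMethodException {
        Method m = Directory.class.getMethod("find", String.class, String.class, int.class);
        // prints: java.lang.String find(java.lang.String, java.lang.String, int)
        System.out.println(describe(m));
    }
}
```

Nothing in that output says which String is the last name and which is the first, which is exactly the gap that doc comments are meant to fill.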

It would have been useful if the compiled class file contained the doc comments as well as the code; then the binary version of a class could be a truly self-contained unit. Doing so would have significantly increased the size of a binary file, with corresponding increases in the time required to download a class over the Net; though it would be easy to strip out doc comments on a download for the purposes of execution. It would have made sense to allow inclusion of doc-comment information at least as an option to the compiler.

A Complete Documentation Set

By taking a little care and using appropriate doc comments instead of normal comments in your Java source code, you can get two jobs done at once: documenting your code for future reference by yourself and your coworkers, and producing real, usable online documentation for other users of your classes through the use of javadoc.

Acknowledgment

Many thanks to Peter Parnes for his help deciphering some of the undocumented options for javadoc.

Figure 1: Two proper format styles for doc comments.

(a)   /**
       * Summary line
       * Zero or more lines of detail text; may include
       * HTML and multiple paragraphs
       * @xxx  zero or more tag lines, as appropriate
       */

(b)   /**
      Summary line
      Zero or more lines of detail text; may include
      HTML and multiple paragraphs
      @xxx  zero or more tag lines, as appropriate
      */

Table 1: Doc comment tags: (a) class-definition tags; (b) variable-definition tags; (c) method-definition tags

      Tag                                             Example

(a)   @see class-name                                 @see MyClass
      @see class-name#method                          @see MyClass#func1
      @see fully-qualified-class-name                 @see MyPkg.MyClass
      @see fully-qualified-class-name#method          @see MyPkg.MyClass#func1
      @version arbitrary-text                         @version 1.0
      @author arbitrary-text                          @author Your Name

(b)   @see (any of the above forms)

(c)   @see (any of the above forms)
      @return arbitrary-text                          @return Number of gophers
      @param parameter-name arbitrary-text            @param n Number of items needed
      @exception exception-class-name arbitrary-text  @exception MyException When something happens...




Figure 3: Typical output files.

java\JDK_1_0\api\AllPkgs.PkgA.C1.html         AllPkgs.PkgA.C1 doc
java\JDK_1_0\api\AllPkgs.PkgB.C2.html         AllPkgs.PkgB.C2 doc
java\JDK_1_0\api\AllPkgs.PkgB.C3.html         AllPkgs.PkgB.C3 doc
java\JDK_1_0\api\AllNames.html                Index of all classes, methods, and variables
java\JDK_1_0\api\tree.html                    Class hierarchy
java\JDK_1_0\api\packages.html                Index to all packages
java\JDK_1_0\api\Package-AllPkgs.PkgA.html    Index to individual class doc in PkgA
java\JDK_1_0\api\Package-AllPkgs.PkgB.html    Index to individual class doc in PkgB

Figure 4: Typical doc comment and associated method definition.

/**
 * Locate a complete user description by user name.
 * @return     Complete description of a user.
 * @param      last    Last name, arbitrary text
 * @param      first   First name, arbitrary text
 * @param      middle  Middle name(s), arbitrary text
 * @exception  NoUserException  User is not in the database
 */
public User findUser( String last, String first, String middle )
    throws NoUserException
{
    // code to implement method
}

Figure 5: Placing the doc comment for each quantity of interest on the same line as the item of interest would encourage comment updating.

/**
 * @method attempts to locate a user in the global user database.
 */
public User                /** Complete description of a user. */
  findUser(                /** Locate a complete user description by user name */
  String  last,            /** Last name, arbitrary text */
  String  first,           /** First name, arbitrary text */
  String  middle)          /** Middle name(s), arbitrary text */
  throws  NoUserException  /** User is not in the database */
{
  // code to implement method
}
