Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

JVM Languages

Java and Internet Programming


AUG95: Java and Internet Programming

Similar to C and C++, but much simpler

Arthur is a staff engineer at Sun Microsystems and has been involved in the development of the Java language for two years. He is the author of the Java compiler and one of the architects of the HotJava WWW browser. Arthur can be contacted at [email protected].


In the early days of the Internet, most machines ran some dialect of UNIX. Consequently, most software written in languages such as C or ANSI C was relatively portable, and most programs were distributed with source, so they could be compiled for the user's preferred operating environment.

While the exponential growth of the Internet has opened up new and exciting opportunities, its newly heterogeneous nature hinders the distribution of software in binary format. Also, as more PCs, Macs, and other non-UNIX machines connect to the Internet, porting is becoming more difficult. But porting is not the only barrier to distributing software on the Internet. What about security? Have you ever downloaded a binary from a public ftp site and executed it on your machine? There are no guarantees that such a program won't steal your password or delete a critical file. Even when the software comes from a respected vendor, it can be modified as it is transported over the Internet.

In 1990, a small team of engineers at Sun, headed by James Gosling, started developing software for the consumer-electronics market. Initially the team used C++, but the wide variety of hardware architectures used in consumer electronics--coupled with the requirement for robustness--made this problematic. The team therefore developed a new language called "Java," which, it turns out, also addresses many of the issues of software distribution on the Internet.

Java is a simple, object-oriented, multithreaded, garbage-collected, secure, robust, architecture-neutral, portable, high-performance, dynamic language. The language is similar to C and C++ but much simpler. Java programs are compiled into a binary format that can be executed on many platforms without recompilation. The language contains mechanisms to verify and execute binary Java programs in a controlled environment, protecting your computer from potential viruses and security violations.

To demonstrate the capabilities of Java, Sun has developed a WWW browser called "HotJava," written entirely in Java itself, which has the unique capability to execute interactive content embedded in HTML pages. These interactive programs are automatically downloaded with the HTML document and enable the extension of the browser in a natural way. Existing examples of embedded applications include animations, simulations, teaching tools, spreadsheets, and the like. Both HotJava and Java are available at http://java.sun.com/ (or by sending mail to [email protected]).

Java and HotJava are free for noncommercial use. The source for the Java interpreter, compiler, and HotJava browser are also freely available, but there are some restrictions on incorporating Java into commercial products. Check http://java.sun.com/ for licensing terms.

HotJava is an example of extensible Internet software. The browser downloads code from the Internet, allowing the browser to extend its functionality gradually. This is similar to Visual Basic's VBX extensions, except that Java is a real programming language--it is not tied to a particular processor, and it is secure.

The alpha version of Java/HotJava for Solaris was released this spring. Ports for Solaris x86, SunOS 4.x, Windows 95, Window NT, and Macintosh System 7.5 are underway at Sun. Since the release was announced, we have received inquiries about porting Java to Windows 3.1, Linux, AIX, SCO, SGI, HP/UX, DEC Alpha, NextStep, NetBSD, UnixWare, OS/2, Plan 9, OS9, Acorn, Taligent, Amiga, Sega, and vt100s.

Java-Language Overview

The Java syntax and semantics for expressions and statements are almost identical to those of ANSI C. This makes the language easy to learn for someone familiar with C or C++. Listing One is a simple hash-table implementation in Java.

Java's types and semantics are very well defined. Because Java code can run on many platforms, it is important that small differences not be introduced by the underlying hardware. Unlike C and C++, the precision of numeric types is always the same (see Table 1), the order of evaluation of function arguments is always left to right, and the rules for selecting overloaded methods are clear. All of these are poorly defined in C and C++.

We've introduced features that make the language robust. The language supports C++-style exceptions that can be thrown and caught by Java programs. An exception is also generated when dereferencing a null pointer, accessing outside the bounds of an array, or when running out of memory. Java is garbage collected, which means that you never have to explicitly free heap-allocated objects. All this makes it very easy to write robust code that can recover safely from run-time errors.

Java is object oriented. Except for numbers, everything is an object. The language is statically typed, but every object has a type that can be examined at run time. You can cast from one type to another, but an exception is thrown if the run-time types are not compatible.

Java supports a single-inheritance class hierarchy. However, a class can implement multiple interfaces in a way similar to IDL. All Java methods are virtual unless declared static, giving Java a much simpler method-invocation model than C++. Java supports method overloading, but unlike C++, the rules for method matching are much better defined. All this allows for an efficient implementation of method invocations, with some of the flexibility of multiple inheritance.

Java can be used for writing large programs. Java classes are arranged in a modular way using packages. There are no header files! A class is compiled into a binary representation used in further compilations. This means that there are never any inconsistencies between the declared class and the actual implementation. The access to type information, methods, and instance variables is controlled using the public, protected, and private keywords familiar from C++.

It is, of course, possible to interface Java to other languages. This is important when using existing libraries such as Xlib, Motif, or legacy database software. To facilitate this, methods can be declared native and implemented in C or some other native language. The Java run time can be extended to include new native-method implementations by dynamically linking in software libraries. Tools are provided to generate stub functions that enable calls from Java to C and back. The interface between Java and C is natural because the languages are so similar.

Multithreading

Java is fully multithreaded. Preemptive multithreading usually introduces a whole range of gnarly problems, but Java has features that make multithreaded programming straightforward.

For instance, synchronization is a primitive feature of Java. A method declared to be synchronized automatically locks the object on which it is invoked. When the method returns, the lock is automatically released. This happens even when the method raises an exception or the thread running the method is killed. This feature alleviates explicit locking, a source of many programming errors. The locks are always reentrant, so a thread can grab the same lock several times without causing deadlock.

The Hashtable class in Listing One uses synchronized methods to serialize access to the hash table. Only one thread at a time can manipulate the internal state. If two threads use the same Hashtable instance, they will not corrupt the internal state of the Hashtable because all access is automatically serialized by the synchronized methods.

Compilation and Performance

The Java compiler is written in Java itself. It compiles Java methods into bytecodes, which are stored in a class file. The class file also contains all the type information for a class. Byte codes are essentially very simple instructions for a virtual machine. As the name implies, most bytecodes are only one byte long, so bytecode programs are usually a lot smaller than their native equivalent.

There are several advantages to compiling Java into bytecodes. Bytecodes are portable because they do not require a particular processor architecture or other hardware support. They are byte-order independent so that they can be executed on both Big- and Little-endian machines. And because the bytecodes are so simple, it is possible to write a very efficient interpreter for Java bytecodes.

Unlike ordinary machine instructions, Java bytecodes are typed. Each bytecode specifies the exact type of its operands, which makes it possible to apply a simple data-flow algorithm to verify that the bytecodes obey the language constraints.

The Java virtual machine has both a small stack to store temporary expression values and a set of registers for local variables. There are bytecodes for creating new class instances, accessing arrays, invoking methods, handling exceptions, and the like. Listing Two shows a typical method; Listing Three shows what this method looks like when compiled to bytecodes.

The Java interpreter on SPARC is about 15 times slower than compiled C or C++. This is acceptable for most applications, especially if you consider all the benefits of array-bounds checking, null-pointer checking, garbage collection, and so on. However, this is obviously not fast enough to satisfy the real speed freaks, who want to write JPEG decompressors and 3-D rendering algorithms in Java. To squeeze the last bit of performance out of your machine, it is possible to convert the machine-independent bytecodes to machine instructions. This makes Java almost as fast as C or C++ (there is still some extra overhead for unavoidable run-time checks). Listing Four shows the SPARC instructions generated by the example in Listing Three.

The bytecodes can be converted to machine instructions either in advance or when the bytecodes are loaded. The translation is straightforward, but to make full use of the processor's capabilities it may be necessary to do some data-flow and register allocation. It is usually not necessary to convert all bytecodes in your system to machine instructions. Machine instructions are bulky, so the best performance is usually achieved by compiling only a few performance-critical classes.

Garbage Collection

Garbage collection eliminates the need to free heap-allocated objects explicitly. Memory used by objects that are no longer referenced is automatically reclaimed. This significantly reduces the burden on the programmer. Memory leaks are often hard to find and can affect the performance of applications significantly. Java uses a simple, conservative, mark-and-sweep algorithm for garbage collection.

All objects are internally referenced through a handle (a double indirection). This allows the garbage collector to move objects in the heap to avoid fragmentation. An exception is made for objects pointed to directly. In that case, the object is not moved because a native method, machine code, or the interpreter is accessing it directly.

Garbage collection of a 4-MB heap takes an average of 100 milliseconds on a SPARCstation 10. To reduce its impact, garbage collection is done not only when the system runs out of memory, but also periodically when the system is idle. Since most garbage collection is done when the system is idle, interactive applications, or applications that use I/O are rarely impacted. Unless you are doing real-time computation, it is unlikely that you will ever notice the garbage collector. The language is designed to support incremental, multithreaded, and generational garbage collectors, as well.

Java also supports finalization. This is useful when a Java object is associated with a resource not controlled by the garbage collector, such as a file descriptor or a native UI component. Before the object is garbage collected, the finalizer method of the object is called, giving the object a chance to release its external resources. Finalization is used in the implementation of the FileInputStream class in Listing Five. The file descriptor associated with the stream must be closed before the FileInputStream instance is garbage collected.

Security

Today it is common practice to download binaries from the Internet and execute them on your machine. However, no tools are available to verify that the code is well behaved and free of viruses. Java is designed to allow compiled code to be downloaded without introducing a security hazard. As described earlier, bytecodes contain additional type information that make them verifiable. It is possible to check whether the Java language constraints have been broken. For example, it is illegal to treat an integer as a pointer or to load something from the stack when the stack is empty. We have placed additional restrictions on bytecodes to make sure that the verification is somewhat easier than solving the halting problem.

Once a piece of downloaded code has been verified, it is safe to assume that it does not break any of the language constraints. Thus, if a variable is declared private, it really is private. In C++ the private and protected modifiers are merely hints to the programmer and are easily defeated (just put #define private public before the first #include). Not so in Java, where private methods are really private. Bytecode verification makes sure that this rule is observed.

Once it is possible to prove that downloaded bytecodes do not violate the language constraints, it becomes possible to add the next layer of security. At this level you can control what the downloaded code can and cannot do. The checks for this can be written in Java itself because it is impossible for the downloaded code to bypass the checks.

In the future, Java will also provide features for signing code using public-key encryption. This will allow the secure exchange of Java code with trusted partners over the Internet.

Conclusion

I have been coding in Java for nearly two years. I can't remember the last time I was forced to use C++, but I vaguely remember how painful it was. Java is one of the coolest languages to come along in a long time.

Table 1: Simple Java types and their representations.

<b>Type      Meaning</b>
<I>boolean</I>    1 bit
<I>byte</I>       8 bits
<I>char</I>       16 bits (unsigned)
<I>short</I>      16 bits
<I>int</I>        32 bits
<I>long</I>       64 bits
<I>float</I>      32 bits IEEE-754
<I>double</I>     64 bits IEEE-754

Listing One

package containers;
/*  Hashtable collision list. */
class HashtableEntry {
    int hash;
    Object key;
    Object value;
    HashtableEntry next;
}
/* Hashtable class. Maps keys to values. Any object can 
 * be used as a key and/or value. 
 * To sucessfully store and retrieve objects from a hash 
 * table the object used as the key must implement the hashCode() 
 * and equals() methods.
 * This example creates a hashtable of numbers. It uses the names of
 * the numbers as keys:
 * 
 *       Hashtable numbers = new Hashtable();
 *       numbers.put("one", new Integer(1));
 *       numbers.put("two", new Integer(1));
 *       numbers.put("three", new Integer(1));
 * 
 * To retrieve a number use:
 *       Integer n = (Integer)numbers.get("two");
 *       if (n != null) {
 *             System.out.println("two = " + n);
 *       }
 */
public final class Hashtable {
    private HashtableEntry table[];
    private int count;
    private int threshold;
    private final float loadFactor = 0.75;
    /* Construct a new, empty hashtable. */
    public Hashtable() {
        table = new HashtableEntry[101];
        threshold = (int)(table.length * loadFactor);
    }
    /* Return the size of the hashtable */
    public int size() {
        return count;
    }
    /* Gets the object associated with a key in the hashtable. */
    public synchronized Object get(Object key) {
        HashtableEntry tab[] = table;
        int hash = key.hashCode();
        int index = (hash & 0x7FFFFFFF) % tab.length;
        for (HashtableEntry e = tab[index] ; e != null ; e = e.next) {
            if ((e.hash == hash) && e.key.equals(key)) {
               return e.value;
            }
        }
        return null;
    }
    /* Rehashes the content of the table into a bigger table. */
    private void rehash() {
        int oldCapacity = table.length;
        HashtableEntry oldTable[] = table;
  
        int newCapacity = oldCapacity * 2 + 1;
        HashtableEntry newTable[] = new HashtableEntry[newCapacity];
        threshold = (int)(newCapacity * loadFactor);
        table = newTable;
        for (int i = oldCapacity ; i-- > 0 ;) {
            for (HashtableEntry old = oldTable[i] ; old != null ; ) {
                HashtableEntry e = old;
                old = old.next;
                int index = (e.hash & 0x7FFFFFFF) % newCapacity;
                e.next = newTable[index];
                newTable[index] = e;
            }
        }
    }
    /* Puts the specified element into the hashtable, using the specified
     * key. The element may be retrieved by doing a get() with the same key.
     */
    public synchronized Object put(Object key, Object value) {
        // Make sure the value is not null
        if (value == null) {
             throw new NullPointerException();
       }
       // Makes sure the key is not already in the hashtable.
       HashtableEntry tab[] = table;
       int hash = key.hashCode();
       int index = (hash & 0x7FFFFFFF) % tab.length;
       for (HashtableEntry e = tab[index] ; e != null ; e = e.next) {
           if ((e.hash == hash) && e.key.equals(key)) {
           Object old = e.value;
           e.value = value;
           return old;
         }
      }
      if (count >= threshold) {
          // Rehash the table if the threshold is exceeded
          rehash();
          return put(key, value);
      } 
      // Creates the new entry.
      HashtableEntry e = new HashtableEntry();
      e.hash = hash;
      e.key = key;
      e.value = value;
      e.next = tab[index];
      tab[index] = e;
      count++;
      return null;
   }
}

Listing Two

int sum(int array[]) { 
    int sum = 0;
    for (int i = array.length ; --i >= 0 ; ) {
        sum += array[i];
    }
    return sum;
}

Listing Three

Method int sum(int [])
   0    iconst 0        ;; sum = 0
   1    istore R1
   2    aload R0        ;; i = array.length;
   3    arraylength
   4    istore R2
   5    goto 14
   8    iload R1        ;; sum += array[i]
   9    aload R0
  10    iload R2
  11    iaload
  12    iadd
  13    istore R1
  14    iinc R2 -1      ;; if (--i >= 0) then goto 8
  17    iload R2
  18    ifge 8
  21    iload R1        ;; return sum
  22    ireturn

Listing Four

Method int sum(int [])
     save      %sp, -96, %sp                                                      
     tst       %i0                 ;; is the array null?                 
     mov       0, %l0              ;; set the sum to 0                   
     teq       %0, 16              ;; perform a hardware trap on null    
     ld        [%i0 + 4], %g1      ;; get the array length field into %l1
     ba        test                                                      
     srl       %g1, 5, %l1                                               
loop:                                                                    
     ld        [%i0 + 4], %o0     ;; get the array length into %g4       
     srl       %o0, 5, %g4                                               
     sll       %l1, 2, %g3        ;; convert i into an offset            
     cmp       %g4, %l1           ;; is the index out of range           
     tltu      %0, 17             ;; perform a hardware trap if it is    
     ld        [%g2 + %g3],%g1    ;; get the value                       
     add       %l0, %g1, %l0      ;; add it into the sum                 
test:                                                                    
     addcc     %l1, -1, %l1       ;; add -1 to the index                 
     bge       loop               ;; loop if necessary                   
     nop    
     jmpl      %i7 + 8            ;; return the value now in %l0
     restore   %g0, %l0, %o0

Listing Five

class FileInputStream extends InputStream {
    private int fd = -1;
    /** Open a file for reading. */
    private native int open(String name);
    /** Create an input file given a file name. */
    public FileInputStream(String name) {
         fd = open(name);
    }
    /** Read a byte. */
    public native int read();
    /** Close the input stream. */
    public native void close();
    /** Close the stream when the stream is finalized. */
    protected void finalize() {
        close();
    }
}


Copyright © 1995, Dr. Dobb's Journal


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.