import java.*: File Processing

Java provides much of the support for file I/O that you're used to in C/C++, albeit with a little more typing.


February 01, 2001
URL:http://www.drdobbs.com/import-java-file-processing/184403912

February 2001 Java Solutions/import java.*


In the previous installment of this column (C/C++ Users Journal, November 2000) I illustrated the classes in java.io that provide basic byte and character stream I/O. A distinguishing feature of many of these classes is their tiered relationship, implemented via the Decorator pattern. For example, a low-level class such as FileReader opens a file. To add line-oriented I/O capability, you can wrap the FileReader object in a BufferedReader, like this:

FileReader f =
   new FileReader("file.txt");
BufferedReader b =
   new BufferedReader(f);

The basic classes discussed last time all extend an input or output superclass: InputStream and OutputStream for byte streams, or Reader and Writer for character streams. One thing I didn’t mention last time was that the file output classes have an overloaded constructor with a second boolean argument for appending data to files. The program in Listing 1 uses such a FileWriter to implement file logging. Often when writing to a log file it is necessary to open and close the file each time you access it. Although certainly more costly than keeping the file open continuously, it is often the only way to guarantee that all the log data gets written. For this reason the LogFile constructor just stores the name of the log file.

The log method opens a FileWriter in append mode by using the two-arg constructor with a second argument of true. If the file doesn’t exist already, it is created. I decorate that FileWriter with a BufferedWriter, not for buffering (I really don’t want any!), but for its newLine method. You might be tempted to just use a FileWriter and write a '\n' to it to terminate a line, but not all platforms use that character as a line terminator. The technique in Listing 1 ports nicely because newLine queries the system property line.separator for the correct character(s) to push onto the output stream. The test program in Listing 2 uses both LogFile methods to write to a log file.
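To see what that line terminator looks like in isolation, here is a minimal sketch. The class name NewLineSketch and its writeLine helper are hypothetical (they are not among the article's listings), and a StringWriter stands in for the file so the result can be inspected in memory.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;

class NewLineSketch
{
    // Writes one line through a BufferedWriter and returns the raw
    // text, including whatever terminator newLine() produced.
    static String writeLine(String text) throws IOException
    {
        StringWriter sink = new StringWriter();
        BufferedWriter w = new BufferedWriter(sink);
        w.write(text);
        w.newLine();
        w.close();   // flushes the buffer into the StringWriter
        return sink.toString();
    }

    public static void main(String[] args) throws IOException
    {
        String sep = System.getProperty("line.separator");
        String written = writeLine("hello");
        // True when newLine() emitted the platform separator:
        System.out.println(written.equals("hello" + sep));
    }
}
```

On Windows the separator is a carriage return followed by a line feed, and on Unix it is a single line feed, which is why hard-coding '\n' is not portable.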

So much for the basics. I’ll now cover the “rest of the story” for file I/O in Java 2.

Random Access Files

You’ve probably noticed by now that there is no basic stream class that provides both input and output capability simultaneously, like iostream does in C++. C++ can do this because it has multiple inheritance (although iostream also requires virtual inheritance, one of the most confusing features in the C++ language — count your blessings, Java people!). What Java does offer is RandomAccessFile, a class that supports both input and output as well as file positioning. A RandomAccessFile traffics in bytes, not characters, so it provides methods for reading and writing single bytes and byte arrays, although it also can read and write strings (converting to and from bytes, of course). RandomAccessFile also implements the DataInput and DataOutput interfaces, so you can also work with primitive types.
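As a quick illustration of the DataInput and DataOutput side of RandomAccessFile, the following sketch writes two primitives, seeks back to the start, and reads them again through the same object. The class name PrimitiveRoundTrip and the temporary file are assumptions made for the example only.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class PrimitiveRoundTrip
{
    // Writes an int and a double, then seeks back and reads them
    // again; returns "int,double,fileLength".
    static String roundTrip(File file) throws IOException
    {
        RandomAccessFile f = new RandomAccessFile(file, "rw");
        try
        {
            f.writeInt(42);              // DataOutput: 4 bytes
            f.writeDouble(2.5);          // DataOutput: 8 bytes
            f.seek(0);                   // back to the start
            int i = f.readInt();         // DataInput
            double d = f.readDouble();   // DataInput
            return i + "," + d + "," + f.length();
        }
        finally
        {
            f.close();
        }
    }

    public static void main(String[] args) throws IOException
    {
        File tmp = File.createTempFile("demo", ".dat");
        tmp.deleteOnExit();
        System.out.println(roundTrip(tmp));   // prints 42,2.5,12
    }
}
```

The same object serves for both directions, which is the capability the separate stream classes lack.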

A traditional application for random access files processes fixed-length data records, so you can access particular records directly with file positioning. (Database systems used this technique in days of yore.) The program in Listing 3 defines a fixed-size Employee class with the following layout:

Field         Contents        Size
Last name     15 characters   30 bytes
First name    15 characters   30 bytes
Employee no.  1 int           4 bytes

For the convenience of users of the Employee class, it stores the name fields as String objects, but when it comes time to read or write Employee objects, these fields need to be treated as byte arrays. Furthermore, strings over 15 characters must be truncated and shorter ones need to be filled out. (I chose a fill byte of 0xFF, something that wouldn’t occur in user data.) You can see this technique illustrated in the stringToBytes method. To write an Employee record to a RandomAccessFile, the Employee.write method calls stringsToBytes, which builds a buffer large enough for both name fields and calls stringToBytes to fill them, after which it writes the employee number. To read a record back in, Employee.read calls RandomAccessFile.readFully, which fills the fixed-length byte array with the name data. To correctly build each name field string I have to search for the first occurrence of fillByte to determine its length.

As you can see in the test program in Listing 4, to open a RandomAccessFile for both reading and writing, you need to specify a second argument of "rw" in the constructor. After writing a couple of Employee records to the file I move the file pointer between record boundaries by calling RandomAccessFile.seek with the size of the record as an argument. (Seek positions are always relative to the beginning of the file.) This particular example writes two employee records and then swaps them in memory by reading them backwards.
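The arithmetic behind record positioning generalizes easily: record n starts at byte offset n times the record size. The sketch below shows that idea with a deliberately simpler, hypothetical record (an int followed by a long, 12 bytes in all) rather than the Employee layout; the class and method names are mine.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class RecordSeek
{
    // Hypothetical fixed-size record: an int id, then a long value.
    static final int RECORD_SIZE = 4 + 8;

    // Overwrites (or appends) the record at a zero-based index.
    static void put(RandomAccessFile f, int index, int id, long value)
        throws IOException
    {
        f.seek((long) index * RECORD_SIZE);
        f.writeInt(id);
        f.writeLong(value);
    }

    // Reads only the value field of the record at the given index.
    static long getValue(RandomAccessFile f, int index)
        throws IOException
    {
        f.seek((long) index * RECORD_SIZE + 4);   // skip the 4-byte id
        return f.readLong();
    }

    public static void main(String[] args) throws IOException
    {
        File tmp = File.createTempFile("records", ".dat");
        tmp.deleteOnExit();
        RandomAccessFile f = new RandomAccessFile(tmp, "rw");
        try
        {
            for (int i = 0; i < 3; ++i)
                put(f, i, 100 + i, 1000L * i);
            put(f, 1, 101, 5555L);                // update record 1 in place
            System.out.println(getValue(f, 1));   // prints 5555
            System.out.println(f.length());       // prints 36
        }
        finally
        {
            f.close();
        }
    }
}
```

Casting the index to long before multiplying avoids int overflow on large files, since seek takes a long offset.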

Although this is the first month that this column appears in this Java Solutions supplement, and therefore I am not obliged to mention C++ at all, I still can’t resist showing how to do the above in C++ for comparison. The program in Listing 5 accomplishes the same thing as Listings 3 and 4, but in 50 lines instead of 152! In fairness to Java, however, I must admit a lot of safety is inherent in the Java version. For example, there is no danger of overflowing a String or even an array in Java, but if I make an error in my array access in C++, I’m dog meat! The C++ version also lacks the advantages of object-orientation, and if I had implemented a C++ Employee class, then more lines would have resulted as well. Nonetheless, if you’re coming from the C world, one of the first things you notice about Java is its verbosity. Like it or lump it.

The complement to the seek method is RandomAccessFile.getFilePointer, which returns the offset of the current file position as a long [1]. As a final example of the file positioning methods, the program in Listings 6, 7, and 8 illustrates a file viewer: an application that scrolls through a file a screen at a time, both forward and backward [2]. The FileViewer class in Listing 6 uses a read-only RandomAccessFile so it can move around, and a stack to keep track of where it’s been so it can scroll backwards. The constructor opens the file and displays the first screen. The topPos field keeps track of the file position of the first line currently in the display. To scroll down, the next method pushes topPos on the stack and then displays the next screen, while previous undoes that operation.

You might think it strange that I bother to separate the read and display operations, storing the current screen’s lines in an ArrayList (which is like a Vector), instead of just displaying the lines immediately. The reason I do so is to support the last method, which scrolls immediately to the end of the file. I need to read sequentially, stacking each screen as I go, so I can scroll backwards once I reach the end, but I certainly don’t need to display as I go.
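The paging technique can be sketched compactly. What follows is not the FileViewer listing; it is a hypothetical PagerSketch class that I am assuming captures the same idea: read a screen of lines into a list without displaying them, remember each screen's starting offset on a stack, and pop that stack to scroll back. It uses ArrayDeque as the stack, where the article uses LinkedList.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class PagerSketch
{
    private final RandomAccessFile file;
    private final int pageSize;
    private final Deque<Long> history = new ArrayDeque<Long>();
    private List<String> lines = new ArrayList<String>();
    private long topPos = 0;    // offset of the first line on screen
    private long nextPos = 0;   // offset just past the last line read

    PagerSketch(File f, int pageSize) throws IOException
    {
        this.file = new RandomAccessFile(f, "r");
        this.pageSize = pageSize;
        readScreen();
    }

    // Reads up to pageSize lines starting at topPos; displays nothing.
    private void readScreen() throws IOException
    {
        file.seek(topPos);
        lines = new ArrayList<String>();
        String line;
        while (lines.size() < pageSize
               && (line = file.readLine()) != null)
            lines.add(line);
        nextPos = file.getFilePointer();
    }

    // Scrolls down one screen; returns false at end of file.
    boolean next() throws IOException
    {
        if (nextPos >= file.length())
            return false;
        history.push(Long.valueOf(topPos));
        topPos = nextPos;
        readScreen();
        return true;
    }

    // Scrolls back one screen; returns false at the top of the file.
    boolean previous() throws IOException
    {
        if (history.isEmpty())
            return false;
        topPos = history.pop().longValue();
        readScreen();
        return true;
    }

    List<String> lines() { return lines; }

    void close() throws IOException { file.close(); }

    public static void main(String[] args) throws IOException
    {
        File tmp = File.createTempFile("pager", ".txt");
        tmp.deleteOnExit();
        FileWriter w = new FileWriter(tmp);
        w.write("one\ntwo\nthree\nfour\nfive\n");
        w.close();

        PagerSketch p = new PagerSketch(tmp, 2);
        System.out.println(p.lines());   // [one, two]
        p.next();
        System.out.println(p.lines());   // [three, four]
        p.next();
        System.out.println(p.lines());   // [five]
        p.previous();
        System.out.println(p.lines());   // [three, four]
        p.close();
    }
}
```

Because reading is separate from displaying, a scroll-to-end operation could simply call next in a loop until it returns false and then print the final screen.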

The program in Listing 7 provides a simple command-line interface for viewing a file with FileViewer. Just to be useful it allows redundant commands for each operation (such as 'n' and 'd' (down) for viewing the next screen). I must admit that I like the way Java forces me to design in a higher-level, object-oriented fashion. The C version of this program I wrote years ago, while shorter, doesn’t separate the file positioning from the viewing, like the FileViewer and ViewFile classes do. It just came automatically now that I’ve been using Java for a number of years.

In Listing 8 you can see that I implemented a stack with Java 2’s LinkedList class. For more on LinkedList, ArrayList, and other collections, see the September 2000 issue of this column in CUJ.

Exploring the File System

Working with files is often more than just doing input and output. Sometimes you need to know what files are in a directory, or whether a certain file exists at all, or you may need to delete a file. All this and more is possible with the methods of the File class. A File object represents a path, not a file stream. In fact, the corresponding file doesn’t even have to exist, although subsequent operations may fail if that is the case. File objects are based on hierarchical directory structures such as are found in Unix and DOS/Windows [3]. Since Unix uses a forward slash to separate components of a path, and Windows uses a backslash, you can determine these characters at run time via the file.separator system property. The program in Listing 9 shows the properties of interest for file processing; the output is for a Windows 2000 system.
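A minimal sketch of querying such properties at run time follows. The selection of keys is my assumption about which ones matter for file processing, and the PropSketch class and its escape helper are hypothetical names; the helper exists only to make line.separator printable.

```java
import java.io.File;

class PropSketch
{
    // Makes control characters visible so line.separator can be shown.
    static String escape(String s)
    {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ++i)
        {
            char c = s.charAt(i);
            if (c == '\r')      out.append("\\r");
            else if (c == '\n') out.append("\\n");
            else                out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args)
    {
        String[] keys = { "file.separator", "path.separator",
                          "line.separator", "user.dir", "user.home" };
        for (int i = 0; i < keys.length; ++i)
            System.out.println(keys[i] + " = "
                + escape(System.getProperty(keys[i])));

        // The File class exposes the separators as constants as well:
        System.out.println("File.separator     = " + File.separator);
        System.out.println("File.pathSeparator = " + File.pathSeparator);
    }
}
```

Building paths with File.separator, or with the two-argument File constructors, keeps code free of hard-coded slashes.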

A File object can represent either a directory name or a file name, since both are valid path names. You can query which is the case with the isDirectory and isFile methods respectively. You can retrieve the name of the path in two basic forms: absolute and relative. The absolute name of a path is the full path name from its root (e.g., C:\), and the relative name is the last component of the absolute name (such as PropTest.java). An alternate form of absolute name, called the canonical path, is a system-dependent rendition of an absolute path name. Most of the time it is just the same as the absolute path, but on Unix systems, if the absolute path has file system links, then the canonical path will resolve those links to give the true physical path. In other words, a canonical path is more "real" than an absolute path.
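The difference between these forms is easiest to see side by side. The sketch below uses a hypothetical PathForms class and a path with a redundant detour through a "temp" directory; the file need not exist for any of the calls to work, and the exact strings printed depend on your current directory and platform.

```java
import java.io.File;
import java.io.IOException;

class PathForms
{
    // Returns { name, path as given, absolute path, canonical path }.
    static String[] forms(String path) throws IOException
    {
        File f = new File(path);
        return new String[] {
            f.getName(),
            f.getPath(),
            f.getAbsolutePath(),
            f.getCanonicalPath()
        };
    }

    public static void main(String[] args) throws IOException
    {
        // A relative path containing a redundant "temp/.." detour:
        String path = "temp" + File.separator + ".."
                    + File.separator + "PropTest.java";
        String[] labels = { "name", "path", "absolute", "canonical" };
        String[] values = forms(path);
        for (int i = 0; i < labels.length; ++i)
            System.out.println(labels[i] + ": " + values[i]);
    }
}
```

The absolute form simply prefixes the current user directory and keeps the ".." component, while the canonical form resolves it away.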

The File class has methods for listing the contents of a directory, deleting and renaming files, requesting file attributes such as size, time last modified, and a user’s read and write permissions, and for navigating directories. The program in Listing 10 lists the names of the entries in an entire subdirectory tree. If you don’t specify a starting directory, it uses the current user directory. The File.listFiles method returns an array of File objects representing the contents of the given directory; getName returns the relative pathname of an entry. If the entry is a directory, I call the list method recursively. This particular example shows the files that form this article, and a subdirectory named "temp".

Listing 11 shows how you can control which files come back from a call to listFiles. The nested class SuffixFilter implements the FilenameFilter interface, which has a single method: accept(File dir, String name). When you call the overloaded version of listFiles that takes a FilenameFilter, it calls accept for each entry and returns only those for which your accept method returns true. This example reads a suffix from the command line, stores it in the static field ListSomeFiles.suffix, and displays only the matching files from the current directory.
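One consequence of filtering purely by suffix is that subdirectories are filtered out too, unless their names happen to match. If you want a recursive search, one option (sketched here under my own assumptions, with hypothetical names FilterSketch and SuffixOrDirFilter) is a filter that also accepts every directory and takes the suffix through its constructor rather than a static field.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class FilterSketch
{
    // Accepts entries ending in the suffix, plus every directory so
    // that a recursive walk can still descend.
    static class SuffixOrDirFilter implements FilenameFilter
    {
        private final String suffix;

        SuffixOrDirFilter(String suffix) { this.suffix = suffix; }

        public boolean accept(File dir, String name)
        {
            return name.endsWith(suffix)
                || new File(dir, name).isDirectory();
        }
    }

    // Collects the names of matching files under dir, recursively.
    static void collect(File dir, String suffix, List<String> out)
    {
        File[] entries = dir.listFiles(new SuffixOrDirFilter(suffix));
        if (entries == null)
            return;   // not a directory, or it could not be read
        for (int i = 0; i < entries.length; ++i)
        {
            if (entries[i].isDirectory())
                collect(entries[i], suffix, out);
            else
                out.add(entries[i].getName());
        }
    }

    public static void main(String[] args) throws IOException
    {
        // Build a small throwaway tree: A.java, notes.txt, sub/B.java
        File root = Files.createTempDirectory("filter").toFile();
        File sub = new File(root, "sub");
        sub.mkdir();
        new File(root, "A.java").createNewFile();
        new File(root, "notes.txt").createNewFile();
        new File(sub, "B.java").createNewFile();

        List<String> found = new ArrayList<String>();
        collect(root, ".java", found);
        Collections.sort(found);     // listFiles order is unspecified
        System.out.println(found);   // [A.java, B.java]
    }
}
```

The extra isDirectory call costs one file system query per entry, which is the price of letting the walk continue below non-matching directories.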

The ListFiles class in Listing 12 illustrates the informational methods in the File class. It is basically a traditional directory lister that displays directory information in fixed-length columns. If you’re a little rusty on the format classes in java.text, see my article, "Formatted Text and Locales," in the July 2000 issue of this column (in CUJ). The program in Listing 13 shows how easy it is to find a file in a subdirectory tree by applying listFiles recursively. It uses File.getCanonicalPath to print the full pathname of the file.

Summary

Java gives you as much control over files and the file system as a “write once run anywhere” language can claim. Although not necessarily fit for implementing a DBMS, the RandomAccessFile class gives you simultaneous input and output on a file of bytes (more or less the equivalent of an expandable byte array on disk), which can be useful. The File class gives you almost everything you need for navigating and tweaking your file system. It’s not POSIX, but it’s close. Magazine real estate won’t allow me to explore it in this issue, but Java does supply classes that support zip and jar [4] files. Just as a teaser, the program in Listing 14 displays information for each entry in a zip file.

Notes

[1] C/C++ programmers: remember that a long in Java is always 64 bits, which is larger than a long in many C/C++ environments, so there is no practical need for a special type like fpos_t as provided in C.

[2] Yes, I know it’s an antiquated command-line style example, but it’s fun, so bear with me.

[3] Much of File’s functionality is a no-op on the Macintosh.

[4] Jar files are zip files that also contain manifest information. See the September 1999 installment of this column, "Packaging Your Objects," in CUJ for more information on JAR files.

Chuck Allison is a long-time columnist with CUJ. During the day he does Internet-related development in Java and C++ as a Software Engineering Senior in the Custom Development Department at Novell, Inc. in Provo, Utah. He was a contributing member of the C++ standards committee for most of the 1990s and authored the book C & C++ Code Capsules: A Guide for Practitioners (Prentice-Hall, 1998). He has taught mathematics and computer science at six western colleges and universities and at many corporations throughout the U.S. You can email Chuck at [email protected].


Listing 1: LogFile.java — A class for writing log files

import java.io.*;
import java.util.*;

class LogFile
{
    String fileName;

    public LogFile(String fileName)
    {
        this.fileName = fileName;
    }

    public void log(String message)
        throws IOException
    {
        FileWriter file =
            new FileWriter(fileName, true);   // true = append
        BufferedWriter w =
            new BufferedWriter(file);
        try
        {
            w.write(new Date() + ": ");
            w.write(message);
            w.newLine();
        }
        finally
        {
            w.close();   // flush and close even if a write fails
        }
    }

    public void 
    log(String prefix, String message)
        throws IOException
    {
        log(prefix + ": " + message);
    }
}
— End of Listing —

Listing 10: ListAllFiles.java — Lists a subdirectory recursively

import java.io.*;
import java.util.*;
import java.text.*;

class ListAllFiles
{
    static int indentLevel = 0;
    
    public static void main(String[] args)
        throws IOException
    {
        if (args.length > 0)
            list(new File(args[0]));
        else
            list(new File(
                 System.getProperty(
                     "user.dir")));
    }

    static void list(File dir)
        throws IOException
    {
        File[] files = dir.listFiles();
        if (files == null)
            return;   // not a directory, or it could not be read
        ++indentLevel;
        for (int i=0; i < files.length; ++i)
        {
            display(files[i].getName());
            if (files[i].isDirectory())
                list(files[i]);
        }
        --indentLevel;
    }
    
    static void display(String name)
    {
        for (int i=0; i < indentLevel; ++i)
            System.out.print("    ");
        System.out.println(name);
    }
}

/* Output:
    Compare.java
    Employee.java
    employees.dat
    FileViewer.java
    FindFile.java
    ListAllFiles.class
    ListAllFiles.java
    ListFiles.java
    ListSomeFiles.java
    LogFile.java
    logfile1.txt
    LogFileTest.java
    ProcessRecords.java
    PropTest.java
    records.c
    Stack.java
    temp
        bar
        baz
        foo
    Test.java
    ViewFile.java
*/
— End of Listing —

Listing 11: ListSomeFiles.java — Uses a FilenameFilter to list only certain files

import java.io.*;
import java.util.*;
import java.text.*;

class ListSomeFiles
{
   static int indentLevel = 0;
   static String suffix = null;
    
   public static void main(String[] args)
      throws IOException
   {
      if (args.length < 1)
      {
         System.out.println("usage: java ListSomeFiles suffix");
         return;
      }
      suffix = args[0];
      list(new File(
         System.getProperty("user.dir")));
   }

   static void list(File dir)
      throws IOException
   {
      File[] files = 
         dir.listFiles(new SuffixFilter());
      if (files == null)
         return;   // not a directory, or it could not be read
      ++indentLevel;
      for (int i=0; i < files.length; ++i)
      {
          display(files[i].getName());
          if (files[i].isDirectory())
              list(files[i]);
      }
      --indentLevel;
   }
    
  static void display(String name)
  {
     for (int i = 0; i < indentLevel; ++i)
        System.out.print("    ");
     System.out.println(name);
  }
    
  static class SuffixFilter 
     implements FilenameFilter
  {
     public boolean 
     accept(File dir, String name)
     {
        return name.endsWith(suffix);
     }
   }
}

/* Output from 'ListSomeFiles .java':
    Employee.java
    FileViewer.java
    FindFile.java
    ListAllFiles.java
    ListFiles.java
    ListSomeFiles.java
    ListZip.java
    LogFile.java
    LogFileTest.java
    ProcessRecords.java
    PropTest.java
    Stack.java
    Test.java
    ViewFile.java
*/
— End of Listing —

Listing 12: ListFiles.java — Lists directory entries with full information

import java.io.*;
import java.util.*;
import java.text.*;

class ListFiles
{
    public static void main(String[] args) throws IOException
    {
        listRoots();

        // Print current directory name:
        String curDir = System.getProperty("user.dir");
        File dir = new File(curDir);
        System.out.println(dir.getCanonicalPath() + ":");
        System.out.println("\trelative path: " + dir.getPath());
        System.out.println("\tabsolute path: "
                           + dir.getAbsolutePath());
        System.out.println("\tas URL: " + dir.toURL());
        System.out.println("==========");
        
        // List files:
        File[] files = dir.listFiles();
        SimpleDateFormat dateFormat =
            new SimpleDateFormat("MM-dd-yyyy HH:mm:ss");  // HH = 00-23
        DecimalFormat sizeFormat =
            new DecimalFormat("########");
        for (int i = 0; i < files.length; ++i)
        {
            String name = buildColumn(files[i].getName(), 20);
            System.out.print(name + " ");

            String size =
                buildColumn(
                    sizeFormat.format(files[i].length()), 8
                           );
            System.out.print(size + " ");
                
            Date when = new Date(files[i].lastModified());
            System.out.print(dateFormat.format(when) + " ");
            if (files[i].isDirectory())
                System.out.print("d");
            if (files[i].canRead())
                System.out.print("r");
            if (files[i].canWrite())
                System.out.print("w");
            if (files[i].isHidden())
                System.out.print("h");
            System.out.println();
        }
    }

    static void listRoots()
    {
        File[] roots = File.listRoots();
        System.out.println("Roots on system:");
        for (int i = 0; i < roots.length; ++i)
            System.out.println("\t"+roots[i].getAbsolutePath());
        System.out.println();
    }        
    
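    static String buildColumn(String s, int len)
    {
        // Force a string into a fixed-size column (the body was
        // omitted here; a minimal version: right-pad or truncate):
        StringBuilder column = new StringBuilder(s);
        while (column.length() < len)
            column.append(' ');
        return column.substring(0, len);
    }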
}

/* Output:
Roots on system:
        C:\
        D:\

C:\CUJ:
        relative path: C:\CUJ
        absolute path: C:\CUJ
        as URL: file:/C:/CUJ/
==========
Compare.java         414      11-22-2000 11:47:39 rw
Employee.java        2308     11-21-2000 15:51:27 rw
employees.dat        128      11-18-2000 17:59:28 rw
FileViewer.java      1830     11-18-2000 18:03:21 rw
FindFile.java        962      11-21-2000 15:50:08 rw
ListAllFiles.java    1310     11-22-2000 12:04:11 rw
ListFiles.class      2504     11-22-2000 12:10:51 rw
ListFiles.java       2631     11-21-2000 15:47:07 rw
ListSomeFiles.java   1400     11-22-2000 11:26:44 rw
ListZip.java         470      11-18-2000 23:48:01 rw
LogFile.java         578      11-21-2000 15:14:11 rw
logfile1.txt         44       11-21-2000 15:15:55 rw
LogFileTest.java     471      11-21-2000 15:37:27 rw
ProcessRecords.java  1341     11-21-2000 16:05:52 rw
PropTest.java        990      11-22-2000 11:07:15 rw
records.c            1059     11-21-2000 15:54:37 rw
Stack.java           463      11-08-2000 12:00:04 rw
temp                 0        11-22-2000 12:01:27 drw
Test.java            947      11-20-2000 18:36:01 rw
ViewFile.java        1538     11-21-2000 15:55:47 rw
*/
— End of Listing —

Listing 13: FindFile.java — Searches a subdirectory tree for an entry

import java.io.*;

class FindFile
{
    public static void main(String[] args)
    {
        if (args.length < 1)
        {
            System.out.println("usage: java FindFile name [dir]");
            return;
        }
        String dir = (args.length < 2) ? "." : args[1];
        try
        {
            search(new File(dir), args[0]);
        }
        catch (IOException e)
        {
            System.out.println(e.getMessage());
        }
    }
    
    static void search(File dir, String name)
        throws IOException
    {
        File[] files = dir.listFiles();
        if (files == null)
            throw new IOException("not a valid directory");
        for (int i = 0; i < files.length; ++i)
        {
            if (files[i].getName().compareToIgnoreCase(name) == 0)
            {
                System.out.println(files[i].getCanonicalPath());
            }
            if (files[i].isDirectory())
                search(files[i], name);
        }
    }
}

/* Output of 'java FindFile foo':
C:\CUJ\temp\foo
*/
— End of Listing —

Listing 14: List the contents of a zip file

import java.util.*;
import java.util.zip.*;

class ListZip
{
    public static void main(String[] args)
        throws Exception
    {
        ZipFile zf = new ZipFile(args[0]);
        Enumeration files = zf.entries();
        while (files.hasMoreElements())
        {
            ZipEntry z = (ZipEntry)files.nextElement();
            System.out.println(z.getName() + "," +
                               z.getSize() + "," +
                               z.getCompressedSize() + "," +
                               new Date(z.getTime()));
        }
        zf.close();   // release the underlying file handle
    }
}

/* Output from 'java ListZip cuj.zip':
ViewFile.java,1573,533,Sat Nov 18 18:03:42 MST 2000
Employee.java,2433,811,Sat Nov 18 17:57:38 MST 2000
FileViewer.java,1830,533,Sat Nov 18 18:03:22 MST 2000
ListFiles.java,1785,523,Thu Nov 09 17:29:50 MST 2000
ListZip.java,295,204,Sat Nov 18 23:42:56 MST 2000
ProcessRecords.java,1370,519,Sat Nov 18 17:59:20 MST 2000
PropTest.java,216,126,Wed Nov 08 23:26:46 MST 2000
records.c,974,398,Tue Nov 07 22:22:50 MST 2000
Stack.java,463,206,Wed Nov 08 12:00:04 MST 2000
*/
— End of Listing —

Listing 2: LogFileTest.java — Tests the LogFile class

import java.io.*;

class LogFileTest
{
    public static void main(String[] args)
    {
        LogFile log = new LogFile(args[0]);

        try
        {
            log.log("A First message");
            log.log("WARNING", "A second message");
        }
        catch (IOException e)
        {
            System.out.println("Error: " + e.getMessage());
        }
    }
}

/* Contents of args[0]:
Mon Nov 27 13:11:35 MST 2000: A First message
Mon Nov 27 13:11:35 MST 2000: WARNING: A second message
*/
— End of Listing —

Listing 3: Employee.java — A fixed-length employee data record class

import java.io.*;

// Illustrates fixed-length-record I/O

public class Employee
{
    // Attributes:
    int empno;
    String last;
    String first;

    // Class constants:
    static final int LAST_MAX = 15;
    static final int FIRST_MAX = 15;
    static final int size = LAST_MAX*2 + FIRST_MAX*2 + 4;
    static final byte fillByte = (byte) 0xFF;

    public Employee(String last, String first, int empno)
    {
        this.last = last;
        this.first = first;
        this.empno = empno;
    }

    static void stringToBytes(String s, int max, byte[] dest,
                              int offset)
    {
        // Note that max must be even, so we
        //   don't get half a char.
        byte[] bytes = s.getBytes();
        for (int i = 0; i < max; ++i)
        {
            if (i < bytes.length)
                dest[i + offset] = bytes[i];
            else
                dest[i + offset] = fillByte;
        }
    }

    public byte[] stringsToBytes()
    {
        byte[] buffer = new byte[LAST_MAX*2 + FIRST_MAX*2];
        stringToBytes(last, LAST_MAX*2, buffer, 0);
        stringToBytes(first, FIRST_MAX*2, buffer, LAST_MAX*2);
        return buffer;
    }

    public void write(RandomAccessFile f) throws IOException
    {
        f.write(stringsToBytes());
        f.writeInt(empno);
    }

    public void read(RandomAccessFile f) throws IOException
    {
        byte[] buffer = new byte[LAST_MAX*2 + FIRST_MAX*2];
        f.readFully(buffer);

        last = new String(buffer, 0, 
                       findDelim(buffer, 0, LAST_MAX*2));
        first = new String(buffer, LAST_MAX*2,
                       findDelim(buffer, LAST_MAX*2, 
                          FIRST_MAX*2));
        empno = f.readInt();
    }
    
    public String toString()
    {
        return "{" + last + "," + first + "," + empno + "}";
    }

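    int findDelim(byte[] buffer, int start, int max)
    {   // Find first occurrence of 'fillByte' in the max-byte field
        //   at 'start'; the result is the string's length in bytes.
        // (The body was omitted here; this is a minimal version.)
        for (int i = 0; i < max; ++i)
            if (buffer[start + i] == fillByte)
                return i;
        return max;   // no fill byte: the field is completely full
    }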
}
— End of Listing —

Listing 4: ProcessRecords.java — Processes a random access file of employee records

import java.io.*;

class ProcessRecords
{
    public static void main(String[] args)
    {
        Employee e1 = new Employee("doe", "john", 1);
        Employee e2 = new Employee("dough", "jane", 2);

        RandomAccessFile f = null;
        try
        {
            // Create file; add two records:
            System.out.println("Populating file...");
            f = new RandomAccessFile("employees.dat", "rw");
            e1.write(f);
            e2.write(f);
            System.out.println("e1 = " + e1);
            System.out.println("e2 = " + e2);
            System.out.println();

            // Swap on re-reading:
            System.out.println("Reading file...");
            f.seek(Employee.size);
            e1.read(f);
            f.seek(0);
            e2.read(f);
            System.out.println("e1 = " + e1);
            System.out.println("e2 = " + e2);
        }
        catch (IOException e)
        {
            e.printStackTrace();
            return;
        }
        finally
        {
            if (f != null)
            {
                try
                {
                    f.close();
                }
                catch (IOException e)
                {
                    System.out.println(
                        "File close error: " + e
                                      );
                }
            }
        }
    }
}

/* Output:
Populating file...
e1 = {doe,john,1}
e2 = {dough,jane,2}

Reading file...
e1 = {dough,jane,2}
e2 = {doe,john,1}
*/
— End of Listing —
