Java NIO & the iTunes Database

Dmitriy uses the Java NIO package—part of J2SE 1.4—to unravel the iTunesDB format, which is at the heart of the database used in Apple's iPod MP3 player.


December 01, 2003
URL:http://www.drdobbs.com/database/java-nio-the-itunes-database/184405502

Dec03: Java NIO & the iTunes Database

Dmitriy is chief architect at MetricStream. He can be contacted at [email protected].


Free-form data formats such as XML are popular in part because they can be read/edited by humans and understood by different computer systems. Fixed-form binary formats, on the other hand, are cryptic and platform dependent, but nonetheless important where memory and CPU-speed constraints are an issue.

Assembly language and C have traditionally been used when working with binary formats—but not Java. However, the Java NIO package changes this. Delivered as part of J2SE 1.4, Java NIO (short for "New Input/Output") extends the standard I/O package by adding classes for scalable network and file I/O, buffer management, and character-set support. In addition, NIO includes facilities for improving performance and simplifying the parsing of binary data formats, synchronization of data processing, threadless access to large numbers of data requesters, and so on. Consequently, Java has become a viable platform for mobile devices, messaging services with huge numbers of requesters, video/audio compression, legacy binary formats, and the like.

In this article, I examine Java NIO features that are useful for fast parsing of binary data. For the purposes of illustration, I apply the techniques described here to the iTunesDB format, which is at the heart of the database used in Apple's iPod MP3 player. In the process, I present code (available electronically; see "Resource Center," page 5) that lets you export iTunesDB-formatted data to CSV and XML format.

Java NIO

Java NIO doesn't directly deal with general I/O operations. Instead, it relies on the I/O package for traditional read/write operations. However, since buffers are what NIO is all about, the standard I/O stream or direct access read/write operations are extended by buffer operations in NIO. Buffer, NIO's abstract class, has a number of derived classes—the most important for buffer manipulation being ByteBuffer. This class enables more flexible heterogeneous access to buffer content while, at the same time, letting you reserve specific data-type buffers for bulk operations. Most buffer methods return a reference to ByteBuffer, making them convenient for chain operations.

Since all Buffer-based classes are abstract, they can't be instantiated directly. Any byte array can be converted to a buffer using wrap operations. Capacity is an important buffer characteristic. If programs try to retrieve data beyond buffer capacity, then underflow exceptions occur. If programs try to send data beyond buffer capacity, overflow exceptions result. Buffers can be read-only, thereby preventing their content from being changed. Buffers can also be direct or indirect. Direct buffers lead to better I/O performance operations because native methods use the buffer for actual native I/O operations. However, freeing buffers can be problematic for garbage collectors because they may reside in native code. Consequently, it isn't a good idea to frequently allocate/deallocate direct buffers.

ByteBuffer also supports:

In addition to I/O, buffers can be used for primitive data-type conversion, or for storing different data types in one place without creating a specific class for every possible data-type combination. This is similar to unions in C, where the same memory can be interpreted as a set of different data types.

NIO classes involved in actual I/O operations are defined in the nio.channels subpackage. Channel provides access to externally stored data from multiple threads. Traditional read/write operations are extended by buffer manipulation when channels are used. To simplify access to file content, you use MappedByteBuffer, which represents part of a file usually stored in memory. The Channel interface defines a basic channel operation as closed, because a channel should be open when obtained. To check the status of a channel, use the isOpen method. Other interfaces define specific behaviors for channels. Along with interfaces, the NIO package contains basic abstract channel classes used for different I/O operations. Particular implementations give rich sets of methods simplifying concurrent access to channel content. Entire files (or parts of them) can be locked for particular types of operations. There are specific types of channels for pipes and selectors. Most base classes of package I/O let you obtain specific channels. Consequently, the nio.channels package doesn't provide ways for obtaining channels except via the helper class Channels.

Classes defined in nio.charset (another NIO base package) provide methods to convert data to/from different char sets. The classes can work with buffers, which provide a powerful mechanism for interpreting file content in different charsets. You can't do this easily with the basic I/O package. Instead, you have to provide your own implementation if the file content is in different charsets. In addition to conversion, the classes let you provide different types of behavior when decoding/encoding errors occur. There is a mechanism to obtain information about errors without catching exceptions, and Channels and charset are accompanied by corresponding service-provider packages.

Apple's iPod

Apple Computer's iPod (http://www.apple .com/ipod/) is an MP3 player that uses a proprietary data format to store information about music files and playlists on the iPod's hard drive. Although the iPod can appear as a removable drive, you can't just add music files and play them. Instead, you must use special programs to update the iPod's database.

It appears that the iPod's firmware is capable of understanding different types of filesystems, including HFS+ and Windows FAT32. In this article, I assume the iPod supports FAT32. Figure 1 illustrates the iPod's typical directory structure. The iTunes directory contains several files, including iTunesDB (which I focus on here). The Music directory contains 20 subdirectories where actual audio files are stored. Actually storing music files in these directories isn't mandatory.

The iTunesDB file has a tree-like structure, in which each element of the tree is a header. The format of the file looks similar to the hierarchy of atoms used in the QuickTime format. Every header has a similar structure—a 4-byte signature followed by a sequence of 32-bit integers. Little-endian notation is used throughout. A number after the signature indicates the size of the header. If a header has child headers, then the next number indicates the total size of the header—including the size of all children. The headers include the following items.

Figure 4 illustrates the actual header structure flow. For more information on iTunesDB format (Version 0.1b), see http://homepage.ntlworld.com/simon .mason20/ipod_tunes_spec.htm. For additional information on iTunesDB headers, see http://sourceforge.net/docman/?group_id=52976.

NIO and Binary Format Parsing

There are several ways to parse fixed-file binary formats such as iTunesDB. I use NIO to implement three techniques that let you compare coding effort, readability, and performance. Each technique is based on a separate class for reading a header of each type. Since all headers have a common part, it's reasonable to have all header classes inherit from a base class implementing a common task. And since parsing should use a specific knowledge of every particular class, this knowledge is represented by abstract methods that must be executed in particular implementations of headers.

The two knowledge attributes are the basic header size and a header signature. Although size isn't supposed to be a constant, you can assume at times that it is, since the same program can write/read headers. Of course, such assumptions should not be considered when writing software for the iPod because it deals with different formats and header sizes. Access to all other header attributes is via a generic setter/getter method that lets you avoid explicit casting to a header of a particular type when some attributes should be set or retrieved; see Listing One.

Next, you must determine how child headers are read. You can encapsulate this logic in the headers themselves to read all subheaders, or use an external reading procedure. The three approaches I present here use different methods to obtain the content of a buffer, and use header-based or external procedures for parsing. The first approach is based on obtaining the buffer content via Channel's method map; see Listing Two. The second approach is based on sequential reading of a channel to a buffer (Listing Three). These two approaches are used in conjunction with an external parser. A third approach uses buffer-slicing and header-driven parsing; see Listing Four.

In the first approach (Listing Two), the buffer is specified as read-only to potentially increase speed. The buffer-mapping position in iTunesDB is specified as a parameter of the method. Sequential calling of header readings is simplified by using an autoincremental value returned by the method as a position of the beginning of the next header buffer. However, this approach can be problematic when the requested mapped area size is larger than the physical file size. To avoid this, you first need to read a portion of a header into a buffer to determine signature and size. When you get the real size, you can go ahead and read in the rest of header. Listing Three implements the sequential read-buffer technique; in this case, there is no need to explicitly position the reading. In addition, Listing Three illustrates two-step reading that first determines size, then reads the entire header. This approach is similar to standard byte-array reading operations. The buffer-slicing approach (Listing Four) only manipulates buffers because the entire database is read at the start. The method readChildren gets a buffer slice of the first child of the current header. Reading a data header is tricky, since it doesn't contain information about its own size. This information must be extracted from the mhod header. For this purpose, Listing Five adds an extra argument when calling the read function.

Setting a limit to the char buffer is necessary only for the slicing approach because an actual buffer can be larger than you expect. charset manipulations are not required because the Unicode string is already present in the header. Experimentation with charset provided a method for converting a 4-byte signature to integer value when the signature is presented as a four-character string. The parsing process is predetermined, so it's always known what type of a header comes next.

Header signatures are used mainly to ensure the consistency of the database file. Listing Six shows a way to obtain a channel to access iTunesDB. The complete code, including code to export the iPod database to CSV and XML format, is available electronically. This example can also be used to write to iTunesDB. Writing is usually simpler, but can be tricky because the first headers must have size values set, and can be obtained only after the child headers have been formed.

Conclusion

To measure the performance benefits of using the NIO classes, I compared their use to the traditional reading procedure used in MediaChest (http://mediachest.sourceforge.net/), an open-source project of mine. To get both code bases working under similar conditions, I added a similar MediaChest overhead to fill data structures in the NIO implementation. Table 1 shows the results for parsing information for about 5713 songs on a 1-GHz Celeron PC with a USB 2.0 connection.

The fastest implementation was based on reading the entire database in memory using the slicing approach. The function memory-mapped region version performed well, too. The poorest performer was a read-buffer-based implementation, which still worked more than twice as fast as the traditional MediaChest implementation.

DDJ

Listing One

public Object get(int index) {
   if (index >= START_OBJ_INDEX && index <= END_OBJ_IDX)
      return objValues[index - START_OBJ_INDEX];
   if (index >= START_INT_INDEX && index <= END_INT_IDX)
      return new Integer(intValues[index - START_INT_INDEX]);
   throw new IllegalArgumentException("Index " + index + " is out of range.");
}
public int getInt(int index) {
   if (index >= START_INT_INDEX && index <= END_INT_IDX)
     return intValues[index - START_INT_INDEX];
   throw new IllegalArgumentException("Index " + index + " is out of range.");
}

Back to Article

Listing Two

public int read(FileChannel fileChannel, int readPosition) throws IOException {
   ByteBuffer buffer = fileChannel.map(FileChannel.MapMode.READ_ONLY, 
                     readPosition, getSize()).order(ByteOrder.LITTLE_ENDIAN);
   intValues[SIGNATURE] = buffer.getInt();
   if (getSignature() != intValues[SIGNATURE])
   throw new IOException(
      "Unexpected signature 0x"
      + Integer.toHexString(intValues[SIGNATURE])
      + "("
      + asString(intValues[SIGNATURE])
      + ") where 0x"
      + Integer.toHexString(getSignature())
      + " was expected.");
   intValues[SIZE] = buffer.getInt();
   if (buffer.capacity() < intValues[SIZE])
      throw new IOException("Real header size " + intValues[SIZE] + " 
                                            exceeds the buffer capacity.");
   readSpecific(buffer);
      return readPosition + intValues[SIZE];
}

Back to Article

Listing Three

public void read(FileChannel fileChannel) throws IOException {
   ByteBuffer buffer = 
   ByteBuffer.allocateDirect(8).order(ByteOrder.LITTLE_ENDIAN);
   fileChannel.read(buffer);
   buffer.rewind();
   intValues[SIGNATURE] = buffer.getInt();
   if (getSignature() != intValues[SIGNATURE])
      throw new IOException(
         "Unexpected signature 0x"
        + Integer.toHexString(intValues[SIGNATURE])
        + "("
        + asString(intValues[SIGNATURE])
        + ") where 0x"
        + Integer.toHexString(getSignature())
        + " was expected.");
   intValues[SIZE] = buffer.getInt();
   buffer = 
       ((ByteBuffer)ByteBuffer.allocateDirect(intValues[SIZE]).position(8)).
                                               order(ByteOrder.LITTLE_ENDIAN);
   fileChannel.read(buffer);
   buffer.position(8);
   if (buffer.capacity() < intValues[SIZE])
      throw new IOException("Real header size " + intValues[SIZE] + " 
                                              exceeds the buffer capacity.");
   readSpecific(buffer);
}

Back to Article

Listing Four

public void read(ByteBuffer buffer) throws IOException {
   intValues[SIGNATURE] = buffer.getInt();
   if (getSignature() != intValues[SIGNATURE])
      throw new IOException(
         "Unexpected signature 0x"
         + Integer.toHexString(intValues[SIGNATURE])
         + "("
         + asString(intValues[SIGNATURE])
         + ") where 0x"
         + Integer.toHexString(getSignature())
         + " was expected.");
   intValues[SIZE] = buffer.getInt();
   readSpecific(buffer);
   buffer = ((ByteBuffer) 
   buffer.position(intValues[SIZE])).slice().order(ByteOrder.LITTLE_ENDIAN);
   readChildren(buffer);
}

Back to Article

Listing Five

public void readSpecific(ByteBuffer buffer) {
   intValues[SIZE] = getSize(); // 16
   intValues[NUM_ENTRIES] = buffer.getInt();
   intValues[LENGTH] = buffer.getInt();
   intValues[TOTAL_SIZE] = intValues[LENGTH] + intValues[SIZE];
   objValues[DATA - START_OBJ_INDEX] = ((ByteBuffer) 
    buffer.position(intValues[SIZE]).
limit(intValues[SIZE]+intValues[LENGTH])).asCharBuffer().toString();
}

Back to Article

Listing Six

public static final String ITUNESDB_PATH = 
   File.separator + "iPod_Control" + 
   File.separatorChar + "iTunes" +
   File.separatorChar + "iTunesDB";
   String drive = "H:";
   FileChannel fileChannel = new FileInputStream(drive +
                               ITUNESDB_PATH).getChannel();

Back to Article

Dec03: Java NIO & the iTunes Database

iPod_Control\
       iTunes\
           iTunesDB
           iTunesPrefs
           Play Counts
           winPrefs
       Music\
           F00\
              song1.mp3
              ...
           F01\
              ...
           ...
           F19\

Figure 1: The iPod's directory structure.

Dec03: Java NIO & the iTunes Database

0: 6D 68 69 74
4: A0 00 00 00
8: 0E 02 00 00
12:05 00 00 00 num attr
16:12 00 00 00 id
20:01 00 00 00 ?
24:00 00 00 00
28:00 00 00 00 vbr
32:74 5F 05 BB date created
36:FD A8 4C 00 file length in bytes
40:87 CA 04 00 length in ms
44:00 00 00 00
48:00 00 00 00
52:00 00 00 00
56:80 00 00 00 bitrate
60:00 00 00 00
64:00 00 00 00
68:00 00 00 00
72:00 00 00 00
76:00 00 00 00
80:00 00 00 00
84:00 00 00 00
88:00 00 00 00
92:00 00 00 00
96:00 00 00 00
100:00 00 00 00
104:72 5F 05 BB date modified
108:00 00 00 00

Figure 2: Partial hex dump showing header format.

Dec03: Java NIO & the iTunes Database

1	song title 	Unicode string
2	song file	Mac path on iPod filesystem (Unicode string). There is a
                limitation on file name length by 32 characters, however 
                complete path can be much longer.
3	album name 	Unicode string
4	artist	Unicode string
5	genre 	Unicode string
6	file type	Unicode string
7	equalizer	setting
8	commentary	Unicode string
12	composer	Unicode string
100	identifier	Integer

Figure 3: Value/meaning combinations.

Dec03: Java NIO & the iTunes Database

mhbd
      mhsd                     mhsd
          mhlt                     mhlp
             mhit                     mhyp       mhyp
                 mhod string              mhip        mhip    
                 mhod string              mhip        mhip    
             mhit     
                 mhod string              mhip        mhip    
                 mhod string              
             mhit
                 mhod string

Figure 4: Header structure flow.

Dec03: Java NIO & the iTunes Database

Table 1: Reading time in ms.

Terms of Service | Privacy Statement | Copyright © 2024 UBM Tech, All rights reserved.