Multithreaded File I/O


Sequential Read

For sequential access, I read files from the first byte to the last. With multiple files, each thread read one separate 20MB file. For single-file access, a 200MB file was divided into slices of equal length and each thread read one of these slices.
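The slicing scheme for the single-file case can be sketched as follows. This is a minimal Python illustration, not the code used for the measurements; the function names `read_slice` and `read_file_in_slices` and the 1MB chunk size are assumptions made for the example.

```python
import os
import threading

def read_slice(path, offset, length, results, index, chunk_size=1 << 20):
    """Read `length` bytes starting at `offset`, front to back."""
    total = 0
    with open(path, "rb") as f:  # each thread uses its own file handle
        f.seek(offset)
        while total < length:
            data = f.read(min(chunk_size, length - total))
            if not data:  # unexpected end of file
                break
            total += len(data)
    results[index] = total  # number of bytes this thread actually read

def read_file_in_slices(path, num_threads):
    """Divide the file into equal slices; one thread reads each slice."""
    size = os.path.getsize(path)
    slice_len = size // num_threads
    results = [0] * num_threads
    threads = []
    for i in range(num_threads):
        offset = i * slice_len
        # The last slice absorbs any remainder so that every byte is covered.
        length = size - offset if i == num_threads - 1 else slice_len
        t = threading.Thread(target=read_slice,
                             args=(path, offset, length, results, i))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    return sum(results)  # total bytes read across all threads
```

Wrapping a call such as `read_file_in_slices("data.bin", 4)` in a timer would give one data point per thread count; `data.bin` is a placeholder name. The multiple-files case is the same idea with one whole file per thread instead of one slice.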

Figure 1(a): Read multiple files sequentially.

Figure 1(b): Read single files sequentially.

In short, the results show that:

  • On single disks, performance decreases significantly with additional threads
  • Only on the RAID system does using up to four threads increase performance significantly

This result was surprising because I had recently introduced multithreading into a large application and reduced the load time for large datasets by more than 35%, even on the laptop. The reason for this is obvious: The application did not only read the file. It also processed the data it read, stored it in arrays and lists, and so on. With multiple threads, both cores were utilized by up to 100%. In this case, the additional threads did not improve the performance of file access itself, but overall performance increased, even though the main task of the process was reading files.

Random Read

For random read access, I positioned the file pointer at a random position within the file, then read a block of 512 bytes. I did this 10,000 times for each file in the case of multiple files. In the case of a single file, the 10,000 accesses were divided among all threads.
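As a rough illustration of the single-file case, the Python sketch below divides a fixed number of 512-byte reads among the threads. It is not the original test code; `random_reads`, `random_read_single_file`, and the per-thread seeds are hypothetical, and the file is assumed to be at least 512 bytes long.

```python
import os
import random
import threading

BLOCK_SIZE = 512  # block size used in the test description

def random_reads(path, count, seed, results, index):
    """Seek to `count` random positions and read one block at each."""
    rng = random.Random(seed)  # per-thread generator (assumption)
    size = os.path.getsize(path)
    done = 0
    # buffering=0 keeps Python from reading ahead more than one block
    with open(path, "rb", buffering=0) as f:
        for _ in range(count):
            # pick a position where a full block still fits in the file
            f.seek(rng.randrange(0, size - BLOCK_SIZE + 1))
            if len(f.read(BLOCK_SIZE)) == BLOCK_SIZE:
                done += 1
    results[index] = done

def random_read_single_file(path, total_accesses, num_threads):
    """Divide `total_accesses` random reads of one file among the threads."""
    base, extra = divmod(total_accesses, num_threads)
    results = [0] * num_threads
    threads = []
    for i in range(num_threads):
        count = base + (1 if i < extra else 0)  # spread the remainder
        t = threading.Thread(target=random_reads,
                             args=(path, count, i, results, i))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    return sum(results)  # number of full blocks read
```

For the multiple-files case, each thread would instead call `random_reads` with the full 10,000 accesses on its own file.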

Figure 2(a): Read multiple files randomly.

Figure 2(b): Read single files randomly.

Reading multiple files randomly was the only case in which the behavior differed after a reboot: When the files were read for the first time, all machines showed increased performance with more threads -- even the laptop performed best with 32 threads. The reason is that even the laptop's hard drive supports Native Command Queuing, which lets the drive reorder outstanding requests to reduce head movement; this test is a perfect example of the technology at work.

When reading a single file, two threads perform slightly better than one; on the RAID, four threads are better still, but more threads decrease performance on all systems. These results did not differ much after a reboot.

Sequential Write

For sequential access, files were written from the first byte to the last. With multiple files, each thread wrote one separate 20MB file. For a single file, a 200MB file was divided into slices of equal length and each thread wrote one of these slices. All files existed at full length before the test started, but their entire content was overwritten.
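The detail that matters here is that the files already exist at full length and are overwritten in place. A minimal Python sketch of the single-file case, assuming hypothetical helper names `write_slice` and `overwrite_file_in_slices` and an arbitrary fill byte, could look like this:

```python
import os
import threading

def write_slice(path, offset, length, chunk_size=1 << 20):
    """Overwrite `length` bytes starting at `offset`, front to back."""
    block = b"\xAB" * chunk_size  # arbitrary fill pattern (assumption)
    written = 0
    # "r+b" opens the existing file for update without truncating it
    with open(path, "r+b") as f:
        f.seek(offset)
        while written < length:
            n = min(chunk_size, length - written)
            f.write(block[:n])
            written += n

def overwrite_file_in_slices(path, num_threads):
    """Divide an existing file into equal slices; one thread overwrites each."""
    size = os.path.getsize(path)
    slice_len = size // num_threads
    threads = []
    for i in range(num_threads):
        offset = i * slice_len
        # The last slice absorbs any remainder so the whole file is rewritten.
        length = size - offset if i == num_threads - 1 else slice_len
        t = threading.Thread(target=write_slice, args=(path, offset, length))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    return size  # the file length is unchanged by the overwrite
```

Opening with `"wb"` instead would truncate the file first, which would measure allocation of new space rather than overwriting existing content.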

Figure 3(a): Write multiple files sequentially.

Figure 3(b): Write single files sequentially.

The results for multiple files are similar to those for sequential read: Performance generally decreases with multiple threads on single disks, but it increases on the RAID system -- in the case of sequential writing, with up to 8 threads. When writing a single file, the results are surprising: It seems that up to 8 threads do not affect performance on single disks and do increase performance on the RAID system. More than 8 threads always decrease performance.

Random Write

For random write access, I positioned the file pointer at a random position within the file, then wrote a block of 512 Bytes. I did this 10,000 times for each file in the case of multiple files. In the case of a single file, the 10,000 accesses were divided among all threads.
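The multiple-files variant, where each thread performs all of its accesses on its own file, can be sketched in Python as follows. The names `random_writes` and `random_write_multiple_files`, the fill byte, and the seeds are hypothetical; each file is assumed to exist already and to be at least 512 bytes long.

```python
import os
import random
import threading

BLOCK_SIZE = 512  # block size used in the test description

def random_writes(path, count, seed):
    """Seek to `count` random positions and overwrite one block at each."""
    rng = random.Random(seed)  # per-thread generator (assumption)
    size = os.path.getsize(path)
    block = b"\xFF" * BLOCK_SIZE  # arbitrary fill pattern (assumption)
    # "r+b" updates the existing file in place; buffering=0 issues each
    # write to the operating system immediately
    with open(path, "r+b", buffering=0) as f:
        for _ in range(count):
            # pick a position where a full block fits, so the file never grows
            f.seek(rng.randrange(0, size - BLOCK_SIZE + 1))
            f.write(block)

def random_write_multiple_files(paths, accesses_per_file):
    """Start one thread per file; each thread writes only to its own file."""
    threads = [
        threading.Thread(target=random_writes,
                         args=(path, accesses_per_file, i))
        for i, path in enumerate(paths)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Note that these writes still pass through the operating system's file cache; the sketch makes no attempt to force data to the disk.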

Figure 4(a): Write multiple files randomly.

Figure 4(b): Write single files randomly.

The results of random write show that:

  • With multiple files, performance generally increases with more threads. For single disks, there seems to be a saturation with 2-4 threads. For the RAID system, saturation seems to be reached with 8 threads.
  • When writing a single file, two threads perform better than one. On the RAID, four threads are best, but more threads decrease performance on all systems. However, the decrease is less drastic on the RAID system.

