For sequential access, I read files from the first to the last byte. With multiple files, each thread read one separate file of 20MB. For single-file access, a file of 200MB was divided into slices of equal length, and each thread read one of these slices.
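The single-file case can be sketched roughly as follows. This is only an illustration in Python, not the code used for the measurements: the names `slice_ranges`, `read_slice`, and `read_file_in_slices`, the 64 KB chunk size, and the handling of a file size that does not divide evenly are all assumptions.

```python
import os
import threading

def slice_ranges(file_size, num_threads):
    """Split [0, file_size) into contiguous (offset, length) slices.

    Assumption: any remainder is spread over the first slices, so slice
    lengths differ by at most one byte.
    """
    base, extra = divmod(file_size, num_threads)
    ranges = []
    offset = 0
    for i in range(num_threads):
        length = base + (1 if i < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges

def read_slice(path, offset, length, results, index, chunk_size=64 * 1024):
    """Read one slice sequentially, from its first byte to its last."""
    total = 0
    # Each thread opens its own handle so file positions are independent.
    with open(path, "rb") as f:
        f.seek(offset)
        remaining = length
        while remaining > 0:
            data = f.read(min(chunk_size, remaining))
            if not data:
                break  # unexpected end of file
            total += len(data)
            remaining -= len(data)
    results[index] = total

def read_file_in_slices(path, num_threads):
    """Read the whole file with one thread per slice; return bytes read."""
    ranges = slice_ranges(os.path.getsize(path), num_threads)
    results = [0] * num_threads
    threads = [
        threading.Thread(target=read_slice, args=(path, off, ln, results, i))
        for i, (off, ln) in enumerate(ranges)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)
```

Giving every thread its own file handle avoids sharing one file pointer between threads, which would otherwise require locking around each seek and read.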
In short, the results show that:
- On single disks, performance decreases significantly with additional threads.
- Only on a RAID system do additional threads (up to four) increase performance significantly.
This result was surprising because I had recently reduced the load time for large datasets in a large application by more than 35%, even on the laptop, by introducing multithreading. The reason is straightforward: the application did not only read the file. It also processed the data it read, stored it in arrays and lists, and so on. With multiple threads, both cores were utilized up to 100%. In this case, additional threads did not improve file-access performance, but overall performance increased, even though the main task of the process was reading files.
For random read access, I positioned the file pointer at a random position somewhere within the file, then read a block of 512 bytes. I did this 10,000 times for each file in the case of multiple files. In the case of a single file, the 10,000 accesses were divided among all threads.
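A minimal sketch of such a random-read loop, again in Python and purely illustrative. The function names and the `seed` parameter are hypothetical, and the sketch assumes the random position is chosen so that a full 512-byte block always fits inside the file, which the description above does not specify.

```python
import os
import random

BLOCK_SIZE = 512  # block size used in the tests described above

def random_reads(path, count, seed=None):
    """Perform `count` reads of BLOCK_SIZE bytes at random offsets.

    Assumption: offsets are limited to [0, size - BLOCK_SIZE] so that
    every read returns a full block.
    """
    rng = random.Random(seed)
    size = os.path.getsize(path)
    bytes_read = 0
    with open(path, "rb") as f:
        for _ in range(count):
            f.seek(rng.randrange(0, size - BLOCK_SIZE + 1))
            bytes_read += len(f.read(BLOCK_SIZE))
    return bytes_read

def split_accesses(total, num_threads):
    """Divide `total` accesses among threads as evenly as possible."""
    base, extra = divmod(total, num_threads)
    return [base + (1 if i < extra else 0) for i in range(num_threads)]
```

For the single-file case, each thread would call `random_reads` on the same path with its share from `split_accesses(10_000, num_threads)`; for multiple files, each thread would run all 10,000 accesses on its own file.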
Reading multiple files randomly was the only case where the behavior differed after a reboot: when reading the files for the first time, all machines showed increased performance with more threads; even the laptop performed best with 32 threads. The reason is that even the laptop's hard drive supports Native Command Queuing, which lets the drive reorder outstanding requests to reduce head movement. This is a perfect example of that technology at work.
When reading a single file, two threads perform slightly better than one; on the RAID, four threads are better still, but more threads decrease performance on all systems. These results did not differ much after a reboot.
For sequential access, files were written from the first to the last byte. With multiple files, each thread wrote one separate file of 20MB. For a single file, a file of 200MB was divided into slices of equal length, and each thread wrote one of these slices. All files existed in full length before the test started, but their entire content was overwritten.
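The point that the files already existed in full length matters for how they are opened: the file must be opened for update without truncation. A rough Python sketch of the single-file case follows; the names, the fill byte, and the chunk size are hypothetical, and the file size is assumed to be a multiple of the thread count.

```python
import os
import threading

def write_slice(path, offset, length, fill=b"\xAA", chunk_size=64 * 1024):
    """Overwrite one slice of an existing file, first byte to last."""
    # "r+b" opens for update without truncating, so the file keeps
    # its full length while its content is replaced.
    with open(path, "r+b") as f:
        f.seek(offset)
        remaining = length
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(fill * n)
            remaining -= n

def overwrite_in_slices(path, num_threads):
    """Overwrite the whole file using one thread per equal-length slice."""
    size = os.path.getsize(path)
    slice_len = size // num_threads  # assumes size divides evenly
    threads = [
        threading.Thread(target=write_slice,
                         args=(path, i * slice_len, slice_len))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Opening with `"wb"` instead would truncate the file to zero length first, which would turn the test into one that also measures file growth and allocation.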
The results for multiple files are similar to those for sequential read: performance generally decreases with multiple threads on single disks, but it increases on a RAID system; for sequential writing, this holds up to 8 threads. When writing a single file, the results are surprising: up to 8 threads do not seem to affect performance on single disks and do increase performance on a RAID system. More than 8 threads always decrease performance.
For random write access, I positioned the file pointer at a random position within the file, then wrote a block of 512 bytes. I did this 10,000 times for each file in the case of multiple files. In the case of a single file, the 10,000 accesses were divided among all threads.
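The single-file variant, with the accesses divided among threads, might look like the following Python sketch. It is an illustration under assumptions, not the benchmark code: the function names are hypothetical, the written block is all zero bytes, offsets are restricted so the file never grows, and integer division drops any remainder when the accesses do not divide evenly.

```python
import os
import random
import threading

BLOCK_SIZE = 512  # block size used in the tests described above

def random_writes(path, count, seed=None):
    """Write `count` blocks of BLOCK_SIZE bytes at random offsets."""
    rng = random.Random(seed)
    size = os.path.getsize(path)
    block = b"\x00" * BLOCK_SIZE
    # "r+b" keeps the existing file and its length intact.
    with open(path, "r+b") as f:
        for _ in range(count):
            # Assumption: the whole block must fit inside the file.
            f.seek(rng.randrange(0, size - BLOCK_SIZE + 1))
            f.write(block)

def random_writes_single_file(path, total_accesses, num_threads):
    """Divide the accesses among threads that all write to one file."""
    per_thread = total_accesses // num_threads  # remainder is dropped
    threads = [
        threading.Thread(target=random_writes, args=(path, per_thread, i))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Each thread uses its own file handle and its own random generator, so the threads do not contend for a shared file pointer or shared generator state.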
The results of random write show that:
- With multiple files, performance generally increases with more threads. For single disks, there seems to be a saturation with 2-4 threads. For the RAID system, saturation seems to be reached with 8 threads.
- When writing a single file, two threads perform better than one. On the RAID, four threads are best, but more threads decrease performance on all systems. However, the decrease is less drastic on the RAID system.