The idea of using an SSD to accelerate a file system on rotational media has been around for some time. However, using an SSD for caching may not require complicated solutions like flashcache and bcache, which we tried earlier.
Among all mainstream file systems, only ext4 on rotational media can be accelerated with an SSD safely and effectively. Its full performance potential can be unleashed by using a unique feature: the external journal. That said, it is worth noting that xfs can also use an external journal, but only in theory: xfs cannot be mounted if the external journal device is lost, nor can it be reconfigured to move the journal back to the main device or to change the journal size.
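As a rough sketch of how an external journal could be set up with the standard e2fsprogs tools: the device names `/dev/sdX1` (SSD partition) and `/dev/sdY1` (rotational disk) are placeholders rather than the devices used in the tests, and the helper `plan_external_journal` is hypothetical. The function only prints the commands, so nothing is modified until they are reviewed and run by hand as root.

```shell
#!/bin/sh
# Print (do not execute) the commands that create an ext4 file system
# whose journal lives on a separate device.
# Both device paths are placeholders; substitute your own.
plan_external_journal() {
    journal_dev=$1          # SSD partition that will hold the journal
    data_dev=$2             # partition on the rotational disk
    block_size=${3:-4096}   # journal device and file system must share a block size

    # 1. Format the SSD partition as a dedicated journal device.
    echo "mke2fs -O journal_dev -b $block_size $journal_dev"
    # 2. Create the ext4 file system and attach the external journal to it.
    echo "mke2fs -t ext4 -b $block_size -J device=$journal_dev $data_dev"
}

plan_external_journal /dev/sdX1 /dev/sdY1
```

For an existing file system, the same idea would presumably be expressed with `tune2fs -O ^has_journal` on the unmounted file system followed by `tune2fs -J device=...`, rather than a fresh `mke2fs`.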
The first step in optimising ext4 performance is to use the `journal_async_commit` mount option. This automatically enables the `journal_checksum` feature as well. `journal_async_commit` alone gives a nice performance boost, and it is essential to use it with an external journal.
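A minimal sketch of how the option might be applied, either on the command line or in `/etc/fstab`. The device and mount point are placeholders, and both helper functions are hypothetical names that only print text rather than mounting anything.

```shell
#!/bin/sh
# Print the mount command for an ext4 file system with asynchronous
# journal commits. Device and mount point are placeholders.
plan_async_commit_mount() {
    data_dev=$1
    mount_point=$2
    echo "mount -t ext4 -o journal_async_commit $data_dev $mount_point"
}

# Print the matching /etc/fstab line (tab-separated fields).
plan_async_commit_fstab() {
    printf '%s\t%s\text4\tjournal_async_commit\t0\t2\n' "$1" "$2"
}

plan_async_commit_mount /dev/sdY1 /mnt/data
plan_async_commit_fstab /dev/sdY1 /mnt/data
```

After mounting, the options actually in effect can be checked in `/proc/mounts`, where `journal_checksum` should appear alongside `journal_async_commit`.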
Test: iozone (SYNC,O_DIRECT)
Iozone: Performance Test of File I/O
Version $Revision: 3.397 $
Compiled for 64 bit mode.
Build: linux-AMD64
Machine = Linux 3.7-trunk-amd64 #1 SMP Debian 3.7.3-1~experimental.1 x
SYNC Mode.
Include close in write timing
Include fsync in write timing
O_DIRECT feature enabled
File size set to 4194304 KB
Record Size 256 KB
Command line used: iozone -M -o -c -e -t3 -T -I -s 4g -r 256k -i0 -i2 -i6 -i8 -i9 -i11
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Throughput test with 3 threads
Each thread writes a 4194304 Kbyte file in 256 Kbyte records
The table below shows results of tests where unbuffered writing was performed using direct access (O_DIRECT) and fsync() after every operation.
file system | effective mount options | Test execution time (less is better) |
---|---|---|
ext4 | rw,relatime,journal_checksum,data=ordered | |
xfs | rw,relatime,attr2,inode64,noquota | |
btrfs | rw,relatime,space_cache | |
btrfs | rw,relatime,compress=lzo,space_cache | |
btrfs | rw,relatime,compress=lzo,space_cache,autodefrag | |
ext4 | rw,relatime,journal_checksum,journal_async_commit,data=ordered | |
nilfs2* | rw,relatime | |
ext4 (external journal_data) | rw,relatime,nodelalloc,journal_checksum,journal_async_commit | |
ext4 (external journal_data_ordered) | rw,relatime,journal_checksum,journal_async_commit | |
* Don't trust the NILFS2 benchmark. This test was run on a freshly created file system. In a real-world scenario the result will probably be at least 2 times worse due to background cleaning.
The first row (the slowest) is the default ext4 behaviour (`journal_checksum` has no measurable effect on performance).
xfs and btrfs are given for comparison.
`journal_async_commit` gives a nice performance boost and puts ext4 ahead of xfs.
The most interesting ext4 benchmarks follow the cleanerless nilfs2 measurement.
The last two measurements were taken with the ext4 external journal placed on a 1906 MiB (3903488 sectors) partition on the SSD.
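The two figures for the journal partition agree if the sectors are 512 bytes each, which is an assumption here since the sector size is not stated. A quick check, using a hypothetical helper `sectors_to_mib`:

```shell
#!/bin/sh
# Convert a sector count to MiB.
# The 512-byte default sector size is an assumption, not stated in the text.
sectors_to_mib() {
    sectors=$1
    sector_bytes=${2:-512}
    echo $(( sectors * sector_bytes / 1024 / 1024 ))
}

sectors_to_mib 3903488   # prints 1906
```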
`journal_data` mode first saves all data to the journal and then flushes it to the main rotational device. In effect all data is written twice, first to the journal and then to the file system. That is why system time is 5 times greater than in the last test, where `journal_data_ordered` mode was used.
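The names `journal_data` and `journal_data_ordered` match the default mount options that `tune2fs -o` accepts, so one plausible way to select the mode is to store it in the superblock. Whether the tests were configured this way is an assumption; the `data=journal` and `data=ordered` mount options would be the equivalent at mount time. The helper `plan_default_data_mode` is a hypothetical name and only prints the command.

```shell
#!/bin/sh
# Print the tune2fs command that stores a default journalling mode in the
# superblock of the given (placeholder) device.
plan_default_data_mode() {
    mode=$1
    data_dev=$2
    case $mode in
        journal_data|journal_data_ordered|journal_data_writeback)
            echo "tune2fs -o $mode $data_dev"
            ;;
        *)
            # Reject anything that is not one of the three journalling modes.
            echo "unknown journalling mode: $mode" >&2
            return 1
            ;;
    esac
}

plan_default_data_mode journal_data /dev/sdY1
plan_default_data_mode journal_data_ordered /dev/sdY1
```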
The very last result demonstrates a fantastic performance boost: over 3 times faster than the ext4 default (first test).
Conclusion
The full performance potential of ext4 can be unleashed with careful use of an external journal. This is a safe and effective method that requires no additional software and achieves performance comparable to less reliable SSD caching solutions.