27 May 2010

Filesystems for the Future: ext4 vs btrfs vs xfs (pt1)

One of the regular mandates of the storage group is to maintain recommendations for disk configuration for optimal performance. Filesystem choice is a key component of this, and the last time we did a filesystem performance shootout, XFS was the clear winner.

It's been a while since that test, though, so I'm embarking on another shootout, this time comparing the current XFS champion against the new filesystems that have emerged since: ext4 and btrfs.
Of course, btrfs is still "experimental" and ext4 is only present in the SL5 kernel as a "technology preview", so in the interests of pushing the crystal ball into the distant future, I performed the following tests on an install of the RHEL6 beta. This should be something like what SL6 looks like... whenever that happens.

For now, what I'm going to present are iozone performance metrics. For my next post, I'll be investigating the performance of gridftp and other transfer protocols (and hopefully via FTS).

So. As XFS was the champion last time, I generated graphs of performance relative to XFS on the same physical hardware for four configurations: ext4, btrfs with defaults, btrfs with compression on and internal checksumming off, and btrfs with just internal checksumming off. Values > 1 indicate performance surpassing XFS; values < 1 indicate performance worse than XFS. Colours indicate the size of file written (from 2GB to 16GB) in KB*.
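As a rough illustration of how such ratios can be computed, here is a minimal sketch. It assumes the iozone throughput figures have already been parsed into a dictionary keyed by filesystem and then by file size in KB; the function name, the dictionary layout, and the numbers in the usage example are all hypothetical placeholders, not the measurements from these tests.

```python
def ratio_to_baseline(results, baseline="xfs"):
    """Return throughput of each filesystem divided by the baseline's.

    `results` maps filesystem name -> {file_size_kb: throughput}.
    This layout is an assumption made for the sketch, not iozone's
    native output format. A ratio > 1 means faster than the baseline.
    """
    base = results[baseline]
    ratios = {}
    for fs, by_size in results.items():
        if fs == baseline:
            continue
        # Only compare file sizes that were measured for both filesystems.
        ratios[fs] = {
            size: throughput / base[size]
            for size, throughput in by_size.items()
            if size in base and base[size] > 0
        }
    return ratios


if __name__ == "__main__":
    # Placeholder numbers purely for illustration (not real results).
    example = {
        "xfs": {2097152: 200.0, 16777216: 100.0},
        "ext4": {2097152: 100.0, 16777216: 50.0},
    }
    print(ratio_to_baseline(example))
```

Dividing by the XFS figure for the same file size keeps the comparison independent of the absolute speed of the underlying hardware.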

On pure performance, therefore, XFS is still the winner, except against btrfs without internal checksum calculation, where btrfs regains some edge. I'm not certain how important we should consider filesystem-level per-file checksum functionality, since there is already a layer of checksum verification present in the data management infrastructure as a whole. (However, note that turning on compression seems to totally ruin btrfs performance for files of this size. I assume that the CPU overhead is simply too great to be offset by the advantage of reading less data from disk.) A further caveat should be noted: these tests are necessarily against an unstable release of btrfs, and may not reflect its final performance. (Indeed, tests by IBM showed significant variation in btrfs benchmarking behaviour with version changes.)

*Whilst data for smaller files was also measured, caching effects are more significant at those sizes, so the comparison there should be made against fsynced writes to get more accurate metrics for a loaded system with a full cache. We expect to pick up such effects with later load tests against the filesystems, when time permits.
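To make the fsync point concrete, the following is a minimal sketch of timing a write that includes the flush to disk, so that the page cache cannot flatter the result. It is not the method used for the graphs above (those come from iozone); the function name and parameters are hypothetical. The buffer is random data rather than zeros so that a compressing filesystem cannot trivially shrink it.

```python
import os
import time


def timed_fsync_write(path, total_bytes, block_size=1024 * 1024):
    """Write `total_bytes` to `path`, fsync, and return bytes per second.

    The timing includes os.fsync(), so data still sitting in the page
    cache is counted as unfinished work rather than as a completed write.
    """
    block = os.urandom(block_size)  # incompressible payload, reused per block
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()             # push Python's userspace buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to push it to the device
    elapsed = max(time.perf_counter() - start, 1e-9)
    return written / elapsed
```

Running this against a file on each mounted filesystem, with the same `total_bytes`, would give figures comparable in spirit to a flush-inclusive benchmark run, though a single-threaded Python loop is only a crude stand-in for a real benchmark tool.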
