27 May 2010

Filesystems for the Future: ext4 vs btrfs vs xfs (pt1)

One of the regular mandates on the storage group is to maintain recommendations for disk configuration for optimal performance. Filesystem choice is a key component of this, and the last time we did a filesystem performance shootout, XFS was the clear winner.

It's been a while since that test, though, so I'm embarking on another shootout, this time comparing the current XFS champion against the new filesystems that have emerged since: ext4 and btrfs.
Of course, btrfs is still "experimental" and ext4 is only present in the SL5 kernel as a "technology preview", so in the interests of pushing the crystal ball into the distant future, I performed the following tests on an install of the RHEL6 beta. This should be something like what SL6 looks like... whenever that happens.

For now, what I'm going to present are iozone performance metrics. For my next post, I'll be investigating the performance of gridftp and other transfer protocols (and hopefully via FTS).

So. As XFS was the champion last time, I generated graphs for the ratio of ext4, btrfs (with defaults) and btrfs (with compression on, and internal checksumming off, and with just internal checksumming off) to xfs performance on the same physical hardware. Values > 1 indicate performance surpassing XFS, values < 1 performance worse than XFS. Colours indicate the size of file written (from 2GB to 16GB) in KB*.













XFS is still the winner, therefore, on pure performance, except for the case of btrfs without internal btrfs checksum calculation, where btrfs regains some edge. I'm not certain how important we should consider filesystem-level per-file checksum functionality, since there is already a layer of checksum-verification present in the data management infrastructure as a whole. (However, note that turning on compression seems to totally ruin btrfs performance for files of this size - I assume that the cpu overhead is simply too great to overcome the file reading advantages.) A further caveat should be noted: these tests are necessarily against an unstable release of btrfs, and may not reflect its final performance. (Indeed, tests by IBM showed significant variation in btrfs benchmarking behaviour with version changes.)


*Whilst data for smaller files is measured, there are more significant caching effects, so the comparison should be against fsynced writes for more accurate metrics for a loaded system with full cache. We expect to pick up such effects with later load tests against the filesystems, when time permits.

14 May 2010

And hotter:

So I forgotten about some of my children ( well more precisely they are progeny since they do not come directly from me but are my descendants.) So some had gone further round the world ans even more descendants have been produced.
I now have 671 Ursula's', 151 Dirks',162 Valery's'; they also have 46 long lost cousins I did not know from cousin Gavin ( well that's what I call him, ATLAS call him group owned datasets.

One problem I am having is that my children have now travelled miles around the world. ( I myself have now been cloned and reside in the main site in the USA.

In total, I now have children in Switzerland, UK, USA, Canada, Austria, Czech Republic, Ireland, Italy, France, Germany, Netherlands, Norway, Poland, Portugal, Romania, Russia, Spain, Sweden, Turkey, Australia, China, Japan and Taiwan.

My Avatar has been counting to calculate how much infomation has been produced by me.
If you remember , I was 1779 files and ~3.1TB in size. I now have299078 unique child files (taking a volume of 21.44TB). Taking into consideration replication, this increases to ~815k files and 88.9TB.

12 May 2010

Usage really hotting up

Whoa! I turn my back for a moment an I suddenly get analysed massively (and not in the Higgsian sense.) they say a week is a long time in politics, its seems it an eternity on the grid. My Friendly masters holding up the world have created a new tool so that I can now easily see how busy my children and I have been.

My usage is now as follows:
I now have 384 children all beginning with user* (now known as Ursula's)
These 384 children have been produced by 129 unique ATLAS users
Of these:
60 only have 1 child
17 have 2 children
12 have 3 children
16 have 4 children
6 have 5 children
7 have 6 children
3 have 7 children
2 have 8 children
3 have 9 children
1 has 14 children
1 has 16 children
1 has produced 24 children!

I now have 61 children all beginning with data* (now known as Dirk's)
I now have 27 children all beginning with valid* (now known as Valery's)