11 January 2011

Who cares about TCP anyway....

Don't worry, I haven't injured myself and I don't need a cut sterilizing; I mean window sizes!
As part of my work on how to speed up individual transfers, I thought I would go back and see what effect changing some of our favourite TCP window settings would have. These are documented at http://fasterdata.es.net/TCP-tuning/

Our CMS instance of Castor is nice, since CMS have separate disk pools for incoming WAN transfers, for outgoing WAN transfers and for internal transfers between the WNs and the SE. This is a great feature, as it means the disk servers in WanIn and WanOut will never have 100s of local connections (a worry I have about setting TCP settings too high), so we experimented to see what the effect of changing our TCP settings would be.

I decided to study the international transfers, as these are the large-RTT transfers and the most likely to benefit from tweaking. Our settings before the change were 64kB for the default and 1MB for the maximum window size.
This led to a maximum transfer rate per transfer of ~60MB/s and an average of ~7.0MB/s.
This appears to be hardware dependent across the different generations of kit.
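
As a rough guide to why window size matters on long paths: a single TCP stream cannot move more than one window of data per round trip, so window / RTT is an upper bound on its rate. The sketch below works that bound out for the 1MB maximum quoted above. The RTT values are hypothetical examples, not measurements from our transfers, and the model ignores packet loss, congestion control and disk speed.

```python
def window_limited_rate(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-stream TCP throughput, in bytes per second.

    At most one full window can be in flight per round trip, so the
    stream cannot exceed window / RTT regardless of link speed.
    """
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes / rtt_seconds


# The 1MB maximum window from the post, read here as 2**20 bytes (assumption).
MAX_WINDOW_BYTES = 1024 * 1024

# Hypothetical round-trip times in milliseconds, for illustration only.
EXAMPLE_RTTS_MS = [20, 100, 150]

for rtt_ms in EXAMPLE_RTTS_MS:
    rate = window_limited_rate(MAX_WINDOW_BYTES, rtt_ms / 1000)
    print(f"RTT {rtt_ms:>3} ms: at most {rate / 1e6:.1f} MB/s per transfer")
```

The longer the round trip, the lower the ceiling for a fixed window, which is why the international transfers are the ones to look at.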
We changed the settings to 128kB and 4MB. This led to an increase in the maximum data transfer rate to ~90MB/s per transfer and in the average to ~11MB/s, so roughly a 50% increase in performance. This might not seem a lot, since we doubled and quadrupled our settings... However, further analysis improves matters. Changing TCP settings is only going to help with transfers where the settings at RAL were the bottleneck.
For channels where the settings at the source site are already the limiting factor, these changes would have a limited effect. However, looking at transfers from FNAL to RAL for CMS, we see a much greater improvement.
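
The bottleneck argument above can be sketched with a toy model: the data in flight is limited by the smaller of what the source will send and what the destination will accept. The source-site limits and the RTT below are hypothetical, chosen only to show the two cases; the function names are mine and the model again ignores loss and congestion control.

```python
KIB = 1024
MIB = 1024 * KIB


def effective_window(receiver_max_bytes: int, sender_max_bytes: int) -> int:
    """The usable window is capped by whichever end has the smaller limit."""
    return min(receiver_max_bytes, sender_max_bytes)


def rate_ceiling(receiver_max_bytes: int, sender_max_bytes: int,
                 rtt_seconds: float) -> float:
    """Window-limited throughput bound in bytes per second."""
    return effective_window(receiver_max_bytes, sender_max_bytes) / rtt_seconds


RTT_SECONDS = 0.1  # hypothetical transatlantic-scale round trip

# Hypothetical source-site maximum buffers (not known for any real site).
scenarios = {
    "source allows 8 MiB": 8 * MIB,
    "source capped at 1 MiB": 1 * MIB,
}

for label, sender_max in scenarios.items():
    before = rate_ceiling(1 * MIB, sender_max, RTT_SECONDS)  # old RAL maximum
    after = rate_ceiling(4 * MIB, sender_max, RTT_SECONDS)   # new RAL maximum
    print(f"{label}: ceiling {before / 1e6:.1f} -> {after / 1e6:.1f} MB/s")
```

In the first scenario RAL was the limit and the ceiling rises with the new maximum; in the second the source cap dominates and nothing changes, whatever we set at RAL.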

Before the tweak, the maximum file transfer rate was ~20MB/s with an average of 6.2MB/s. After the TCP tweak, these increased to 50MB/s and 12.9MB/s respectively.

Another set of sites where the changes helped dramatically were the US Tier-2s transferring to RAL (over the production network rather than the OPN). Before the tweaks, the transfers peaked at 10MB/s and averaged 4.9MB/s. After the tweaks, these values were 40MB/s and 10.8MB/s respectively.
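
To compare the three sets of numbers on the same footing, here is a small sketch that turns the before and after rates quoted in this post into percentage increases. The figures are the approximate ones given in the text (with the US Tier-2 peak read as 10 MB/s), and the dictionary layout and function name are just for illustration.

```python
# (before, after) rates in MB/s, as quoted in the post; values are approximate.
measurements = {
    "All international transfers": {"peak": (60.0, 90.0), "average": (7.0, 11.0)},
    "FNAL to RAL": {"peak": (20.0, 50.0), "average": (6.2, 12.9)},
    # The peak before the tweak is read as 10 MB/s (assumption about units).
    "US Tier-2s to RAL": {"peak": (10.0, 40.0), "average": (4.9, 10.8)},
}


def percent_increase(before: float, after: float) -> float:
    """Relative change from before to after, as a percentage."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (after - before) / before * 100


for channel, rates in measurements.items():
    for kind, (before, after) in rates.items():
        gain = percent_increase(before, after)
        print(f"{channel:28} {kind:8} {before:5.1f} -> {after:5.1f} MB/s "
              f"(+{gain:.0f}%)")
```

Seen this way, the FNAL and US Tier-2 channels more than double on average, compared with the roughly 50% gain across all international transfers.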

Now, putting all these values into a spreadsheet and looking at other values, we get:

Solid line is peak. Dotted line is average.
Green is total transfers.
Red is transfers from FNAL.
Blue is transfers to US T2 sites.
Tests on a pre-production system at RAL also show that the effects of these changes on LAN transfers are acceptable.
