Following recent issues at the RAL T1, we were worried not just about the overall load that ATLAS puts on our SRM through the RAL FTS, but also about the rate at which that load arrives. At ~10pm on 10th November 2011 (UTC), ATLAS went from running almost empty to almost full on the FTS channels involving RAL that are controlled by the RAL FTS server. This can be seen in the plot of the number of active transfers:

This was caused by ATLAS suddenly putting a large number of transfers into the FTS, which can be seen in the "Ready" queue:

This led to a high transfer rate, as shown here:

It is also seen in our own internal network monitoring:

The FTS rate shown covers only transfers going through the RAL FTS; i.e. it does not include puts driven by the CERN FTS, gets from other T1s, or the chaotic background of dq2-get, dq2-put and lcg-cp traffic, none of which appear in these plots. Hopefully this means our current FTS settings can cope with the start of these ATLAS data transfer spikes. We have seen from previous backlogs that large spikes of this typical size lead to a temporary backlog which clears well within a day.
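As an aside, here is a rough sketch of how one might keep an eye on these queues from the command line rather than waiting for the plots to update. It assumes the gLite FTS client tools (glite-transfer-list) are installed and that FTS_SERVICE is replaced with the real RAL FTS endpoint (the URL below is only a placeholder); the exact job states worth counting may differ between FTS versions.

#!/usr/bin/env python
# Sketch: poll the FTS for the number of jobs in the Ready and Active states.
# Assumes glite-transfer-list is on the PATH; FTS_SERVICE is a placeholder URL.
import subprocess

FTS_SERVICE = "https://fts.example.ac.uk:8443/glite-data-transfer-fts/services/FileTransfer"

def count_jobs(state):
    """Return the number of FTS jobs currently in the given state."""
    out = subprocess.Popen(
        ["glite-transfer-list", "-s", FTS_SERVICE, state],
        stdout=subprocess.PIPE).communicate()[0]
    # glite-transfer-list prints one job ID per line
    return len([line for line in out.splitlines() if line.strip()])

if __name__ == "__main__":
    for state in ("Ready", "Active"):
        print("%s: %d" % (state, count_jobs(state)))

Run periodically (e.g. from cron), this would give a crude view of the same "Ready" and active-transfer numbers as the plots above.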