29 May 2008

Monitoring grid data transfers

It has become increasingly clear during the CCRC exercises that GridView is not the place to go if you want a good overview of current Grid data transfers. Simply put, it does not have all of the information required to build such a global view. It is fine if you just want to see how much data is pumping out of CERN, but beyond that I'm really not sure what it is showing me. For a start, no dCache site publishes its gridftp records into GridView. Since there are ~70 dCache sites out there (5 of which are T1s!), a lot of data is missing. Data to and from T2s also seems to be missing: the GridView plots I posted a few days ago show a tiny T2 contribution, which just doesn't match what we are seeing in the UK at the moment.

To get a better idea of what is going on, you really have to head to the experiment portals and dashboards. PhEDEx for CMS, DIRAC monitoring for LHCb and the ATLAS dashboard (sorry, not sure about ALICE...) all give really good information (rates, successes, failures) about all of the transfers taking place.

(That said, I do find the ATLAS dashboard a little confusing - it's never entirely clear where I should be clicking to pull up the information I want.)

You can also go to the FTS servers and get information from their logs. In the UK we have a good set of Ganglia-style monitoring plots. These provide an alternative view to the experiment dashboards, since they show transfers from all VOs managed by that FTS instance. Of course, this doesn't give you any information about transfers not managed by that server, or transfers not managed by any FTS at all. As I've mentioned before, I put together some basic visualisation of FTS transfers, which I find useful for getting a quick overview of the activity in the UK.
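To give a feel for the kind of per-channel, per-VO summary you can build from FTS transfer records, here is a minimal sketch in Python. It is not the code behind the plots mentioned above, and the record layout, site names and the `summarise` function are all made up for illustration; real FTS logs will need their own parsing step first.

```python
from collections import defaultdict

# Hypothetical, already-parsed transfer records (NOT the real FTS log format):
# (source_site, dest_site, vo, bytes_moved, succeeded)
sample_records = [
    ("SITE-A", "SITE-B", "atlas", 2_000_000_000, True),
    ("SITE-A", "SITE-B", "atlas", 1_000_000_000, True),
    ("SITE-A", "SITE-B", "atlas", 0, False),
    ("SITE-A", "SITE-C", "cms", 500_000_000, True),
    ("SITE-C", "SITE-B", "lhcb", 0, False),
]


def summarise(records):
    """Count successes, failures and bytes moved per (source, dest, VO)."""
    summary = defaultdict(lambda: {"ok": 0, "failed": 0, "bytes": 0})
    for source, dest, vo, nbytes, succeeded in records:
        entry = summary[(source, dest, vo)]
        if succeeded:
            entry["ok"] += 1
            entry["bytes"] += nbytes  # only count bytes from completed transfers
        else:
            entry["failed"] += 1
    return dict(summary)


if __name__ == "__main__":
    for (source, dest, vo), stats in sorted(summarise(sample_records).items()):
        total = stats["ok"] + stats["failed"]
        print(f"{source} -> {dest} [{vo}]: {stats['ok']}/{total} ok, "
              f"{stats['bytes'] / 1e9:.1f} GB moved")
```

Keying on source, destination and VO means one pass over the records gives you both the per-experiment view and the per-channel view. The caveat is the same as for the plots: anything that did not go through that FTS instance simply won't appear in the summary.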

Summary: I won't be going back to GridView to find out the state of current data transfers on the Grid.


Gav said...

Anyone tried FTM yet?

This should be publishing completed FTS file records into GridView, keyed by source site / dest site / VO.

Greig A Cowan said...

Any links to FTM?

Jamie said...

We made an acknowledged mistake in the preparation for the May run of CCRC'08 by not stating explicitly the requirement for the FTM (and why...).

More at the post-mortem workshop next week.

We have to publish an updated list of the m/w (and storageware) components, following the experience of May.

The deployment of any new versions has to be carefully planned - this is something I will cover in my intro to the w/s and is described in the LCG Services quarterly report, attached to the agenda item.


Cheers, Jamie