06 December 2007

Storage as seen by SAM



We all need more monitoring, don't we? I knocked up these plots showing the SAM storage test results for the ops VO at GridPP sites over the past month. I am only looking at the SE and SRM tests here, where the result for each day is calculated as the number of successes divided by the total number of tests run that day. The darker green the square, the higher the availability. I think it's clear which sites are having problems.

http://www.gridpp.ac.uk/wiki/GridPP_storage_available_monitoring
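
For anyone curious, the per-day number behind each square is nothing fancier than successes over total. Here is a rough sketch of that calculation; it is not the actual OSG code, and the record layout, the function name `daily_availability` and the site names are all made up for illustration.

```python
from collections import defaultdict
from datetime import date


def daily_availability(results):
    """Return {(site, day): fraction of tests passed}.

    `results` is an iterable of (site, day, passed) tuples. This layout is
    an assumption for illustration, not the real SAM record format.
    """
    counts = defaultdict(lambda: [0, 0])  # (site, day) -> [successes, total]
    for site, day, passed in results:
        counts[(site, day)][1] += 1
        if passed:
            counts[(site, day)][0] += 1
    return {key: ok / total for key, (ok, total) in counts.items()}


# Invented sample records, not real SAM results.
sample = [
    ("SITE-A", date(2007, 12, 1), True),
    ("SITE-A", date(2007, 12, 1), True),
    ("SITE-A", date(2007, 12, 1), False),
    ("SITE-A", date(2007, 12, 1), True),
    ("SITE-B", date(2007, 12, 1), False),
    ("SITE-B", date(2007, 12, 1), True),
]

availability = daily_availability(sample)
for (site, day), fraction in sorted(availability.items()):
    print(site, day.isoformat(), f"{fraction:.2f}")
```

Each fraction then maps onto a shade of green, with 1.0 being the darkest.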

We always hear that storage is really unreliable for the experiments, so I was actually quite surprised at the amount of green on the first plot. However, since these results are only for the short-duration ops tests, I think they do not truly reflect the view that the experiments have of storage when they are performing bulk data transfers across the WAN or a large amount of local access from the compute farm.

These plots were generated thanks to some great Python scripts/tools from Brian Bockelman (and others, I think) from OSG. Brian has also got some interesting monitoring tools for dCache sites, which I'm having a look at. It would be great if we could use something similar in GridPP.
