17 November 2008

What I did in my CERN vacation

"On the first day of my CERN vacation, I arrived." ...Er.
In approximate order of usefulness, this is what I managed to get done at the 2.5-day workshop last week (I'll insert more stuff as I remember...). Other people should report on the non-storage side.

  1. Thanks to the ever-helpful EGEE project office, got my paperwork renewed. Hooray!
  2. Working with Jan to deploy the CIP (CASTOR Information Provider) at CERN. Hooray!
  3. Catching up with the rest of the CASTOR team and getting lots of CASTOR-related notes. There's a lot of new stuff in 2.1.8 (which can be deployed without GSIRFIO), and some strategic planning. xrootd is considered important strategically. Lots of other proposed developments. There is even a suggestion to investigate using Lustre for CASTOR. And how to best provide POSIX access?
  4. Kickstarting some work with DPM and dCache people. Even interoperation work is not completed and there are lots of little things to follow up on. For example, many SEs support xrootd but not necessarily interoperatingly. SE support for checksumming has improved a lot recently so is becoming increasingly promising.
  5. General GridPP-storage related stuff: priorities, experiment planning, site recovery from downtime.
  6. Catching up with Steven Newhouse over dinner Friday, talking about, among other things, standardisation work in EGEE (SRM, GLUE, etc.), and EGI.
  7. Learning about the experiments' storage experiences at the workshop. I heard both "it is too complex" and "we should make use of richer functionality". Made notes we should discuss in the storage meetings. Middleware plans are particularly important when there is no rollback out of an upgrade.
  8. Installed Capacity discussions. This should have been higher up this list as it was my primary reason for being there(!) but we only covered about 5% of what we ought to have covered. Still, 5% is better than 0%, and it was enough to solve 1 of my 7-8 remaining open use cases because dCache does something similar but complicatedlier. More on this later.
  9. FTS monitoring. Talked to Ricardo da Rocha who wants to do visualisation of FTS transfers. Well, whaddayaknow, we (GridPP) have already done this: Gidon built a version for me for the EGEE user forum in 2007 in Manchester, based on his work with the RTM. It uses RGMA monitored transfers, sped up x100 to make it look more interesting. Gidon gave me the source before he left. Unless someone from GridPP wishes to volunteer, I'll hand it over to Ricardo along with a suggestion to keep the "GridPP" sticker and give credit to Gidon.
  10. I am coordinating some additional work on SRM and SRB interoperation (beyond what we've achieved previously, I hope). I discussed my requirements for FTS with Paolo, Àkos, Gavin, et al. Simon Lin was there from ASGC so I got an update on Fu-Min's SRM-to-SRB layer.
  11. I've started using ROOT for some of the testing at RAL (obviously for rootd testing for LHCb) - some new features in PROOF may come in handy if we're to extend this.

That's from a first pass through my notes. Hey ho.

10 November 2008

Edinburgh dCache is dead. Long live DPM!

Today is a sad day in the world of Edinburgh storage. The long-serving and (semi-)reliable dCache storage element has now been retired. It's had a hard life, with many ups and downs, particularly in its youth. However, as it matured it grew into a dependable workhorse for our local physics analyses. Unfortunately, the hardware is now old and creaking and the effects of the credit crunch have even managed to propagate all the way to the top of our ivory tower (i.e., electricity has gone up). These effects combined to force us to put dCache to sleep in a peaceful and humane manner at the end of last week.

However, never fear, all is not lost. We have been successfully using DPM for data access for many months now. We still have the occasional Griddy problem with it, but it is growing up to fill the void left behind by its half brother. Long live DPM!