02 August 2012

DPM-XROOTD and Federated redirection: volume 1

Historically, one of DPM's weaknesses as an SE, from the perspective of some of the LHC VOs, was its lack of xrootd support. (Technically, DPM has supported "xrootd" for some time, but the xrootd release involved has always lagged significantly behind the curve, so DPMs supporting the protocol often couldn't actually provide the functionality expected of them.)

DPM's xrootd support has recently improved significantly. This is partly a result of the recent enthusiasm for federated storage: a concept whereby storage endpoints become part of a redirection hierarchy, so that a request for a file not present locally can be passed up the chain until an endpoint (hopefully a close one) holding the file is found to serve it. It is also a result of the particular enthusiasm of ATLAS and CMS (thanks to their experiments in the US) for xrootd as the mechanism for this.
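As a toy illustration of the redirection idea, here is a minimal Python sketch. It is not the xrootd protocol or any DPM API: the Endpoint class, the resolve function, the site names and the file paths are all hypothetical, and real redirectors locate files by querying subscribed servers rather than by walking an in-memory tree.

```python
class Endpoint:
    """A toy storage endpoint or redirector (hypothetical, for illustration)."""

    def __init__(self, name, files=(), parent=None):
        self.name = name
        self.files = set(files)
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def locate_below(self, path, skip=None):
        """Search this node and its subtree, ignoring the branch already tried."""
        if path in self.files:
            return self
        for child in self.children:
            if child is skip:
                continue
            found = child.locate_below(path)
            if found is not None:
                return found
        return None


def resolve(start, path):
    """Try the local endpoint first, then pass the request up the chain."""
    node, tried = start, None
    while node is not None:
        found = node.locate_below(path, skip=tried)
        if found is not None:
            return found
        # Not found at this level: hand the request to the parent redirector.
        node, tried = node.parent, node
    return None


# Hypothetical hierarchy: one global redirector, two regional ones, three sites.
top = Endpoint("global-redirector")
region_a = Endpoint("region-a-redirector", parent=top)
region_b = Endpoint("region-b-redirector", parent=top)
site_1 = Endpoint("site-1", files={"/vo/local.root"}, parent=region_a)
site_2 = Endpoint("site-2", files={"/vo/nearby.root"}, parent=region_a)
site_3 = Endpoint("site-3", files={"/vo/faraway.root"}, parent=region_b)

for wanted in ("/vo/local.root", "/vo/nearby.root", "/vo/faraway.root"):
    print(wanted, "served by", resolve(site_1, wanted).name)
```

A request made at site-1 is served locally if possible, otherwise by a site under the same regional redirector, and only then by a site reached through the global redirector, which is the "hopefully close" behaviour described above.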

At present, the package is still beta (in particular, the YAIM module is not released yet, so hand configuration is more reliable), but it's been tested on the development SE here at Glasgow (svr025), with some success.

The current release of the dpm-xrootd package still needs to be obtained from an unusual location (instructions here: https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/Xroot/Setup ), but has the advantage that it will work with DPM 1.8.2 disk nodes, as long as the head node is DPM 1.8.3.
As 1.8.3 is EMI-only, this effectively allows you to test the protocol with gLite disk nodes for the first time.

I've recently set this release up on the production SE at Glasgow (svr018, which is precisely this setup: an EMI 1.8.3 head node with a mix of gLite and EMI disk nodes). Some thoughts follow:

1) it is not safe to install dpm-xrootd from the marked repository if your glite-SE_dpm_disk release is less than 1.8.2. One of the dependencies of the package is the 1.8.2 release of dpm-lib, but without the rest of the 1.8.2 packages installed, this will simply break gridftp and rfiod.
Update your node to 1.8.2 first, and then pull in dpm-xrootd.

2) the configuration described in the link above is identical for all disk pool nodes. This means it is much less painful than it might be - test with one disk node then mirror across the others.

3) It appears that, for some reason, the dpm-xrootd package does not like the combination of SL5.5 and glite-SE_dpm_disk: several of our disk pools are on this SL release, and the xrootd service refused to start on them. Updating (yum update) to SL5.7 fixes this, by means currently not fully understood.

4) providing a certificate with a valid ATLAS VOMS role for the LFC lookup is left as an exercise for the reader. This is a requirement of the xrootd redirection framework, not of DPM specifically, and I hope it will go away soon, since it's extremely silly.
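Points 1 and 3 amount to a pre-install check on each disk node, which could be sketched roughly as below. This is only a sketch under assumptions: the preflight function and its constants are hypothetical, the version strings are passed in by hand rather than read from rpm or /etc/redhat-release, and the only SL releases with a known outcome are the two mentioned above (5.5 failing, 5.7 working).

```python
def parse_version(text):
    """Turn a dotted version such as '1.8.2' or '1.8.2-3' into a tuple of ints."""
    dotted = text.strip().split("-")[0]  # drop any package release suffix
    return tuple(int(part) for part in dotted.split("."))


MIN_DISK_NODE_DPM = (1, 8, 2)   # from point 1: older disk nodes break
KNOWN_BAD_SL = {(5, 5)}         # from point 3: xrootd refused to start
KNOWN_GOOD_SL = {(5, 7)}        # from point 3: works after yum update


def preflight(dpm_version, sl_release):
    """Return a list of warnings; an empty list means no known problem."""
    warnings = []
    if parse_version(dpm_version) < MIN_DISK_NODE_DPM:
        warnings.append(
            "DPM %s is older than 1.8.2: update the node before "
            "installing dpm-xrootd, or gridftp and rfiod may break"
            % dpm_version
        )
    sl = parse_version(sl_release)[:2]
    if sl in KNOWN_BAD_SL:
        warnings.append(
            "SL %s is known to fail: xrootd would not start; "
            "updating to SL 5.7 fixed it" % sl_release
        )
    elif sl not in KNOWN_GOOD_SL:
        warnings.append("SL %s has not been tested here" % sl_release)
    return warnings


# Example: a gLite disk node still on DPM 1.8.1 and SL 5.5.
for line in preflight("1.8.1", "5.5"):
    print("WARNING:", line)
```

Returning a list of warnings rather than a simple yes/no keeps the two problems separate, since one is fixed by updating DPM and the other by updating the OS.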

With those caveats in mind, things seem to work fairly well, although this is all in the testing phase for ATLAS (and Europe) for the moment.
