Another security hole in the DPM gridftp server has been found and subsequently patched.
All details of the update can be found here.
The security advisory issued by the GSVG can be found here.
All DPM sites should use YAIM (or your method of choice) to upgrade to the latest version (DPM-gridftp-server-1.6.5-6) ASAP. Depending on how regularly you have been updating, there may also be new rpms available for other components of the DPM (all of these are on 1.6.5-5).
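For sites using the standard repositories, the update itself is just a package upgrade followed by a reconfigure. A minimal sketch (assumes yum and the rpm name from the advisory; substitute your own update and configuration method):

# pull in the patched gridftp server (and any other updated DPM rpms)
yum update DPM-gridftp-server
rpm -q DPM-gridftp-server   # should now report 1.6.5-6
# then re-run YAIM (or your method of choice) against your site configuration for the DPM node type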
23 August 2007
sgm and prod pool accounts
I've been a bit confused of late about the best course of action for dealing with sgm and prod pool accounts on the SEs, dCache in particular. As an example, Lancaster have run into a problem where a user with an atlassgm proxy copied files into the dCache and was correspondingly mapped to atlassgm:atlas (not atlassgm001 etc., just plain old sgm). Non-sgm users have then tried to remove these files from the dCache and have been denied, since they are ordinary atlas001:atlas users and the default dCache file permissions do not allow group write access. This raises a few issues:
1. Why is atlassgm being used to write files into the dCache in the first place?
2. Why are non-sgm users trying to remove files that were placed into the dCache by a (presumably privileged) sgm user?
3. When will dCache have ACLs on the namespace to allow different groups of users access to a bunch of files?
The answer to the 3rd point is that ACLs will be available some time next year when we (finally) get the Chimera namespace replacement for PNFS. ACLs come as a plugin to Chimera.
The interim solution appears to be just to map all atlas users to atlas001:atlas, but this obviously doesn't help with the security and traceability aspects that pool accounts are partly intended to address. Since DPM supports namespace ACLs, we should be OK with supporting sgm and prod pool accounts. Of course, this requires that everyone has the appropriately configured ACLs, which isn't necessarily the case, as we've experienced before.
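On the DPM side, it's those namespace ACLs that make the mixed sgm/plain pool account mappings workable, so it's worth checking them. A rough sketch (the path is an example and the exact ACL entry syntax should be checked against the dpns-setacl man page):

# inspect the ACL on the VO home directory
dpns-getacl /dpm/example.ac.uk/home/atlas
# grant group write access so that any atlas-mapped account can manage the files
dpns-setacl -m g:atlas:rwx /dpm/example.ac.uk/home/atlas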
Comments welcome below.
22 August 2007
Storage accounting - new and improved

We have made some improvements to the storage "accounting" portal (many thanks go to Dave Kant) during the past week or so. The new features are:
1. "Used storage per site" graphs are now generated (see image). This shows the breakdown of resources per site, which is good when looking at the ROC or Tier-2 views.
2. "Available storage per VO" graphs are generated in addition to the "Used" plots that we've always had. This comes with the usual caveats of available storage being shared among multiple VOs.
3. There is a Tier-2 hierarchical tree, so that you can easily pick out the Tier-2s of interest.
4. A few minor tweaks and bug fixes.
Current issues are in savannah.
The page is occasionally slow to load up as the server is also used by the GOC to provide RB monitoring of the production grid. Alternatives to improve speed are being looked at.
15 August 2007
CE-sft-lcg-rm-free released!
A new SAM test is now in production. It does a BDII lookup to check that there is sufficient space on the SE before attempting to run the standard replica management tests. This is good news for sites whose SEs fill up with important experiment data. If the test finds that there is no free space, then the RM tests don't run. Of course, this requires that the information being published into the BDII is correct in the first place. I'll need to check whether this system could be abused by sites that publish 0 free space by default, thereby bypassing the RM tests and any failures that might follow. I suppose that GStat already reports sites as being in a WARNING status when they have no free space.
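Since the test only sees what the GIP publishes, it's worth checking your own numbers. Something like this (the BDII host and site name are examples; GlueSAStateAvailableSpace is the GLUE attribute I'd expect the test to read):

# ask the site BDII what free/used space is being published for each storage area
ldapsearch -x -H ldap://site-bdii.example.ac.uk:2170 -b "mds-vo-name=EXAMPLE-SITE,o=grid" \
  '(objectClass=GlueSA)' GlueSALocalID GlueSAStateAvailableSpace GlueSAStateUsedSpace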
See the related post here.
14 August 2007
CLOSE_WAIT strikes again

Multiple DPM sites are reporting instabilities in the DPM service. The symptoms are massive resource usage by multiple dpm.ftpd processes on the disk servers (running v1.6.5 of DPM). These have been forked by the main gridftp server process to deal with client requests. Digging a little further, we find that the processes are holding many TCP connections in the CLOSE_WAIT state between the DPM and the RAL FTS server. It also happens that all of the dpm.ftpd processes are owned by the atlassgm user, but I think this is only because ATLAS are the main (only?) VO using FTS to transfer data at the moment.
CLOSE_WAIT means that the local end of the connection has received a FIN from the other end, but the OS is waiting for the program at the local end to actually close its connection.
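A quick way to gauge the extent of the problem on a disk server is to count the stuck connections and see which processes own them (a sketch; needs to run as root for the process names to show up):

# list connections stuck in CLOSE_WAIT along with the owning process
netstat -tnp | grep CLOSE_WAIT
# count how many belong to the gridftp children
netstat -tnp | grep CLOSE_WAIT | grep -c dpm.ftpd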
Durham, Cambridge, Brunel and Glasgow have all seen this effect. The problem is so bad at Durham that they have written a cron job that kills off the offending dpm.ftpd processes at regular intervals. Glasgow haven't been hit too badly, but then they do have 8GB of RAM on each of their 9 disk servers!
The DPM and FTS developers have been informed. From emails I have seen it appears that the DPM side is at fault, although the root cause is still not understood. This situation is very reminiscent of the CLOSE_WAIT issues that we were seeing with dCache at the end of last year.
Also see here.
DPM and xrootd
Following on from dCache, DPM is also developing an xrootd interface to the namespace. xrootd is the protocol (developed by SLAC) that provides POSIX access to their Scalla storage system, whose other component is the olbd clustering server.
DPM now has a usable xrootd interface. This will sit alongside the rfiod and gridftp servers. Currently, the server has some limitations (provided by A Peters at CERN):
* the xrootd server runs as a single 'DPM' identity; all file reads and writes are done on behalf of this identity. However, it can be restricted to read-only mode.
* there is no support of certificate/proxy mapping
* every file open induces a delay of 1s as the interface is implemented as an asynchronous olbd Xmi plugin with polling.
On a short time scale the certificate support in xrootd will be fixed and VOMS roles added (currently certificate authentication is broken for certain CAs). After that, the DPM interface can be simplified to use certificate/VOMS proxies and run as a simple xrootd OFS plugin without the need for an olbd setup.
So it seems that xrootd is soon going to be available across the Grid. I'm sure that ALICE (and maybe some others...) will be very interested.
06 August 2007
dcache on SL4
As part of our planned upgrade to SL4 at Manchester, we've been looking at getting dCache running.
The biggest stumbling block is the lack of a glite-SE_dcache* profile; luckily, it seems that all of the needed components apart from dcache-server are in the glite-WN profile. Even the GSIFtp Door appears to work.
05 August 2007
SRMv2.2 directory creation
Just discovered that automatic directory creation doesn't happen with SRMv2.2. Directories are created when using SRMv1.
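Until that changes, the workaround is presumably to create the target directory by hand before copying. A sketch using the FNAL srm client (endpoint, port and path are invented for illustration; DPM admins could equally use dpns-mkdir on the head node):

# create the destination directory explicitly against the SRMv2.2 endpoint
srmmkdir "srm://se.example.ac.uk:8446/srm/managerv2?SFN=/dpm/example.ac.uk/home/dteam/tests/run1"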
02 August 2007
Annoyed
I re-ran YAIM yesterday on the test DPM I've got at Edinburgh as it turned out we were not publishing the correct site name. Annoyingly, this completely broke information publishing as the BDII couldn't find the correct schema files (again). I had to re-create the symbolic link from /opt/glue/schemas/ldap to /opt/glue/schemas/openldap2.0 and then double check that all was well with the /opt/bdii/etc/schemas files. A restart of the BDII then sorted things out.
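For the record, the fix was roughly this (paths as on my SL4 test node; the init script name may differ slightly):

# re-create the schema symlink that the BDII expects
ln -sfn /opt/glue/schemas/openldap2.0 /opt/glue/schemas/ldap
# check the schema files referenced under /opt/bdii/etc, then restart the BDII
service bdii restart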
It's not really fair to blame YAIM here since I'm running the SL3 build of DPM on SL4, which isn't really supported. Well, I'm hoping that's the source of the trouble.
01 August 2007
Non-improvement of SAM tests
For a while I have been pushing for the creation of a SAM test that only probes the SRM and does not depend on any higher level services (like the LFC or BDII). This would be good as it would prevent sites being marked as unavailable when in fact their SRM is up and running.
Unfortunately, the SAM people have decided to postpone the creation of a pure-SRM test. I don't really understand their concerns. I thought using srmcp with a static (but nightly updated) list of SRM endpoints would have been sufficient. I guess they have some reservations about using the FNAL srmcp client, since it isn't lcg-utils/GFAL, which are the official storage access methods.
https://savannah.cern.ch/bugs/?25249
31 July 2007
Improvement to SAM replica management tests
The SAM people have added a new test that checks whether the default SE has any free space left. The test will not be critical by default, but if it fails the actual replica management tests will not execute at all. This is good news, as a full SE is (in my opinion) not really a site problem and not a reason for the site to fail a SAM test.
The full bug can be found here:
http://savannah.cern.ch/bugs/?26046
Greig
SAM failures, again
Looks like something changed inside SAM (again) yesterday, causing a large number of sites to fail the CE replica management tests with a "permission denied" error.
Further investigation shows that the failed tests were being run by someone with a DN from Cyfronet.
By default, this DN is mapped to the ops group in the grid map file, not opssgm like Judit Novak and Piotr Nyczyk. It is clear from the Glasgow DPM logs that this DN does not belong to ops/Role=lcgadmin. This then leads to failures in DPMs because the dpm/domain/home/ops/generated/ directories have ACLs on them which grant write permission only to people in ops/Role=lcgadmin.
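If you want to check what your own DPM is enforcing, the ACLs on those directories can be listed directly (a sketch; substitute your own domain and the date-stamped directory SAM is writing into):

# only ops/Role=lcgadmin should appear here with write access
dpns-getacl /dpm/example.ac.uk/home/ops/generated/2007-07-31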
Looks like things have been rectified now.
Why do we keep on getting hit by things like this?
27 July 2007
SRM and SRB interoperability - at last!
People have been talking for years about getting SRM and SRB "interoperable", mostly involving building complicated interfaces from SRX to SRY in various ways.
Now it turns out SRB has a GridFTP interface, developed by Argonne. So here's the idea: why don't we pretend the SRB is a Classic SE?
So we can now transfer files with gridftp (i.e. globus-url-copy) from dCache to SRB and vice versa, although the disadvantage is that you have to know the name of a pool node with a GridFTP door. Incidentally, if you try that, don't forget -nodcau or it won't work (for GridFTP 3rd party copying).
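For anyone wanting to try it, the transfer is just a third-party copy between the two GridFTP endpoints, something like this (hostnames and paths invented for illustration):

# third-party copy from a dCache pool node's GridFTP door to the SRB GridFTP interface;
# -nodcau disables data channel authentication, without which the copy fails
globus-url-copy -nodcau \
  gsiftp://pool1.example.ac.uk:2811/pnfs/example.ac.uk/data/dteam/file1 \
  gsiftp://srb.example.ac.uk:2811/home/dteam/file1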
But here's the brilliant thing: it also works with FTS, since FTS still supports Classic SEs. So we have successfully transferred data between dCache (the SRM) and SRB (acting as a, well, GridFTP server), and back again.
Cool, eh?
Next step is to set up a Classic SE-shaped information system for SRB and see if it works with lcg-utils and GFAL (because FTS does not depend on the SE having a GRIS).
This is work with Matt Hodges at Tier 1 who set up FTS, and with Roger Downing and Adil Hasan from STFC for the SRB.
--jens
20 July 2007
Slightly Modified DPM GIP Plugin
The last version of the DPM GIP plugin had a few minor bugs:
- The "--si" flag had been lost somewhere.
- DNS style VOs were not handled properly (e.g., supernemo.vo.eu-egee.org).
Now submitted as a patch in Savannah.
17 July 2007
Optimised DPM GIP Plugin
Lana Abadie noticed that my DPM GIP plugin was rather inefficient (table joins are expensive!), and sent a couple of options for improving the SQL query. I implemented them in the plugin and it sped up by a factor of 10.
I have produced a new RPM with the optimised query, which is available here: http://www.physics.gla.ac.uk/~graeme/scripts/packages/lcg-info-dynamic-dpm-2.2-1.noarch.rpm.
I am running this already at Glasgow and I would recommend it for anyone with a large DPM.
N.B. It is compatible with DPM 1.6.3 and 1.6.5, but remember to modify the /opt/lcg/var/gip/plugin/lcg-info-dynamic-se wrapper to run /opt/lcg/libexec/lcg-info-dynamic-dpm instead of /opt/lcg/libexec/lcg-info-dynamic-dpm-beta.
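The wrapper is only a couple of lines, so the change is trivial. Roughly (a sketch; the argument passed to the plugin may differ with your GIP configuration):

#!/bin/sh
# /opt/lcg/var/gip/plugin/lcg-info-dynamic-se -- point it at the new plugin
/opt/lcg/libexec/lcg-info-dynamic-dpm /opt/lcg/var/gip/ldif/static-file-SE.ldif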
SRM2.2 storage workshop
There was a storage workshop held at CERN on the 2nd and 3rd of July. The focus of discussions was on the SRM2.2 developments and testing of the endpoints. The majority of the endpoints are being published in the PPS, the intention being that the experiments will be able to use them in a ~production environment and allow some real stress tests to be run against them. The experiments see SRM2.2 as being an essential service for them, so hopefully they have sufficient manpower to run the tests...
Getting the software installed on the machines isn't a problem, but getting it configured can be tricky. The main point that I tried to highlight on a number of occasions was the necessity for sites to have really good documentation from both the developers (how the SRM2.2 spaces can be configured) and the experiments (how the SRM2.2 spaces should be configured for their needs). I will make sure that I provide instructions for everyone to ensure that the deployment goes (relatively) smoothly. It shouldn't be too much of a problem for DPM sites; dCache sites will need to start playing around with link groups ;-)
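For the dCache side, the SRM2.2 space reservations hang off link groups defined in the PoolManager. A very rough sketch of what that looks like in PoolManager.conf (names invented; check the commands against the dCache 1.8 documentation before use):

# collect the links that ATLAS space tokens may draw from into a link group
psu create linkGroup atlas-linkGroup
psu addto linkGroup atlas-linkGroup atlas-link
# declare which retention policies / access latencies the group can satisfy
psu set linkGroup custodialAllowed atlas-linkGroup false
psu set linkGroup replicaAllowed atlas-linkGroup true
psu set linkGroup onlineAllowed atlas-linkGroup true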
From mid-October, sites should be thinking of having these SRM2.2 spaces configured. The plan is that by January 2008, everyone will have this functionality available, and SRM2.2 will become the default interface.
DPM gridftp security
Apologies for not posting for a while; it's been a busy few weeks. First thing that should be mentioned is the gaping security hole that existed in the DPM gridftp server. Users of the uberftp (or some other suitable) client could log into the server and change permissions on anyone's files, move files to different areas of the DPM namespace, or even move files outside of the namespace altogether. Thanks to Kostas and Olivier at Imperial for spotting this. Unfortunately, it took a couple of weeks, 3 patch releases and a lot of testing within GridPP before we finally plugged the hole.
Initially, only a patched version of the 1.6.5 server was produced. I asked for the fix to be back-ported to 1.5.10, as there were a few sites still running that version which were unable to upgrade to the latest release (because of the upgrade problems and ongoing experiment tests) but wanted to be as secure as they could be. This was done, so thanks to the DPM team.
All sites should upgrade to the latest version of DPM and ensure that they are running patch -4 of the gridftp server.
26 June 2007
DPM 1.6.5-1 in PPS
v1.6.5-1 of DPM is now in pre-production. The relevant savannah page is here:
https://savannah.cern.ch/patch/index.php?1179
This release involves various bug fixes. What is interesting is that it will now be possible to set ACLs on DPM pools, rather than just limiting a pool to either a single VO or all VOs. This should make sites happy. The previous posting on this version of DPM mentioned that gridftpv2 would be used, but the release notes don't mention this, so we will have to wait and see.
Also out in PPS is the use of v1.3 of the GLUE schema. This is really good news since GLUE 1.3 will allow SEs to properly publish information about SRM2.2 storage spaces (e.g. Edinburgh has 3TB of ATLAS_AOD space).
https://savannah.cern.ch/patch/index.php?980
21 June 2007
DPM 1.5.10 -> 1.6.4 upgrade path broken in YAIM
As reported at yesterday's storage meeting, the upgrade path from DPM 1.5.10 to 1.6.4 is broken in YAIM 3.0.1-15. The different versions of DPM require database schema upgrades in order to be able to handle all of the SRM2.2 stuff (space reservation etc). YAIM should contain appropriate scripts to perform these upgrades, but it appears that the appropriate code has been removed, meaning that it is no longer possible to move from schema versions 2.[12].0 in v1.5.10 of DPM to schemas 3.[01].0 in v1.6.4. We stumbled upon this bug when I asked Cambridge to upgrade to the latest DPM in an attempt to resolve the intermittent SAM failures that they were experiencing. A fairly detailed report of what was required to solve the problem can be found in this ticket:
https://gus.fzk.de/pages/ticket_details.php?ticket=23569
It should be noted that, for some reason (a bug in a YAIM script?), the Cambridge DPM was missing two tables from the dpm_db database. These were dpm_fs and dpm_getfilereq (I think). This severely hindered the upgrade: the schema upgrade itself was successful, but the DPM then wouldn't start. A restore of the database backup, followed by an upgrade to DPM 1.6.3 and then on to 1.6.4, eventually got things working (I'm keeping a close eye on the SAM tests...). Sites should be aware that they may need to follow the steps detailed in this link while performing the database upgrade.
https://twiki.cern.ch/twiki/bin/view/LCG/DpmSrmv2Support
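Whatever route you take, dumping the databases before touching the schema makes any roll-back far less painful. A minimal sketch (assumes the default dpm_db and cns_db database names and a root MySQL account):

# back up both the DPM and DPNS databases before running any schema upgrade script
mysqldump -u root -p dpm_db > dpm_db-$(date +%Y%m%d).sql
mysqldump -u root -p cns_db > cns_db-$(date +%Y%m%d).sql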
After the installation, the srmv2.2 daemon was running and the SRM2.2 information was being published by the BDII. This is all good. If you end up using yaim 3.0.1-16, it should not be necessary to manually install the host certificates for the edguser.
In summary, the 1.5.10 to 1.6.4 upgrade was a lot of work. Thanks to Santanu for giving me access to the machine. This problem raises issues about sites keeping up to date with the latest releases of middleware. Although there were problems with the configuration of 1.6.4, v1.6.3 has been stable in production for a while now. I'm not really sure why some sites hadn't upgraded to that. It would be great if every site could publish the version of the middleware that they are using. In fact, such a feature may be coming very soon. Just watch this space.
08 June 2007
Anyone for a DPM filesystem?
Looks like someone at CERN is developing a mechanism to enable DPM servers to be mounted. This DPMfs could be used as a simple DPM browser, presenting the namespace in a more user-friendly form than the DM command line utilities. The DPM fs is implemented using the FUSE kernel module interface. The file system calls are forwarded to the daemon, which communicates with the DPM servers using the rfio and dpns API and sends back the answer to the kernel. It's in development and not officially supported:
https://twiki.cern.ch/twiki/bin/view/LCG/DPMfs