Showing posts with label GLUE. Show all posts

30 June 2014

Thank you for making a simple compliance test very happy

Rob and I had a look at the gstat tests for RAL's CASTOR. For a good while now we have had a number of errors/warnings raised. They did not affect production: so what are they?

Each error message has a bit of text associated with it, typically saying "something is incompatible with something else": for example, an "access control base rule" (ACBR) is incorrect, or the tape information published is not consistent with the type of Storage Element (SE). The ACBR error arises because legacy attributes are published alongside the modern ones; the tape error complains about CASTOR presenting itself as a tape store (via a particular SE).

So what is going on?  Well, the (only) way to find out is to locate the test script and find out what exactly it is querying. In this case, it is a python script running LDAP queries, and luckily it can be found in CERN's source code repositories. (How did we find it in this repository? Why, by using a search engine, of course.)

Ah, splendid, so by checking the Documentation™ (also known as "source code" to some), we discover that it needs all ACBRs to be "correct" (not just one for each area) and the legacy ones need an extra slash on the VO value, and an SE with no tape pools should call itself "disk" even if it sits on a tape store.
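The two rules can be sketched in a few lines of Python. This is only an illustration of the logic as described above, not the actual gstat code: the function names, the plain-string inputs and the toy rule predicate are all assumptions made for this example.

```python
def expected_se_type(tape_pool_count):
    """An SE with no tape pools should call itself "disk",
    even if the storage system behind it also runs tape."""
    return "tape" if tape_pool_count > 0 else "disk"


def failing_rules(rules, is_valid_rule):
    """Return every rule that fails validation.

    The test wants ALL rules to be correct, so one good rule
    per storage area is not enough: the list must come back empty.
    `is_valid_rule` is a hypothetical predicate supplied by the caller.
    """
    return [rule for rule in rules if not is_valid_rule(rule)]


# Toy predicate, purely for illustration (not the real ACBR syntax check).
def toy_is_valid(rule):
    return rule.startswith("VO:")


bad = failing_rules(["VO:atlas", "atlas"], toy_is_valid)
```

With the toy predicate, the second rule is reported even though the first one is fine, which is the "all, not any" behaviour that caught us out.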

So it's essentially test driven development: to make the final warnings go away, we need to read the code that validates the LDIF, and then engineer the LDIF until the validation errors go away.

19 September 2012

WAD

We had an interesting experience with the CASTOR upgrade to 2.1.12: the link between the storage area (SA) and the tape pool disappeared in the upgrade. In GLUE speak, the SA is a storage space of sorts, which may be shared between collaborators - we use it to publish dynamic usage data.

In CASTOR, we have used the "service class" as the SA; there is then a many-to-many link to disk pools and tape pools, something like this:

The dynamic data of each pool then gets shared accordingly between all the SvcClasses, which is (was) the Right Thing™.  Now that the second association link has gone away, we're wondering how to keep publishing data correctly in the short term - and the upgrade got postponed by a week amidst much scratching of heads.
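As a rough sketch of what "shared accordingly" could mean, the Python below splits each pool's usage equally between the service classes linked to it. The equal split, the function name and the pool and service class names are assumptions made for this example; the real information provider may well weight things differently.

```python
from collections import defaultdict


def share_pool_usage(pool_usage, links):
    """Share each pool's usage between the service classes linked to it.

    pool_usage: {pool_name: used_bytes}
    links: iterable of (service_class, pool_name) pairs (many-to-many)
    Assumes an equal split between the classes sharing a pool.
    """
    classes_per_pool = defaultdict(set)
    for svc_class, pool in links:
        classes_per_pool[pool].add(svc_class)

    shared = defaultdict(float)
    for pool, used in pool_usage.items():
        classes = classes_per_pool.get(pool)
        if not classes:
            continue  # no link, so nothing can be published for this pool
        for svc_class in classes:
            shared[svc_class] += used / len(classes)
    return dict(shared)


# Hypothetical names and numbers, for illustration only.
usage = {"diskpool1": 100, "tapepool1": 60}
links = [("svcA", "diskpool1"), ("svcB", "diskpool1"), ("svcA", "tapepool1")]
result = share_pool_usage(usage, links)
```

Dropping the ("svcA", "tapepool1") pair from the links shows the problem: the tape pool's usage simply stops being published anywhere.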

The information provider may just have enough information (in its config files) to restore the link, but it'd be a bit hairy to code - we're still working on that. It may just be better to rethink what the SA should be (which we will). We also tried a supermassive query which examined disk copies of files from tape pools to see which disk pools they were on, and then linked those with service classes. That was quite enlightening, as we discovered those disk copies were all over the place, not just where they were supposed to be...

In the interest of getting it working, we decided to just remember and adjust which data publishes where - meanwhile, we shall then rethink what the SA should be in the future.

18 July 2011

Storage accounting in OSG and OGF

Groups like UR are getting around to discussing storage records. OSG already create storage records: they have XML-formatted records for both the transfer and the file history. (With thanks to Steve Timm from FNAL.)
<StorageElementRecord xmlns:urwg="http://www.gridforum.org/2003/ur-wg">
<RecordIdentity urwg:createTime="2011-07-17T21:18:07Z" urwg:recordId="head01.aglt2.org:544527.26"/>
<UniqueID>AGLT2_SE:Pool:umfs18_3</UniqueID>
<MeasurementType>raw</MeasurementType>
<StorageType>disk</StorageType>
<TotalSpace>25993562993750</TotalSpace>
<FreeSpace>6130300894785</FreeSpace>
<UsedSpace>19863262098965</UsedSpace>
<Timestamp>2011-07-17T21:18:02Z</Timestamp>
<ProbeName>dcache-storage:head01.aglt2.org</ProbeName>
<SiteName>AGLT2_SE</SiteName>
<Grid>OSG</Grid>
</StorageElementRecord>
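A record like this is easy to consume. The short Python sketch below parses the sample above with the standard library and checks that used plus free space adds up to the total; the function name and the choice of fields to extract are ours, not part of any OSG tooling.

```python
import xml.etree.ElementTree as ET

URWG = "http://www.gridforum.org/2003/ur-wg"

RECORD = """<StorageElementRecord xmlns:urwg="http://www.gridforum.org/2003/ur-wg">
<RecordIdentity urwg:createTime="2011-07-17T21:18:07Z" urwg:recordId="head01.aglt2.org:544527.26"/>
<UniqueID>AGLT2_SE:Pool:umfs18_3</UniqueID>
<MeasurementType>raw</MeasurementType>
<StorageType>disk</StorageType>
<TotalSpace>25993562993750</TotalSpace>
<FreeSpace>6130300894785</FreeSpace>
<UsedSpace>19863262098965</UsedSpace>
<Timestamp>2011-07-17T21:18:02Z</Timestamp>
<ProbeName>dcache-storage:head01.aglt2.org</ProbeName>
<SiteName>AGLT2_SE</SiteName>
<Grid>OSG</Grid>
</StorageElementRecord>"""


def parse_record(xml_text):
    """Pull a few fields out of a StorageElementRecord."""
    root = ET.fromstring(xml_text)
    identity = root.find("RecordIdentity")
    return {
        # The attributes carry the urwg namespace; the elements do not.
        "record_id": identity.get(f"{{{URWG}}}recordId"),
        "unique_id": root.findtext("UniqueID"),
        "storage_type": root.findtext("StorageType"),
        "total": int(root.findtext("TotalSpace")),
        "free": int(root.findtext("FreeSpace")),
        "used": int(root.findtext("UsedSpace")),
    }


record = parse_record(RECORD)
consistent = record["used"] + record["free"] == record["total"]
```

For this sample the three space figures are consistent with each other, which is a cheap sanity check to run before doing any accounting with them.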
Over in GLUE-land, the GLUE group insist that using the GLUE schema to publish accounting data - and indeed using GLUE data for anything other than resource selection - "cannot be done." Unfortunately the chairs didn't make it to OGF, but next steps will include work on the XML rendering of GLUE 2.0, along with the implementations.
Meanwhile, back home in GridPP-land, we use GLUE 1.3 for dynamic data. The question is still mainly about the accuracy (and freshness) of the information published: how things like temporary copies on disk, or files being "deleted" from tape, should affect the published dynamic data. As we now have "accurate" tape accounting, the information provider should be updated soon.

26 June 2007

DPM 1.6.5-1 in PPS

v1.6.5-1 of DPM is now in pre-production. The relevant savannah page is here:

https://savannah.cern.ch/patch/index.php?1179

This release involves various bug fixes. What is interesting is that it will now be possible to set ACLs on DPM pools, rather than just limiting a pool to either a single VO or all VOs. This should make sites happy. The previous posting on this version of DPM mentioned that gridftpv2 would be used, but the release notes don't mention this, so we will have to wait and see.

Also out in PPS is the use of v1.3 of the GLUE schema. This is really good news since GLUE 1.3 will allow SEs to properly publish information about SRM2.2 storage spaces (e.g. Edinburgh has 3TB of ATLAS_AOD space).

https://savannah.cern.ch/patch/index.php?980