28 December 2015

There is no such thing as a "Solution"

As 2015 draws to a close, it is time to reflect on the past year and look ahead to the next. The GridPP infrastructure can clearly deal with data in the tens-to-hundreds of petabytes range, but the middleware is just that, "middle." Some work is always needed to adapt to each new community's requirements. The large LHC experiments can move away from the established codebase and specialise their infrastructure, which is both a good thing (the infrastructure is flexible) and a bad one (not all communities can roll their own infrastructure). Using open and interoperable standards helps.

There exists research which is not high energy physics (HEP). Researchers in these fields look at the HEP infrastructure - the global infrastructure that found the Higgs - and say "we don't work like that."

Yet there is a role for GridPP's storage and data management infrastructure and expertise - most research areas have data, and growing volumes of it. The expertise in moving, storing, and making data discoverable and available for processing is important regardless of which area of research the data belongs to. It doesn't matter that their analysis is different; their data problems are the same.

Big data volumes are also already found in astronomy, bioinformatics, Earth sciences, etc., and these communities have developed methods for data management, too. So 2016 will be an important year to continue the discussions with their infrastructure managers and developers - while every community is different and there is no such thing as a "solution," the more expertise and infrastructure we can share, the more easily we can manage the research data challenges of the future.

16 December 2015

DPM Workshop 2015

This year, the annual DPM Collaboration Workshop was hosted by the core DPM development team themselves, at CERN.

As usual, it was a two-day affair, opening with regional site reports from around the world - Italy, France, Australia and the UK reported.

One topic common to the site reports was the different approaches to configuring and maintaining DPM instances - France being a Quattor-dominant zone, with separately maintained configuration; Italy using Foreman-driven puppet (requiring some modification of the "standard" DPM puppet scripts); and Australia and the UK using the standard puppet local config or hand-configured systems.
Another common topic was the recent ATLAS-driven storage tasks - the decommissioning of the large amount of data in PRODDISK, and the push for new data dumps for consistency checking. People were generally rather unhappy about the PRODDISK task, especially how slow it was to actually delete all the data from the token using the rfio-based admin tools. The newer davix tools, which should be much faster, are unfortunately much less well known at present.
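
Incidentally, the same HTTP/WebDAV machinery that the davix tools use is also exposed through gfal2's Python bindings (gfal2's HTTP plugin is built on davix). A minimal sketch of clearing out files this way rather than via rfio - the endpoint and path below are purely illustrative:

    import gfal2

    # gfal2's HTTP plugin uses davix underneath, so deletes over davs://
    # take the same fast WebDAV path as the davix command-line tools
    ctx = gfal2.creat_context()
    base = "davs://dpm.example.ac.uk/dpm/example.ac.uk/home/atlas/proddisk"

    for entry in ctx.listdir(base):
        ctx.unlink(base + "/" + entry)  # one WebDAV DELETE per file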

(Several sites also called out Andrey Kirianov's dpm-dbck tool for deep database linting and consistency checking as being especially useful - and he had a talk later in the workshop describing his future release plans, incorporating feedback.)


The Core DPM updates followed, beginning with the news that the DPM (and thus also LFC) team is being brought back into the fold of CERN IT-DSS, where they'll join the EOS, Castor and FTS teams. The DPM devs were each at pains to indicate that this would have no visible effects in the "short term".
As for the future plans of the group, these focussed largely on the well-known topics of the last year: removing the dependence on rfio and SRM, and leveraging this freedom to improve other aspects of the system (for example, better support for multiple checksums on files).
Already, DPM 1.8.10 provides the "beta" versions of the new DPM Space Reporting functionality (which is intended to supplant the use of SRM for space reporting), although it is currently off by default (and is unable to track storage changes made via SRM). This functionality also broke the beta EGI Storage Accounting tools, until I submitted a patch to stop directory sizes being counted as well as the size of the individual files contained (essentially n-ple counting every file in the final tally, with n the directory depth of the file!).
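
To illustrate the multiple-counting with a toy model (not the actual accounting code): if directory entries in a dump carry the cumulative size of everything beneath them, then naively summing all entries counts each file once per ancestor directory, while summing only the files gives the right answer.

    # toy namespace dump: (path, type, size); directory "sizes" here are
    # cumulative, as in the new space reporting
    entries = [
        ("/atlas",         "dir",  300),
        ("/atlas/data",    "dir",  300),
        ("/atlas/data/f1", "file", 100),
        ("/atlas/data/f2", "file", 200),
    ]

    naive = sum(size for _, kind, size in entries)  # 900 - each file counted
                                                    # once per ancestor directory
    correct = sum(size for _, kind, size in entries if kind == "file")  # 300
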
GridFTP Redirection is another feature which the DPM team were keen to point to as a functioning enhancement in DPM 1.8.10. As with the Space Reporting, however, it comes with caveats: the enhanced performance gained from handing off from the head node's GridFTP service to the correct pool node's GridFTP as part of the handshake is only possible for clients which support the Delayed PASV operation mode. Neither gfal1 (ie lcg-cp) nor uberftp supports this, causing the new GridFTP to fall back to a much less efficient mechanism which actually has worse network characteristics than the old GridFTP. Additionally, as this mechanism depends on patching GridFTP itself, a new release is needed every time a new Globus release happens...

The focus of the DPM core team over the next year will be removing the rfio underpinnings of DPM, which are used for all of the low-level communication between the head and pool nodes, in favour of a new RESTful interface based on FastCGI. This builds on some exploratory work by Eric Cheung, and is also planned to bring additional enhancements for non-SRM DPMs (for example, fully supported space management with directory quotas).

I gave the UK report, and Bristol's Luke Kreszko opened the afternoon's talks with a great presentation on the experience of running a lightweight "dmlite"-style DPM on HDFS, without an SRM. While Luke had some bugs to report, he was keen to make clear that Andrea and the rest of the DPM core team had responded extremely quickly to each bug report and snag, and that the system was improving over time. (Outstanding issues include the lack of a way to throttle load on any particular service - also a wider concern of the UK - and a desire for more intelligent selection of servers for each request.)

There was also an interesting series of talks on the HTTP provision for Experiments - firstly from the perspective of the HTTP Deployment Task Force, which exists to manage and enable the deployment of HTTP/WebDAV interfaces at sites with the functionality needed to support the Experiments, and secondly the announcement of an initiative to attempt to run ATLAS work against a "pure" HTTP DPM. (This was a little short of the promise of the title, as data transfers were envisaged to occur over GridFTP, using the new GridFTP redirection, rather than HTTP.)

We also learned, from the Belle 2 Experiment, that they were currently rolling out their storage and data management infrastructure based entirely around DPM and the LFC as a file catalog!

There was also a talk on the CMS "performance testing" for DPM sites using the CMS AAA Xrootd Federation. I have to say, I wasn't entirely convinced by some of the "improvements" demonstrated by the graphs, as the "new" tests seemed to stop before approaching the kinds of load which were problematic for the "old" tests. However, it was an interesting insight into CMS's thinking about dealing with the fact that not all sites will meet their performance standards for AAA: the idea is that two Federations would exist (a "proper" AAA, and a "not as good" AAA), with sites shunted transparently from the "proper" one to the backup if their performance characteristics dropped sufficiently to compromise the federation.

The rest of the meeting was mostly feedback and live demos of the Puppet configuration and existing tools, which was useful and involving.

23 November 2015

First analysis of the file size distribution for vo.dirac.ac.uk at the RAL Tier 1

So the DiRAC virtual organisation has successfully moved ~4M files (95 TB) of data to the RAL Tier 1. But in what form?
Smallest file size (ignoring the 40k zero-size files): 2 B
Median file size: 450 kB
Modal file size (114k files out of 4M): 1.4 MB
Mean file size: 23.15 MB
Largest file size: 482 GB

Looking at the file size distribution is interesting.
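
For the curious, here is a minimal sketch of how such summary statistics can be computed, assuming a plain-text dump with one file size (in bytes) per line - the filename is hypothetical:

    from collections import Counter
    from statistics import mean, median

    with open("dirac_file_sizes.txt") as f:
        sizes = [int(line) for line in f if line.strip()]

    nonzero = [s for s in sizes if s > 0]  # ignore the zero-size files
    mode_size, mode_count = Counter(nonzero).most_common(1)[0]

    print("smallest: %d B" % min(nonzero))
    print("median:   %.0f kB" % (median(nonzero) / 1e3))
    print("mode:     %.1f MB (%d files)" % (mode_size / 1e6, mode_count))
    print("mean:     %.2f MB" % (mean(nonzero) / 1e6))
    print("largest:  %.0f GB" % (max(nonzero) / 1e9))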

The VO hopes to improve this by tarring and compressing the files; ideally we would like the files to be ~1 GB in size.
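
A rough sketch of the sort of bundling envisaged (not the VO's actual tooling; paths are illustrative) - packing files into ~1 GB compressed tarballs:

    import os
    import tarfile

    TARGET = 10**9  # aim for ~1 GB per bundle (before compression)

    def write_bundle(paths, name):
        with tarfile.open(name, "w:gz") as tar:
            for p in paths:
                tar.add(p)

    def bundle_files(paths, outdir):
        batch, total, n = [], 0, 0
        for path in paths:
            batch.append(path)
            total += os.path.getsize(path)
            if total >= TARGET:  # flush a full bundle
                write_bundle(batch, os.path.join(outdir, "bundle_%05d.tar.gz" % n))
                batch, total, n = [], 0, n + 1
        if batch:  # flush the remainder
            write_bundle(batch, os.path.join(outdir, "bundle_%05d.tar.gz" % n))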

14 October 2015

Dave's locations in Rucio

ATLAS have now fully moved over to their new data management system, Rucio, and have also been consolidating the locations of me and my offspring. Currently the situation is as follows:


I have 697 unique children, of which:
541 have no clones.
129 have 1 clone.
27 have 2 clones (such as me).
And 1 has 3 clones.
Compare this to early on in my life, when some datasets had over 25 clones!!!

There are 637 "rooms" spread over 136 "houses" in total. My progeny and I are in 70 rooms spread over 40 houses. The types of room are:
     22 LOCALGROUPDISK
     16 DATADISK
      6 PERF-MUONS
      5 PHYS-SM
      5 PERF-EGAMMA
      5 DATATAPE
      4 PHYS-SUSY
      2 PHYS-BEAUTY
      1 TZERO
      1 SCRATCHDISK
      1 PERF-JETS
      1 PERF-FLAVTAG
      1 DET-LARG

The types of children I have break down as follows:
607 Dirks, 13 Gavins and 79 Ursulas.
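
(For those wondering how such numbers are tallied, here is a minimal sketch using the standard Rucio Python client bindings - the scope and dataset name are purely illustrative, and of course I can't actually run code on myself.)

    from collections import Counter
    from rucio.client import Client

    client = Client()
    dids = [{"scope": "user.dave", "name": "daves_dataset"}]  # hypothetical DID

    # each replica entry maps a file to the "rooms" (RSEs) holding a copy
    clones_per_file = Counter()
    for rep in client.list_replicas(dids=dids):
        clones_per_file[rep["name"]] = len(rep["rses"]) - 1  # copies beyond the first

    for n_clones, n_files in sorted(Counter(clones_per_file.values()).items()):
        print("%d children have %d clone(s)" % (n_files, n_clones))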

These houses are very international, being spread over:
Canada
Czech Republic
France
Germany
Italy
Japan
Netherlands
Nordic Countries
Portugal
Russia
Spain
Switzerland
Taiwan
Turkey
UK
USA

25 September 2015

Milestone Passed: Half an Exabyte Moved Through FTS in the Last Two Years

Over the last two years, the FTS system (as used by the WLCG VOs and others) has moved ~0.5 EB of data - over a billion files. What is an EB? It is 1000 PBytes, or 10^6 TBytes, or 10^9 GBytes, or 10^12 MBytes, or 10^15 kBytes, or 10^18 Bytes. This can be seen from the monitoring page:
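
As a rough back-of-the-envelope check (my own arithmetic, not from the monitoring), 0.5 EB over two years corresponds to a sustained average rate of about 8 GB/s:

    bytes_moved = 0.5e18           # ~0.5 EB
    seconds = 2 * 365 * 24 * 3600  # two years

    rate = bytes_moved / seconds
    print("average rate: %.1f GB/s" % (rate / 1e9))  # ~7.9 GB/s, sustained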