29 December 2016

2016 - nice and boring?

We like GridPP services to be "nice and boring" (N&B): running smoothly, with uneventful upgrades.

One cannot accuse the year 2016 of having been N&B, with lots of interesting and extraordinary events, also in science. Computing and physics also had their share of the seemingly extraordinary number of celebrities we will miss, such as Tomlinson and Minsky in computing; and arguably both Kibble and Rubin should have won Nobel prizes in physics (or, to be precise, shared the prizes that were awarded).

In 2016 we continued to deliver services for LHC despite changing requirements, and also to support the much smaller non-LHC communities. With GridPP as a data e-infrastructure (and "data transfer zone"), we have also revisited connecting GridPP to other data infrastructures and will continue this work in 2017.

GridFTP continues to be the popular and efficient workhorse of data transfers; xroot is also popular but mainly in high energy physics. Tier 2 sites are set to become SRM-less. Accounting will need more work in 2017; hopefully we can do this with the GLUE group and the EGI accounting experts. GridPP also looks forward to contributing to the EPSRC-funded Pathfinder pilot project which should eventually enable connecting DiRAC, eMedLab, and GridPP. So, perhaps, not N&B either.

21 December 2016

Comparative Datamanagementology

GridPP was well represented at the cloud workshop at Crick. (The slides and official writeup still have not appeared as of this blog post.)

The general theme was hybrids, so it is natural to wonder whether it is useful to move data between infrastructures and what is the best way to do it. In the past we have connected, for example, NGS and GridPP (through SRB, using GridFTP), but there was not a strong need for it in the user community. However, with today's "UKT0" activities and more multidisciplinary e-infrastructures, perhaps the need for moving data across infrastructures will grow stronger.

xroot may be a bit too specialised, as it is almost exclusively used by HEP, but GridFTP is widely used by users of Globus and is the workhorse behind WAN transfers. (As an aside, we heard at the AARC meeting that Globus are pondering moving away from certificates towards a more OIDC-based approach, which would be new, as GridFTP has always required client certificate authentication.)

The big question is whether moving data between infrastructures is useful at all - will users make use of it? It is tempting to just upload the data to some remote storage service and share links to it with collaborators. Providing "approved" infrastructure for data sharing helps users avoid the pitfalls of inappropriate data management, but they still need tools to move the data efficiently, and to manage permissions. For example, EUDAT's B2ACCESS was specifically designed to move data into and out of EUDAT (as EUDAT does not offer compute).

So far we have focused on whether it is possible at all to move data between the infrastructures, the idea being to offer users the ability to do so. The next step is efficiency and performance, as we saw with DiRAC where we had to tar the files up in order to make the transfer of small files more efficient, and to preserve ownerships, permissions, and timestamps. 

16 December 2016

XRDCP and Checksums with DPM

To check whether a file transfer was successful, xrdcp is able to calculate a checksum at the destination. However, while this works well for plain xrootd installations, it does not currently work when xrootd is used together with DPM. The reason seems to be that xrootd runs only on the disk servers and does not use the redirector component, which would otherwise translate between logical and physical file names. With DPM, this translation is done within DPM itself.
(If the reason why it is not working at the moment is different, please add a comment.)

If a checksum is needed to verify a successful copy of the data, one way is to copy the file from the origin to the disk server first, then transfer it back to the origin server and calculate the checksum on what was transferred back. That always works, but is not very efficient since it can involve a lot of additional network traffic, especially at sites with a small number of storage servers but a large amount of compute resources, or when transferring files to distant sites. Some experiments implement this method as a fallback if xrdcp fails to give a checksum.
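The round-trip check described above can be sketched in a few lines of Python. This is only an illustration, not what any experiment actually runs: the function names are made up for this post, the remote URL is whatever your site uses, and the copy step is assumed to be a plain "xrdcp -f SOURCE DEST" call.

```python
import os
import subprocess
import tempfile
import zlib


def adler32_hex(path, chunk_size=1 << 20):
    """Return the adler32 checksum of a local file as 8 hex digits."""
    value = 1  # adler32 starts at 1
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            value = zlib.adler32(chunk, value)
    return format(value & 0xFFFFFFFF, "08x")


def xrdcp_copy(src, dst):
    """Copy with xrdcp (assumed to be in PATH); -f overwrites the target."""
    subprocess.run(["xrdcp", "-f", src, dst], check=True)


def round_trip_ok(local_path, remote_url, copy=xrdcp_copy):
    """Copy the file out and back, then compare checksums locally.

    The 'copy' argument is a hypothetical hook so that the transfer
    tool can be swapped out (or faked when testing).
    """
    expected = adler32_hex(local_path)
    copy(local_path, remote_url)
    with tempfile.TemporaryDirectory() as tmp:
        back = os.path.join(tmp, "roundtrip.copy")
        copy(remote_url, back)
        return adler32_hex(back) == expected
```

Note that the file crosses the network twice, which is exactly the inefficiency mentioned above.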

While the DPM developers work on a built-in solution for DPM, there is another method that can be used in the meantime to calculate the checksum without any additional network traffic.
Xrootd provides a very flexible configuration interface. What we can use here is the possibility to specify an external program to calculate the checksum. This can be any executable, including a shell script.
To do so, add the following option to the xrootd config file on the disk servers:

xrootd.chksum adler32 /PATH/TO/SCRIPT.sh

where "adler32" specifies the checksum algorithm to use and "/PATH/TO/SCRIPT.sh" is the full path of the script that calculates the checksum.
(Make sure the script is executable.)
Xrootd will automatically pass the logical file name as a parameter to the script.

In the script it is then possible to do the logical file name to physical file name lookup and calculate the checksum. To do so, the DPM tools need to be installed, at least on the DPM head node; they can be found in dpm-contrib-admintools when using the EPEL repository. The clients also need a way to contact the DPM head node to do the lookup.
An example script that can be adapted to your own configuration can be found here.
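Independently of the linked script, here is a rough sketch of what such a checksum program could look like, written in Python since any executable will do. Two things are assumptions rather than facts from this post: that xrootd reads the checksum as a hexadecimal string from the program's standard output, and the function names used here, which are made up. The logical-to-physical lookup is deliberately left as a placeholder, because it depends on the DPM tools and on how your site reaches the head node.

```python
#!/usr/bin/env python3
import sys
import zlib


def lfn_to_pfn(lfn):
    """Placeholder (hypothetical): translate a logical file name into the
    physical path on this disk server, e.g. by querying the DPM head node.
    This must be filled in for your own site."""
    raise NotImplementedError("site-specific DPM lookup goes here")


def adler32_hex(path, chunk_size=1 << 20):
    """Return the adler32 checksum of a local file as 8 hex digits."""
    value = 1  # adler32 starts at 1
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            value = zlib.adler32(chunk, value)
    return format(value & 0xFFFFFFFF, "08x")


def checksum_for_lfn(lfn, resolve=lfn_to_pfn):
    """Resolve the logical name, then checksum the physical file locally."""
    return adler32_hex(resolve(lfn))


# Only act when exactly one argument (the logical file name) is given.
if __name__ == "__main__" and len(sys.argv) == 2:
    print(checksum_for_lfn(sys.argv[1]))
```

Because the file is read locally on the disk server that holds it, no extra network traffic is needed beyond the lookup itself.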