16 December 2016

XRDCP and Checksums with DPM

To check whether a file transfer was successful, xrdcp can calculate a checksum at the destination. While this works well for plain xrootd installations, it currently does not work when xrootd is used together with DPM. The reason seems to be that in a DPM setup xrootd is only used on the disk servers and does not use its redirector component, which would otherwise translate between logical and physical file names. That translation is done within DPM instead.
(If the reason why it is not working at the moment is different, please add a comment.)

If a checksum is needed to verify a successful copy of the data, one way is to copy the file from the origin to the disk server, transfer it back to the origin server, and calculate the checksum on what was transferred back. That always works, but it is not very efficient because it can cause a lot of additional network traffic, especially at sites with a small number of storage servers but a large amount of compute resources, or when transferring files to distant sites. Some experiments implement this method as a fallback if xrdcp fails to give a checksum.

While the DPM developers work on a built-in solution for DPM, there is another method that can be used in the meantime to calculate the checksum without any additional network traffic.
Xrootd provides a very flexible configuration interface. What we can use here is the possibility to specify an external program that calculates the checksum. This can be any executable, including a shell script.
To do so, add the following option to the xrootd config file on the disk servers:

xrootd.chksum adler32 /PATH/TO/SCRIPT.sh


where "adler32" specifies the checksum algorithm to use and "/PATH/TO/SCRIPT.sh" is the full path of the script that calculates the checksum.
(Make sure the script is executable.)
Xrootd automatically passes the logical file name to the script as a parameter.

In the script it is then possible to look up the physical file name for the logical file name and calculate the checksum. For this to work, the DPM tools need to be installed at least on the DPM head node; when using the EPEL repository they can be found in dpm-contrib-admintools. The clients also need a way to contact the DPM head node to do the lookup.
An example script that can be adapted to your own configuration can be found here.
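Independently of that example script, the checksum half of such a program can be sketched in a few lines. The following is a minimal Python sketch, not the script referred to above. It assumes that xrootd calls the program with the file name as its only argument and reads the checksum as a hexadecimal string from standard output; please check this against the xrootd documentation for your version. The function resolve_physical_path is a hypothetical placeholder: the actual lookup of the physical file name depends on the DPM tools installed at your site and is not shown here.

```python
#!/usr/bin/env python3
"""Sketch of an external adler32 checksum program for xrootd.chksum."""
import sys
import zlib


def resolve_physical_path(logical_name):
    """Hypothetical placeholder for the logical-to-physical lookup.

    A real script would ask the DPM head node here. This sketch simply
    returns its argument unchanged.
    """
    return logical_name


def adler32_of_file(path, chunk_size=1024 * 1024):
    """Return the Adler-32 checksum of a file as 8 lowercase hex digits."""
    checksum = 1  # Adler-32 is defined to start from 1
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            # Feed the running value back in so large files are streamed
            checksum = zlib.adler32(chunk, checksum)
    return format(checksum & 0xFFFFFFFF, "08x")


def main(argv):
    physical_path = resolve_physical_path(argv[1])
    print(adler32_of_file(physical_path))
    return 0


# Only act when called with a file name, as xrootd is assumed to do
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

Reading the file in chunks keeps the memory use constant, which matters for the large files typically stored on disk servers. Since the file is read locally on the disk server, no additional network traffic is needed apart from the name lookup.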

