04 January 2016

Update on vo.dirac.ac.uk data movement and filesize distribution.

So... I should have known that the information I posted in the blog post in November of last year would soon be out of date, but I didn't think it would be this soon! DiRAC have successfully developed their system to tar and split their data samples before transferring them into the RAL Tier1. This system has dramatically increased the data transfer rates.
What has also changed is the number of files per tape, due to the change in average file size.

The number of files per tape started at 2-3 thousand, swelled to 2-3 million, and finally settled at 20-40 per tape (file size is now ~250-300 GB per file).
Moving large files requires good transfer rates, which we have been able to achieve, as can be seen in this log snippet:

Tue Dec 29 08:00:28 2015 INFO     bytes: 293193121792, avg KB/sec:286321, inst KB/sec:308224, elapsed:1001
Tue Dec 29 08:00:33 2015 INFO     bytes: 294824181760, avg KB/sec:286481, inst KB/sec:318566, elapsed:1006
Tue Dec 29 08:00:38 2015 INFO     bytes: 296458387456, avg KB/sec:286643, inst KB/sec:319180, elapsed:1011
Tue Dec 29 08:00:43 2015 INFO     bytes: 298053795840, avg KB/sec:286766, inst KB/sec:311603, elapsed:1016
Tue Dec 29 08:00:45 2015 INFO     bytes: 298822410240, avg KB/sec:286715, inst KB/sec:268071, elapsed:1018

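As a rough cross-check of the snippet above, here is a minimal Python sketch that parses the progress lines and recomputes the average rate from the byte count and elapsed time. The regular expression and the function names (`parse_progress`, `mean_rate_kb`) are my own for illustration and are not part of the transfer tool. It assumes that "KB" in the log means 1024 bytes, which is the interpretation that matches the logged averages most closely.

```python
import re

# The last two progress lines from the snippet above, copied verbatim.
LOG = """\
Tue Dec 29 08:00:43 2015 INFO     bytes: 298053795840, avg KB/sec:286766, inst KB/sec:311603, elapsed:1016
Tue Dec 29 08:00:45 2015 INFO     bytes: 298822410240, avg KB/sec:286715, inst KB/sec:268071, elapsed:1018
"""

# Hypothetical pattern written for this log layout only.
PATTERN = re.compile(
    r"bytes:\s*(\d+), avg KB/sec:\s*(\d+), inst KB/sec:\s*(\d+), elapsed:\s*(\d+)"
)

def parse_progress(text):
    """Return one dict per progress line found in text."""
    rows = []
    for line in text.splitlines():
        match = PATTERN.search(line)
        if match:
            nbytes, avg_kb, inst_kb, elapsed = map(int, match.groups())
            rows.append(
                {"bytes": nbytes, "avg_kb": avg_kb, "inst_kb": inst_kb, "elapsed": elapsed}
            )
    return rows

def mean_rate_kb(row, kb=1024):
    """Average rate in KB/sec recomputed from bytes and elapsed seconds.

    Assumption: 1 KB = 1024 bytes.
    """
    return row["bytes"] / kb / row["elapsed"]

rows = parse_progress(LOG)
last = rows[-1]
print(f"logged avg: {last['avg_kb']} KB/sec, recomputed: {mean_rate_kb(last):.0f} KB/sec")
```

For the final line this recomputes roughly 286,659 KB/sec against the logged 286,715 KB/sec. The small difference is consistent with the elapsed time being printed in whole seconds.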

Incidentally, the large file size also helps reduce the overall rate loss due to the setup and completion overhead of each individual transfer (an overhead of ~15 seconds for this file, which then took 1018 seconds to transfer). This has allowed us to transfer ~125 TB of data over the new year period.

The completion rate was ~90%.

However, the low number of transfers does not allow the FTS optimizer to change its settings so as to improve the throughput rate.

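To put the per-transfer overhead quoted earlier into numbers, here is a small sketch of the arithmetic. The 15 seconds of overhead and the 1018 seconds of transfer time come from this post. The 1 GB comparison file is purely hypothetical, and it assumes the same sustained rate would be reached, which a short transfer may well not manage.

```python
def transfer_efficiency(transfer_s, overhead_s):
    """Fraction of wall-clock time spent actually moving data."""
    return transfer_s / (transfer_s + overhead_s)

def effective_rate(size_bytes, rate_bytes_per_s, overhead_s):
    """Bytes per second once the fixed per-transfer overhead is included."""
    return size_bytes / (size_bytes / rate_bytes_per_s + overhead_s)

OVERHEAD_S = 15                # setup and completion overhead, from the post
RATE = 286715 * 1024           # logged average rate, assuming 1 KB = 1024 bytes

# The ~300 GB file from the log: 1018 s of transfer plus 15 s of overhead.
large = transfer_efficiency(1018, OVERHEAD_S)

# Hypothetical 1 GB file at the same rate, for comparison only.
small_transfer_s = 1e9 / RATE
small = transfer_efficiency(small_transfer_s, OVERHEAD_S)

print(f"large file efficiency: {large:.1%}")
print(f"hypothetical 1 GB file efficiency: {small:.1%}")
```

With these figures the large file spends about 98.5% of its time moving data, whereas the hypothetical 1 GB file would spend most of its time in overhead.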

Let's hope we can continue this rate. My next step is to look at the rate at which we can create the tarballs on the source host in preparation for transfer, and whether this technique can be applied at other source sites within vo.dirac.ac.uk.
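
For readers who have not seen the tar-and-split idea, here is a minimal Python sketch of the general technique. It is not DiRAC's actual system, whose details are not covered in this post. The function name `tar_and_split`, the `.partNNNN` naming, and the buffer size are all hypothetical choices for illustration.

```python
import tarfile
from pathlib import Path

def tar_and_split(src_dir, out_prefix, chunk_bytes, buf_bytes=1024 * 1024):
    """Tar src_dir, then cut the archive into pieces of at most chunk_bytes.

    Illustrative only: the archive is written to disk in full before being
    split, so this needs roughly twice the space of the final pieces.
    """
    src_dir = Path(src_dir)
    tar_path = Path(f"{out_prefix}.tar")

    # Step 1: bundle many small files into one uncompressed tar archive.
    with tarfile.open(tar_path, "w") as tar:
        tar.add(str(src_dir), arcname=src_dir.name)

    # Step 2: copy the archive out in fixed-size pieces.
    parts = []
    with open(tar_path, "rb") as src:
        index = 0
        while True:
            part_path = Path(f"{out_prefix}.tar.part{index:04d}")
            written = 0
            with open(part_path, "wb") as dst:
                while written < chunk_bytes:
                    data = src.read(min(buf_bytes, chunk_bytes - written))
                    if not data:
                        break
                    dst.write(data)
                    written += len(data)
            if written == 0:
                # Nothing left to copy: drop the empty piece and stop.
                part_path.unlink()
                break
            parts.append(part_path)
            index += 1

    tar_path.unlink()  # keep only the pieces
    return parts
```

Concatenating the pieces in order gives back the original tar archive. Timing a call like this on the source host would be one way to measure how quickly tarballs can be prepared, although a streaming approach that avoids the intermediate file would probably be preferable for data at this scale.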
