
08 March 2019

ATLAS Jamboree 2019: a view from the offsite perspective.

I didn't go in person to the ATLAS Jamboree held at CERN this year. For those with access, the agenda is at: https://indico.cern.ch/event/770307/

But I did join some of it via Vidyo!
Here are my musings about the talks I saw. (A shame I couldn't get involved in the coffee/dinner discussions, which are often the most fruitful moments of these meetings.)

Even before the main meeting started, there was an interesting talk regarding HPC data access in the US at ANL.


In particular, I liked the thoughts on Globus usage and on incorporating Rucio into DTNs at the sites, similar to what was discussed by other sites at the Rucio community workshop last week.

In the preview talk, I picked out that the switch from full simulation to fast simulation will increase the output rate by a factor of 10. A good reminder that changes in user workflow can drastically alter computing requirements.
From the main meeting, the following talks are of interest from a data-storage perspective:

Data Organization and Management Activities: Third Party Copy and (storage) Quality of Service
TPC: details on DPM
DOMA ACCESS: Caches 
DDM Ops overview
Diskless and lightweight sites: consolidation of storage
Data Carousel
Networking - best practice for sites, and evolution
WLCG evolution strategy
 
One thing it did do was cause me to wonder: what if (and I stress the "if" is my musing, not ATLAS's) ATLAS wanted to read 1PB of data a day from tape at RAL and then distribute it across the world?
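As a quick back-of-the-envelope check (my own arithmetic, not an ATLAS figure), 1PB/day corresponds to a sustained rate of roughly 11.6 GB/s, or about 93 Gb/s:

# Back-of-the-envelope: sustained rate needed to move 1 PB/day.
PB = 1e15                       # bytes (decimal petabyte)
seconds_per_day = 24 * 60 * 60  # 86400

rate_bytes = PB / seconds_per_day          # bytes per second
print(f"{rate_bytes / 1e9:.1f} GB/s")      # ~11.6 GB/s
print(f"{rate_bytes * 8 / 1e9:.1f} Gb/s")  # ~92.6 Gb/s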

 
 

07 November 2017

IPv6 slowly encroaching into data storage and transfers within the GridPP UK community for WLCG experiments.

It's the elephant in the room. We know IPv6 is around the corner. (Some would say it's three corners back and we have just been ignoring it.) However, within the UK, GridPP is making progress.
More and more sites are offering fully dual-stacked storage systems; indeed, some sites are already fully dual-stacked.
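As an aside, a quick way to check whether a given storage endpoint is dual-stacked is to ask DNS for both A and AAAA records. A minimal sketch (the hostname is a placeholder, not a real endpoint):

import socket

def dual_stack_check(host, port=443):
    """Report whether a host resolves to both IPv4 (A) and IPv6 (AAAA) addresses."""
    families = set()
    for family, _, _, _, sockaddr in socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP):
        families.add(family)
        label = "IPv6" if family == socket.AF_INET6 else "IPv4"
        print(f"{label}: {sockaddr[0]}")
    return socket.AF_INET in families and socket.AF_INET6 in families

# "se01.example.ac.uk" is a hypothetical endpoint name, for illustration only.
print("dual-stacked:", dual_stack_check("se01.example.ac.uk"))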

Other sites are slowly working IPv6 components into their current systems.

As a group we are looking at new gateway technologies to give IPv4 back ends IPv4/IPv6 front ends. Here I am thinking of the work the RALPP site has been doing in collaboration with CMS on xrootd proxy caching.


Even the FTS middleware deployed at the UK Tier1 is just about to be deployed dual-stacked. It is an interesting time for IPv6 within the UK, and not in the Chinese proverbial sense.

These are just some of the storage-related highlights and current activities for IPv6 integration. I leave other grid components as an exercise for the reader.

12 April 2017

Beware the Kraken! What happens when you start plotting transfer rates.

FTS transfers are how the WLCG moves a lot of its data. I decided to look at the instantaneous rates reported within the transfers. Lines in the transfer log files report both the average (avg KB/sec) and the instantaneous (inst KB/sec) rate, in the same format as the log snippet in the January 2016 post below.
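Here is a minimal sketch of how one might extract and histogram those instantaneous rates; the log location is an assumption, and the field format is taken from the snippet in the January 2016 post:

import re
from pathlib import Path

import matplotlib.pyplot as plt

# Matches the "inst KB/sec:<value>" field in FTS transfer log lines.
INST_RATE = re.compile(r"inst KB/sec:(\d+)")

rates = []
for logfile in Path("/var/log/fts3").glob("*.log"):  # hypothetical log location
    for line in logfile.read_text().splitlines():
        match = INST_RATE.search(line)
        if match:
            rates.append(int(match.group(1)))

# Frequency of each instantaneous rate value, as in the plots below.
plt.hist(rates, bins=500)
plt.xlabel("instantaneous rate (KB/sec)")
plt.ylabel("frequency")
plt.savefig("inst_rates.png")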

I decided to plot each value of the instantaneous rate against how often that value appeared. Plotting this for 2 of the 18 FTS servers at RAL over ~1 month of transfers gives:



This has been described as a Kraken, the hand of Freddy Krueger, a sea anemone, or a leafless tree's branches blowing in the wind. Please leave comments with your own suggestions!

I also decided to look at the subset of data for FTS transfers to the new CEPH storage at the RAL Tier1 and saw this:



My first thought was that it looks like Disney's Cinderella Castle.
https://www.pinterest.com/explore/disney-castle-silhouette/ :)

04 January 2016

Update on vo.dirac.ac.uk data movement and filesize distribution.

So... I should have known that the information I posted in the blog post in November of last year would soon be out of date, but I didn't think it would be this soon! DiRAC have successfully developed their system to tar and split their data samples before transferring them to the RAL Tier1. This system has dramatically increased the data transfer rates.
 What  has also changed is the number of files per tape  due to the change in average filesize per tape:
 
This has meant the number of files per tape varied from a starting value of 2-3 thousand per tape, swelling to 2-3 million, before finally settling on 20-40 per tape (filesize is now ~250-300GB per file).
Moving large files requires good transfer rates, which we have been able to achieve, as can be seen in this log snippet:

Tue Dec 29 08:00:28 2015 INFO     bytes: 293193121792, avg KB/sec:286321, inst KB/sec:308224, elapsed:1001
Tue Dec 29 08:00:33 2015 INFO     bytes: 294824181760, avg KB/sec:286481, inst KB/sec:318566, elapsed:1006
Tue Dec 29 08:00:38 2015 INFO     bytes: 296458387456, avg KB/sec:286643, inst KB/sec:319180, elapsed:1011
Tue Dec 29 08:00:43 2015 INFO     bytes: 298053795840, avg KB/sec:286766, inst KB/sec:311603, elapsed:1016
Tue Dec 29 08:00:45 2015 INFO     bytes: 298822410240, avg KB/sec:286715, inst KB/sec:268071, elapsed:1018


Incidentally, the large filesize also helps reduce the overall rate loss due to the per-transfer setup and completion overhead: an overhead of ~15 seconds for this file, which then took 1018 seconds to transfer (a quick calculation below quantifies this). This has allowed us to transfer ~125TB of data over the new year period:

And a completion rate of ~90%
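To quantify the overhead point above: with ~15 seconds of setup/completion per transfer, the fraction of the raw rate actually achieved depends strongly on file size. A quick sketch using the numbers from the log snippet (the smaller file is hypothetical, for contrast):

# Effective rate as a fraction of raw rate, given fixed per-transfer overhead.
OVERHEAD_S = 15  # setup + completion time per transfer (from the log above)

def efficiency(transfer_s, overhead_s=OVERHEAD_S):
    return transfer_s / (transfer_s + overhead_s)

print(f"293GB file: {efficiency(1018):.1%}")   # ~98.5% of raw rate
# A hypothetical file 100x smaller, at the same ~286MB/s average rate,
# would transfer in ~10s, so the 15s overhead would dominate:
print(f"2.9GB file: {efficiency(10.18):.1%}")  # ~40% of raw rate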

The low number of transfers does not, however, allow the FTS optimizer to change its settings to improve the throughput rate:


Let's hope we can continue at this rate. My next step is to look at the rate at which we can create the tarballs on the source host in preparation for transfer, and whether this technique can be applied at other source sites within vo.dirac.ac.uk.

25 September 2015

Milestone Passed with Last Two Years of Transfers Through FTS

Over the last two years, the FTS system (as used by the WLCG VOs and others) has moved ~0.5EB of data (over a billion files). What is an EB? Well, it's 1000 PBytes, or 10^6 TBytes, or 10^9 GBytes, or 10^12 MBytes, or 10^15 kBytes, or 10^18 Bytes. This can be seen from the monitoring page:
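Put another way (my own back-of-the-envelope arithmetic, not a figure from the monitoring page), 0.5EB over two years is a sustained average of roughly 8GB/s:

# Average rate implied by ~0.5 EB moved over two years.
EB = 1e18                             # bytes (decimal exabyte)
two_years_s = 2 * 365 * 24 * 60 * 60  # ~6.3e7 seconds

rate = 0.5 * EB / two_years_s
print(f"{rate / 1e9:.1f} GB/s")       # ~7.9 GB/s sustained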