23 November 2015

First Analysis of file size distribution for vo.dirac.ac.uk at RAL Tier1.

So the DiRAC virtual organization has successfully moved ~4M files (95TB) of data to the RAL Tier1. But in what form?
Smallest file size (ignoring the 40k zero-size files) is 2B
Median file size is 450kB
The modal average (114k files out of 4M) is 1.4MB
Mean file size is 23.15MB
Largest file size is 482GB
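
For anyone wanting to produce the same kind of summary for their own VO, here is a minimal sketch using only the Python standard library. It assumes you already have a list of file sizes in bytes (for example from a storage dump). The function name `summarise_sizes` is hypothetical, and the sketch excludes zero-size files from every statistic, which may not match how the figures above were computed. The mode here is the most common exact size, whereas a binned mode may be more useful for real data.

```python
from statistics import mean, median, multimode


def summarise_sizes(sizes):
    """Summarise a list of file sizes given in bytes.

    Assumption: zero-size files are counted separately and left out
    of all other statistics.
    """
    nonzero = [s for s in sizes if s > 0]
    return {
        "zero_size_files": len(sizes) - len(nonzero),
        "smallest": min(nonzero),
        "median": median(nonzero),
        # multimode returns every most-common value; take the first.
        "mode": multimode(nonzero)[0],
        "mean": mean(nonzero),
        "largest": max(nonzero),
    }
```

A large gap between the median and the mean, as in the numbers above, suggests that a small number of very large files dominate the total volume.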

Looking at the file size distribution is interesting.

The VO hope to improve this by "tar"ing and compressing the files (ideally we would like the files to be ~1GB in size).
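
As an illustration of what such bundling could look like, here is a rough sketch that greedily groups files into bundles of roughly 1GB and writes each bundle as a gzip-compressed tar archive. This is not the VO's actual procedure. The names `plan_bundles`, `write_bundle` and `TARGET_BYTES` are hypothetical, and the target is applied to the uncompressed sizes, so the resulting archives would normally come out smaller than 1GB.

```python
import tarfile

TARGET_BYTES = 10**9  # ~1GB target from the post; an assumption, adjust as needed


def plan_bundles(files, target=TARGET_BYTES):
    """Greedily group (path, size) pairs into bundles of at most `target` bytes.

    A single file larger than `target` ends up in a bundle of its own.
    """
    bundles, current, current_size = [], [], 0
    for path, size in files:
        # Start a new bundle if adding this file would exceed the target.
        if current and current_size + size > target:
            bundles.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        bundles.append(current)
    return bundles


def write_bundle(paths, archive_path):
    """Write the given files into one gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in paths:
            tar.add(path)
```

Keeping the planning step separate from the writing step means the grouping can be checked against a file listing before any data is touched.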