As with every activity when the end of the year is nigh, it is useful to look back. And forward.
It's been a good year for GridPP's storage and data management group (and indeed for GridPP itself), and in a sense it's easy to become a victim of our own success: researchers simply expect the infrastructure to be there. For example, some of the public reporting of the Higgs events seemed to gloss over how it was done, and over the fact that finding the 400 Higgs events, a needle in a very large haystack, was a global effort - no doubt to keep things simple... RAL currently holds about 10 PB of LHC data on tape and around 6 PB on disk. What we store is not the impressive bit, though - what we move and analyse is much more important. RAL routinely transfers 3 GB/s for a single experiment (usually ATLAS or CMS). QMUL alone reported having processed 24 PB over the past year. So we do "big data."
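To put those figures side by side, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 PB = 10^15 bytes), a 365-day year, and that the 3 GB/s rate is sustained around the clock, none of which is stated above; the function names are made up for illustration.

```python
# Back-of-the-envelope conversions for the figures quoted above.
# Assumptions (not from the post): decimal units, 365-day year,
# and a transfer rate that is sustained 24 hours a day.

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365

GB = 10**9
TB = 10**12
PB = 10**15


def volume_per_day_tb(rate_gb_per_s):
    """Data moved in one day, in TB, at a sustained rate given in GB/s."""
    return rate_gb_per_s * GB * SECONDS_PER_DAY / TB


def average_rate_gb_per_s(volume_pb_per_year):
    """Average rate in GB/s implied by a yearly volume given in PB."""
    seconds_per_year = DAYS_PER_YEAR * SECONDS_PER_DAY
    return volume_pb_per_year * PB / seconds_per_year / GB


def days_to_move(volume_pb, rate_gb_per_s):
    """Days needed to move a volume in PB at a sustained rate in GB/s."""
    return volume_pb * PB / (rate_gb_per_s * GB) / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"3 GB/s sustained: {volume_per_day_tb(3):.1f} TB per day")
    print(f"24 PB per year:   {average_rate_gb_per_s(24):.2f} GB/s on average")
    print(f"6 PB at 3 GB/s:   {days_to_move(6, 3):.1f} days")
```

Under those assumptions, 3 GB/s works out to roughly 259 TB a day, which would cover a volume the size of RAL's 6 PB disk holdings in a little over three weeks, and 24 PB in a year corresponds to an average of about 0.76 GB/s.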
In addition to providing a large part of WLCG, GridPP is also supporting non-LHC research. The catch, though, is that these researchers usually have to use the same grid middleware. While at first this seems like a hurdle, it is the way to tap into the large computing resources - many research case studies show how it's done.
So here's a "well done", sung to the otherwise relatively unsung heroes and heroines who keep the infrastructure running and available. Let's continue to do big data well - one of our challenges for the coming year will be to see how wider research can benefit even more - and maybe how we can get better at telling people about it!