As we near the end of 2015, it is time to reflect on the past year and look ahead to the next. The GridPP infrastructure can clearly deal with data volumes in the range of tens to hundreds of petabytes, but the middleware is just that, "middle." Some work is always needed to adapt to each new community's requirements. The large LHC experiments can move away from the established codebase and specialise their infrastructure, which is both a good thing (the infrastructure is flexible) and a bad one (not all communities can roll their own infrastructure). Using open and interoperable standards helps.
There exists research which is not high energy physics (HEP). These people look at the HEP infrastructure - the global infrastructure that found the Higgs - and say "we don't work like that."
Yet there is a role for GridPP's storage and data management infrastructure and expertise - most research areas have data, and growing volumes of it. The expertise in moving, storing, and making data discoverable and available for processing is important regardless of which area of research the data belongs to. It doesn't matter that their analysis is different; their data problems are the same.
Big data volumes are also already found in astronomy, bioinformatics, Earth sciences, etc., and these communities have developed methods for data management, too. So 2016 will be an important year to continue the discussions with their infrastructure managers and developers - while every community is different and there is no such thing as a "solution," the more expertise and infrastructure we can share, the more easily we can manage the research data challenges of the future.