20 May 2015

A view from a room at WLCG/CHEP 2015

It is very handy to have both CHEP 2015 and the WLCG 2015 workshop at the same venue, as I don't have to travel between them! Here are some thoughts I had from the meeting:
From WLCG:
Monitoring and security issues were my main takeaways from the first day of the WLCG workshop (looking forward to getting restricted CMS credentials so that I can see their monitoring pages).
The LHC VOs talked about current Run 2 improvements and plans for the HL-LHC.
Many new sites are supporting ALICE, and they plan to expand EOS usage....
ATLAS keep referring to T2 storage as custodial, but they know this is not what we normally mean by "custodial".
LHCb showed a nice slide of the processing workflow for data I/O: a 3 GB RAW file ends up also producing ~5 GB on disk and 5 GB on tape (they merge their data files).
The long-term future is that computer centres will possibly become solely data centres....?
OSG are changing CA, so all users will get a new DN. I can't help but think about ownership of all their old data: will it survive the change?

Interesting talk on hardware. Each component is really only made by 3-4 companies globally.... and our procurement is minuscule.


From CHEP:
Hopefully my posters went down well. My highlights/points of interest were:

RSEs within Rucio for ATLAS can be used to make sure you have more than one replica. Should be really useful for LOCALGROUPDISK, and it also allows for quotas.
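
As an aside, here is a minimal sketch of what asking for extra replicas might look like from the Rucio Python client, assuming its add_replication_rule call; the scope, dataset and RSE names are invented for illustration, so treat this as a sketch rather than a recipe.

```python
# Sketch: asking Rucio to keep extra replicas of a dataset. Assumes a working
# Rucio client configuration (rucio.cfg, valid proxy); scope, dataset and RSE
# names below are made up.
from rucio.client import Client

client = Client()

# Pin one copy on a (hypothetical) LOCALGROUPDISK endpoint...
client.add_replication_rule(
    dids=[{'scope': 'user.jdoe', 'name': 'user.jdoe.analysis.2015'}],
    copies=1,
    rse_expression='UKI-EXAMPLE_LOCALGROUPDISK',
)

# ...and keep two copies anywhere matching an RSE expression, so the loss of
# a single site does not lose the data. Per-account quotas on the RSEs limit
# how much a user can pin this way.
client.add_replication_rule(
    dids=[{'scope': 'user.jdoe', 'name': 'user.jdoe.analysis.2015'}],
    copies=2,
    rse_expression='type=DATADISK',
)
```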

Intensity Frontier plenary regarding computing at Fermilab for neutrino experiments with small numbers of staff (made me reminisce about the SAM system for data management...).

Data preservation talks for ATLAS and CDF/D0 were interesting.
CMS are prepared to use network circuits for data transfer, expected possibly by the end of Run 2, and definitely by Run 3.

Extension to the perfSONAR system to allow ad hoc, on-demand tests between sites (i.e. akin to refactoring the NDT/NPAD suites but without requiring the special Web100 kernel).

Interesting to see that the mean read/write rate at BNL for the ATLAS experiment is ~70 TB/yr per disk drive. I wonder what other sites' rates are....
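
For a sense of scale, a quick back-of-the-envelope conversion of that figure (my own arithmetic, not from the talk):

```python
# Back-of-the-envelope: ~70 TB/year per drive expressed as a sustained rate.
TB_PER_YEAR = 70
SECONDS_PER_YEAR = 365.25 * 24 * 3600

rate_mb_per_s = TB_PER_YEAR * 1e12 / SECONDS_PER_YEAR / 1e6
print(f"~{rate_mb_per_s:.1f} MB/s averaged over the year")  # ~2.2 MB/s

# That is only a few percent of what a single spinning disk can stream,
# so the interesting question is really how bursty the access pattern is.
```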

Some Posters of interest were:
A173 A191 A317 A339 B358 A359 B20 B214 B114 B254 B284 B292 B358 B362 B408 B441 

18 May 2015

Mind The Gap

One of the features of modern data science - whether from big instruments, lots of data sources, or somewhere else - is that researchers generally need to collaborate to be able to manage the data. No single institute is able to cope with everything. Thus, many researchers use e-Infrastructures (or cyberinfrastructures to our North American friends) to connect resources and institutes together, but also to enable further collaborations with other researchers.
The next problem then arises when you have two different infrastructures which were not built to talk to each other. Here's where interoperation and standards come in.

One of the things we have talked about for a while but never got round to doing was bridging the two national infrastructures for physics, GridPP and DiRAC (not to be confused with DIRAC, nor with DIRAC). Now we will be moving a few petabytes from the latter to the former, initially to back up the data. Which is tricky when there are no common identities, no common data transfer protocols, and no common data (replica) catalogues, accounting information, metadata catalogues, etc.

So we're going to bridge the gap, hopefully without too much effort on either side, initially by making DiRAC sites look like a Tier-2-(very-)lite, with essentially only a GridFTP endpoint and a VO for DiRAC. We will then start to move data across with FTS and see what happens. (Using the analogy above, we are bringing the ends closer to each other rather than increasing the voltage :-))
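
For the curious, here is a minimal sketch of what "move data across with FTS" could look like using the FTS3 REST "easy" Python bindings; the FTS server, site hostnames and file URLs are invented placeholders, and a valid X.509 proxy for the DiRAC VO is assumed.

```python
# Sketch of a single-file FTS3 transfer submission via the REST "easy" bindings.
# All hostnames and paths below are placeholders, not real endpoints.
import fts3.rest.client.easy as fts3

# Context against a (hypothetical) FTS3 server; authentication comes from the
# X.509 proxy in the environment.
context = fts3.Context('https://fts3.example.ac.uk:8446')

transfer = fts3.new_transfer(
    'gsiftp://dirac-site.example.ac.uk/data/raw/file001',              # DiRAC GridFTP source
    'srm://t2-se.example.ac.uk/dpm/example.ac.uk/home/dirac/file001',  # GridPP Tier-2 destination
)

job = fts3.new_job([transfer], verify_checksum=True, retry=3)
job_id = fts3.submit(context, job)
print('Submitted FTS job', job_id)
```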

11 May 2015

Notes from JANET/JISC Networkshop 43

These are notes from JANET's (now JISC's) Networkshop, the 43rd, as seen from the GridPP perspective. The workshop took place 31 March to 2 April, but this post should be timed to appear after the election.

"Big data" started and closed the workshop; the start being Prof Chris Lintott, BBC Sky at Night, er, superstar, talking about Galaxy Zoo: there are too many galaxies out there, and machines can achieve only 85% accuracy in the classification. Core contributors are the kind of people who read science articles in the news, and they contribute because they want to help out. Zooniverse is similar to the grid in a few respects: a single registration lets you contribute to multiple projects (your correspondent asked about using social media to register people, so people could talk about their contributions on social media), and they have unified support for projects (what we would call VOs)

At the other end, a presentation from the Met Office, where machines are achieving high accuracy thanks to investments in the computing and data infrastructure - and of course in the people who develop and program the models, some of whom have spent decades at the Met Office developing them. While our stuff tends to be more parallel, high-throughput processing of events, the Met Office's climate and weather work is more about supercomputing. The similarities are more in the data area, where managing and processing increasing volumes is essential. This is also where the Networkshop comes in: support for accessing and moving large volumes of science data. They are also using STFC's JASMIN/CEMS. In fact JASMIN (in a separate presentation) are using similar networkological tools, such as perfSONAR and fasterdata.

Sandwiched in between was loads of great stuff:

  • HP are using SDN also for security purposes. Would be useful to understand. Or interesting. Or both.
  • A product called "Nutanix" delivering software-defined storage for clouds - basically the storage is managed on what we would call worker nodes, with a VM dedicated to managing the storage; it replicates blocks across the cluster and locally uses SSDs as cache.
  • IPv6 was discussed, with our very own Dave Kelsey presenting.
  • From coffee-break discussions with people, WLCG is ahead of the curve in being increasingly network-centric. The experiment models are still very controlled, but networks are used a lot to move and access data.
  • A fair few moved-stuff-to-the-cloud reports. JANET's (excuse me, JISC's) agreements with Azure and AWS were considered helpful.
  • Similarly, JISC's data centre offers hosting. A different use from ours, but I wonder if we should look into moving data between our data centres and theirs? Sometimes it is useful to support users, e.g. users of GO or FTS, by testing out data transfers between sites, e.g. when the data centres need to run specific endpoints like Globus Connect, SRM, GridFTP, etc.
  • Lots of identity management stuff, which was the main reason your correspondent was there. Also for AARC and EUDAT (more on that later).
  • And of course talking to people to find out what they're doing and see if we can usefully do stuff together.
Speaking of sandwiched, we were certainly also made welcome at Exeter, with the local staff welcoming us, colour-coded (= orange) students supporting us, and lots of great food, including of course pasties.