28 March 2015

EUDAT and GridPP

EUDAT2020 (the H2020 follow-up project to EUDAT) has just finished its kick-off meeting at CSC. It might be useful to jot down a few thoughts on the similarities and differences between EUDAT and GridPP before it is too late.

Both EUDAT and GridPP are - as far as this blog is concerned - data e- (or cyber-) infrastructures. In both, the infrastructure is distributed across sites; sites provide storage capacity or users; there is a common authentication and authorisation scheme; there are data discovery mechanisms; and both use GOCDB for service availability.

  • EUDAT will be using CDMI as its storage interface - just like EGI does - and CDMI is in many ways fairly SRM-like. We have previously done work comparing the two.
  • EUDAT will also be doing HTTP "federations" (i.e. automatic failover when a replica is missing; this is confusingly referred to as "federation" by some people).
  • Interoperation with EGI is useful/possible/thought-about (delete as applicable). EUDAT's B2STAGE will be interfacing to EGI - there is already a mailing list for discussions.
  • GridPP's (or WLCG's) metadata management is probably a bit too confusing at the moment, since there is no single file catalogue.
  • B2ACCESS is the authentication and authorisation infrastructure in EUDAT; it could interoperate with GridPP via SARoNGS (ask us at OGF44, where we will also look at AARC's relation to GridPP and EUDAT). Jos tells us that KIT also have a SARoNGS-type service.
  • Referencing a file is done with a persistent identifier, rather like the LFN (Logical Filename) GridPP used to have.
  • "Easy" access via WebDAV is an option for both projects. GlobusOnline is an option (sometimes) for both projects. In fact, B2STAGE is currently using GO, but will also be using FTS.
Using FTS is particularly interesting because it should then be possible to transfer files between EUDAT and GridPP.

The differences between the projects are mainly that
  • GridPP is more mature - it has had 14-15 years now to build its infrastructure; EUDAT is of course a much younger project (but then again, EUDAT is not exactly starting from scratch).
  • EUDAT is doing more "dynamic data", where the data might change later. It is also looking at more support for the data lifecycle.
  • EUDAT and GridPP have distinct user communities, to a first approximation at least.
  • The middleware is different; GridPP does of course offer compute, whereas EUDAT will offer simpler server-side workflows. GridPP services are more integrated, whereas in EUDAT the B2 services are more separated (but will be unified by the discovery/lookup service and by B2ACCESS).
  • Authorisation mechanisms will be very different (but might hopefully interface to each other; there are plans for this in B2ACCESS).
There is some overlap between data sites in WLCG and those in EUDAT. This could lead to some interesting collaborations and cross-pollinations. Come to OGF44 and the EGI conference and talk to us about it.
