16-03-15 GRSF overall architecture

Meeting Notes:
Topics: Overall GRSF applications architecture; identification of the main software components and players
Participants: Pasquale Pagano (CNR), Yannis Marketakis (FORTH), Nikos Minadakis (FORTH), Anton Ellenbroek (FAO), Aureliano Gentile (FAO, Moderator)

Notes
On Software license, software architecture
PP: we need to define the components, how to operate the components in the infrastructure, also after the project. The import mechanism of the sources is also critical.
Yannis/Nikos: It also depends on the type of source, i.e. dump, web service, API, etc. MatWare is what we agreed to use to build the knowledge base, but it is difficult to host MatWare in the infrastructure because of its license. Anyone can use MatWare; the point is that we cannot share the code, since it is protected by the license. The binary can be provided under a written agreement (e.g. between CNR and FORTH). We'll check with our legal office.
PP: We need to describe how data are ingested, processed, stored and disseminated, the components involved, and the integration and maintenance scenarios. We need this in the Administration wiki.

YM: We have several scenarios, e.g. in the case of RAM, data are extracted from an MS Access DB as CSV and then processed in MatWare.
PP: The Java code would run in the Executor of the infrastructure to get new data. This type of process needs to run autonomously, also after the project.
Aureliano: We can think of automatic updates every one/three months and/or upon GRSF admin request, e.g. when one of the data providers has produced a considerable number of new records.
Yannis/Nikos: MatWare can work as an API and can be added to the VRE as a single component (a black box) together with its configuration files. Virtuoso is an external component.
PP: With MatWare installed in the VRE, the fetcher component takes data from external sources and is connected as a plugin (SPARQL, HTTP, JDBC, ...).
PP: A dashboard for administrators is needed to run the job on demand.
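A minimal sketch of the fetcher-as-plugin idea, for illustration only: it is not MatWare's API. The names `register_fetcher`, `fetch` and `fetch_csv` are hypothetical, and a CSV plugin (the RAM scenario) stands in for the SPARQL, HTTP and JDBC plugins mentioned in the call.

```python
import csv
import io
from typing import Callable, Dict, List

# Hypothetical registry: protocol name -> function returning a list of records.
Fetcher = Callable[[str], List[dict]]
_FETCHERS: Dict[str, Fetcher] = {}


def register_fetcher(protocol: str):
    """Decorator that registers a fetcher plugin under a protocol name."""
    def wrap(func: Fetcher) -> Fetcher:
        _FETCHERS[protocol] = func
        return func
    return wrap


@register_fetcher("csv")
def fetch_csv(payload: str) -> List[dict]:
    # Stand-in plugin: parses CSV text passed in directly instead of
    # reading a real export, so the example has no external dependencies.
    return [dict(row) for row in csv.DictReader(io.StringIO(payload))]


def fetch(protocol: str, location: str) -> List[dict]:
    """Dispatch to the plugin registered for the given protocol."""
    if protocol not in _FETCHERS:
        raise ValueError(f"no fetcher plugin registered for {protocol!r}")
    return _FETCHERS[protocol](location)
```

A SPARQL, HTTP or JDBC plugin would be registered the same way, so the component that calls `fetch` does not need to know how each source is accessed.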
Anton: It is important to specify the type of import (part of it, full replacement, etc.), and to get a report of the work done (monitoring).
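One way to picture the import types and the monitoring report is the sketch below. It assumes records from a single source are held in a dictionary keyed by record ID; `ImportMode`, `ImportReport` and `run_import` are hypothetical names, not existing GRSF or MatWare components.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class ImportMode(Enum):
    PARTIAL = "partial"          # merge incoming records into the existing ones
    FULL_REPLACEMENT = "full"    # existing records absent from the import are dropped


@dataclass
class ImportReport:
    source: str
    mode: ImportMode
    added: int = 0
    updated: int = 0
    removed: int = 0


def run_import(store: Dict[str, dict], incoming: Dict[str, dict],
               source: str, mode: ImportMode) -> ImportReport:
    """Apply an import to `store` in place and report what changed."""
    report = ImportReport(source=source, mode=mode)
    if mode is ImportMode.FULL_REPLACEMENT:
        for key in [k for k in store if k not in incoming]:
            del store[key]
            report.removed += 1
    for key, record in incoming.items():
        if key not in store:
            report.added += 1
        elif store[key] != record:
            report.updated += 1
        store[key] = record
    return report
```

The returned report is what an administrator dashboard could display after each run.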
Anton: The key issue regarding software is to ensure that an agreement will be in place for full utilization after the end of the project.
Yannis/Nikos: We are already doing this with other institutions, e.g. the British Museum, and we regularly provide support.
Anton: We need to have the scenario documented in D5.1 - VRE plan and plan for the software in the infrastructure.

On CMS requirements
Aureliano: The overall idea is that, from the input sources (at the moment FIRMS, FishSource, RAM), the GRSF produces outputs in the form of stocks and fisheries records. (See also image below)
PP: MatWare assembles the data, then one or more admins will have the rights to approve and publish the list of records. Every new record is generated with an identifier, which can be flagged for traceability purposes when appropriate. The CMS may provide aggregators (e.g. by species) to facilitate the approval workflow.
Anton: Do we need a fourth 'source', or are the mapping artefacts the GRSF? The competency queries then provide the list of stocks and fisheries. With properly managed mapping artefacts we can automate most of the GRSF workflow. Only in case of uncertainty does the application provide a list to the admin for revision and approval.
Anton: A mapping would contain source URLs to one or more source records, and if we add publication date, expiry date, editors, etc., we can control dissemination. FAO needs to work with data providers on the minimal content of a GRSF mapping tool.
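As a rough illustration of this approval workflow, the sketch below gives each candidate record an identifier on creation, groups records by species, and lets an admin approve a group. All names are hypothetical, and the use of a random UUID is an assumption; the identifier scheme was not decided in the call.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    ON_HOLD = "on_hold"


@dataclass
class CandidateRecord:
    kind: str       # "stock" or "fishery"
    species: str
    status: Status = Status.PENDING
    # Assumption: a random UUID stands in for the real GRSF identifier.
    identifier: str = field(default_factory=lambda: str(uuid.uuid4()))


def group_by_species(records: List[CandidateRecord]) -> Dict[str, List[CandidateRecord]]:
    """Aggregate records so an admin can review one species at a time."""
    groups: Dict[str, List[CandidateRecord]] = {}
    for record in records:
        groups.setdefault(record.species, []).append(record)
    return groups


def approve_all(records: List[CandidateRecord]) -> None:
    """Approve every record in the group that is still pending."""
    for record in records:
        if record.status is Status.PENDING:
            record.status = Status.APPROVED
```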
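A minimal sketch of such a mapping artefact, assuming only the fields named in the discussion (source URLs, publication date, expiry date, editors). The class name `Mapping`, the method `is_disseminable` and the example URL are hypothetical; the actual minimal content is still to be agreed with the data providers.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Mapping:
    source_urls: List[str]                 # one or more source records
    publication_date: date
    expiry_date: Optional[date] = None     # None means no expiry
    editors: List[str] = field(default_factory=list)

    def is_disseminable(self, on: date) -> bool:
        """True if the mapping is published and not yet expired on `on`."""
        if on < self.publication_date:
            return False
        return self.expiry_date is None or on <= self.expiry_date


# Usage with made-up values:
m = Mapping(
    source_urls=["https://example.org/source/record/1"],
    publication_date=date(2016, 4, 1),
    expiry_date=date(2017, 4, 1),
)
visible_now = m.is_disseminable(date(2016, 6, 1))
```

Filtering on these dates at publication time is one way the dissemination could be controlled without editing the source records.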

On Effort
PP: FORTH needs to analyse the requirements and indicate the possible effort and feasibility.

Summary of key Points

  • Clarify the license issue
  • Deployment architecture, link to documentation, how these applications can be run by external people
  • Feasibility of functional requirements
  • VRE plan

Follow-up actions

  • CNR: to initiate SLA agreement with FORTH on behalf of iMarine for the use of the MatWare software in the context of the GRSF
  • FORTH: to consult its legal office
  • FORTH: to provide a deployment architecture beside the logical architecture and add this to the admin guide
  • FORTH: to further analyse the requirements discussed during the call, make an effort estimate and indicate the feasibility of the basic components indicated: a dashboard for administrators (to launch and control the harvesting steps), the CMS to approve/reject/hold the mappings, and the generation of unique IDs for the new records identified
  • FAO: to launch a new call to continue the discussion

GRSF software components

MatWare Architecture - FORTH March 2016

GRSF overall architecture - BlueBRIDGE TWG March 2016
