Task #24817

open

Task #24611: Create the new GRSF Publisher service

Task #25008: Requested fields changes

Change the content of 'database_sources'

Added by Luca Frosini about 2 years ago. Updated about 2 years ago.

Status: Paused
Priority: Normal
Target version:
Start date: Mar 21, 2023
Due date:
% Done: 80%
Estimated time:

Description

Stock and Fishery records contain the field database_sources.

As an example, please consider the following Stock Record (Assessment Unit) from GRSF with GRSF UUID cb468fb0-fed5-370f-9bc6-08fa99c54f3c.

https://data.d4science.org/ctlg/GRSF_Admin/cb468fb0-fed5-370f-9bc6-08fa99c54f3c

Please find attached the JSON submitted by the KB in June 2021 (it may not be the latest version, but that does not matter for presenting the situation).


"database_sources" : [ {
    "name" : "FishSource",
    "description" : "FishSource is an online information resource about the status of stocks and fisheries, that compiles and summarizes all the information that is needed from analysts to evaluate the sustainability.",
    "url" : "http://www.fishsource.com"
  }, {
    "name" : "FIRMS",
    "description" : "Fisheries and Resources Monitoring System aims to provide access to a wide range of high-quality information on the global monitoring and management of fishery marine resources",
    "url" : "http://firms.fao.org/firms/en"
  }, {
    "name" : "RAM",
    "description" : "RAM Legacy Stock Assessment Database is a compilation of stock assessment results for commercially exploited marine populations from around the world.",
    "url" : "http://ramlegacy.org"
  } ]

Currently grsf-publisher-ws service uses database_sources to:

  • Generate a simple field with the content "RAM FishSource FIRMS".
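As a rough illustration of the behaviour described above, the following Python sketch joins the name of each database_sources element into a single space-separated value. The helper name database_source_field is hypothetical and this is not the grsf-publisher-ws code; the order in which the real service emits the names is not stated in this ticket, so the sketch simply keeps the input order.

```python
import json

# Shortened copy of the example above (descriptions omitted for brevity).
record = json.loads("""
{
  "database_sources": [
    {"name": "FishSource", "url": "http://www.fishsource.com"},
    {"name": "FIRMS", "url": "http://firms.fao.org/firms/en"},
    {"name": "RAM", "url": "http://ramlegacy.org"}
  ]
}
""")


def database_source_field(record):
    """Hypothetical helper: join the source names into one string."""
    sources = record.get("database_sources", [])
    return " ".join(source["name"] for source in sources)


field_value = database_source_field(record)
print("Database Source:", field_value)
```

With this approach the description and url of each element play no part in the generated field.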

I'm wondering whether we could simplify the process and have something like:


"database_sources" : "RAM FishSource FIRMS"

Please note that, if we agree on this, the change must be made when the new service becomes available, not in the current implementation.


Files

cb468fb0-fed5-370f-9bc6-08fa99c54f3c.json (33.4 KB), Luca Frosini, Mar 21, 2023 04:02 PM
clipboard-202303221129-va37k.png (8.28 KB), Yannis Marketakis, Mar 22, 2023 10:29 AM
clipboard-202303221130-8ms3t.png (20.8 KB), Yannis Marketakis, Mar 22, 2023 10:30 AM
Database Source Field.png (6.43 KB), Luca Frosini, Mar 22, 2023 11:32 AM
resources_from_refers_to.png (19.8 KB), Luca Frosini, Mar 22, 2023 11:33 AM
clipboard-202303221251-exrza.png (26.8 KB), Yannis Marketakis, Mar 22, 2023 11:51 AM

Related issues

Related to StocksAndFisheriesKB - Feature #24604: Fix 'Database Source' field (New, Luca Frosini, Feb 16, 2023)

Actions #1

Updated by Luca Frosini about 2 years ago

Actions #2

Updated by Luca Frosini about 2 years ago

@marketak@ics.forth.gr I just remembered that @aureliano.gentile@fao.org raised some requirements with me during the meeting in Pisa; see #24604.

To satisfy both requirements, we could have the following JSON

"database_sources" : [ "Fisheries and Resources Monitoring System (FIRMS)", "FAO SDG 14.4.1 Questionnaire" ]

Moreover, I would generate the Database Source metadata multiple times (once for each element of the array).

Actions #3

Updated by Yannis Marketakis about 2 years ago

We could do that, but I see that the extra information in the JSON object for database sources (i.e. description and URL) is indeed used in both the GRSF Admin VRE and the GRSF Public VRE.

More specifically in GRSF VRE the URL and description accompany the database source (for example FIRMS in the case of https://data.d4science.org/ctlg/GRSF/d9e59e52-4440-39a1-88c1-71d8f1f246d7)

The same is shown in GRSF Admin (at least for legacy records). Check https://data.d4science.org/ctlg/GRSF_Admin/9a236015-1fee-3d08-ad48-25782e49032d for example

So overall, I suspect that if we change this, all these resources will be gone.

To avoid lookups on your side, I could provide you with a list of database source names, complementary to the database source objects, in an additional field (e.g. database_source_names). What do you think?

Actions #4

Updated by Luca Frosini about 2 years ago

Yannis Marketakis wrote in #note-3:

We could do that, but I see that the extra information in the JSON object for database sources (i.e. description and URL) is indeed used in both the GRSF Admin VRE and the GRSF Public VRE.

More specifically in GRSF VRE the URL and description accompany the database source (for example FIRMS in the case of https://data.d4science.org/ctlg/GRSF/d9e59e52-4440-39a1-88c1-71d8f1f246d7)

The same is shown in GRSF Admin (at least for legacy records). Check https://data.d4science.org/ctlg/GRSF_Admin/9a236015-1fee-3d08-ad48-25782e49032d for example

So overall, I suspect that if we change this, all these resources will be gone.

To avoid lookups on your side, I could provide you with a list of database source names, complementary to the database source objects, in an additional field (e.g. database_source_names). What do you think?

database_sources is just used to create the field Database Source

e.g.

Database Source: RAM FishSource FIRMS

Instead, the resources are generated from other properties, e.g. refers_to

See https://data.d4science.org/ctlg/GRSF_Admin/cb468fb0-fed5-370f-9bc6-08fa99c54f3c

Instead, the resource named http://firms.fao.org/firms/resource/13728/en in the screenshot you provided comes from source_of_information

"source_of_information" : [ {
    "name" : "http://firms.fao.org/firms/resource/13728/en",
    "description" : "",
    "url" : "http://firms.fao.org/firms/resource/13728/en"
  } ]
Actions #5

Updated by Yannis Marketakis about 2 years ago

I have added the image to point you to the resource of FIRMS, not to the FIRMS record (which clearly comes from refers_to , no objection about that).
See the resource in the red box. The description of the database source clearly comes from

"database_sources" : [ {
    "name" : "FIRMS",
    "description" : "Fisheries and Resources Monitoring System aims to provide access to a wide range of high-quality information on the global monitoring and management of fishery marine resources",
    "url" : "http://firms.fao.org/firms/en"
  } ]

Actions #6

Updated by Luca Frosini about 2 years ago

@marketak@ics.forth.gr I just realised that you are right. Thanks for highlighting this.
database_sources should also create resources, but in the case of the record

https://data.d4science.org/ctlg/GRSF_Admin/cb468fb0-fed5-370f-9bc6-08fa99c54f3c

they were not created.

Maybe they were overwritten by the resources generated from refers_to because of a name clash.

"refers_to":
    [
        {
            "id": "2109694f-17d1-3206-815d-179bc52e5ec7",
            "url": "https://data.d4science.org/ctlg/GRSF_Admin/2109694f-17d1-3206-815d-179bc52e5ec7"
        },
        {
            "id": "c0d7d44f-71cb-3b28-9b65-ddf72c57d44f",
            "url": "https://data.d4science.org/ctlg/GRSF_Admin/c0d7d44f-71cb-3b28-9b65-ddf72c57d44f"
        },
        {
            "id": "1b3ce0de-1ed1-3ba8-8a80-f866b287617e",
            "url": "https://data.d4science.org/ctlg/GRSF_Admin/1b3ce0de-1ed1-3ba8-8a80-f866b287617e"
        }
    ],

In fact, as explained in ticket #24816, the names of refers_to resources are derived from a lookup of the pointed resources, and we discussed having a more significant name.

Can we talk about this in a call?

Actions #7

Updated by Luca Frosini about 2 years ago

In any case, I've just realised that this field database_sources is fine as it is.
Instead, I have to change the behaviour for the name in refers_to to avoid name clashes.
I can use the id you already provide or a more significant name, but we will discuss this in ticket #24816.
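To make the suspected name clash concrete, here is a minimal Python sketch. It assumes, purely for illustration, that resources are stored in a map keyed by name; whether the catalogue really behaves this way is not confirmed in this ticket. The helper add_resources and the "name" given to the refers_to entry are made up to force a clash, since the real names come from a lookup of the pointed record.

```python
def add_resources(resources, new_items, key):
    """Store items in a dict under item[key]; a repeated key overwrites."""
    for item in new_items:
        resources[item[key]] = item
    return resources


# Illustrative data only: the refers_to "name" below is invented for the demo.
from_database_sources = [
    {"name": "FIRMS", "url": "http://firms.fao.org/firms/en"},
]
from_refers_to = [
    {
        "id": "2109694f-17d1-3206-815d-179bc52e5ec7",
        "name": "FIRMS",
        "url": "https://data.d4science.org/ctlg/GRSF_Admin/"
               "2109694f-17d1-3206-815d-179bc52e5ec7",
    },
]

# Keyed by name: the refers_to resource replaces the database source one.
by_name = add_resources({}, from_database_sources, "name")
add_resources(by_name, from_refers_to, "name")

# refers_to keyed by id instead: both resources survive.
without_clash = add_resources({}, from_database_sources, "name")
add_resources(without_clash, from_refers_to, "id")

print(len(by_name), len(without_clash))
```

Using the id as the resource name avoids the overwrite but is less readable for users, which is why a more significant name is still worth discussing.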

To comply with the new request of @aureliano.gentile@fao.org in #24604, what if we add another property to each database_sources element, e.g. long_name?

E.g.

"database_sources" : [ {
    "name" : "FishSource",
    "long_name" : "FishSource",
    "description" : "FishSource is an online information resource about the status of stocks and fisheries, that compiles and summarizes all the information that is needed from analysts to evaluate the sustainability.",
    "url" : "http://www.fishsource.com"
  }, {
    "name" : "FIRMS",
    "long_name" : "Fisheries and Resources Monitoring System (FIRMS)",
    "description" : "Fisheries and Resources Monitoring System aims to provide access to a wide range of high-quality information on the global monitoring and management of fishery marine resources",
    "url" : "http://firms.fao.org/firms/en"
  }, {
    "name" : "RAM",
    "long_name" : "RAM Legacy Stock Assessment Database",
    "description" : "RAM Legacy Stock Assessment Database is a compilation of stock assessment results for commercially exploited marine populations from around the world.",
    "url" : "http://ramlegacy.org"
  } ]

Actions #8

Updated by Luca Frosini about 2 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 80
Actions #9

Updated by Luca Frosini about 2 years ago

@aureliano.gentile@fao.org is creating resources that point to the database sources, which are already well known, an added value?

Actions #10

Updated by Yannis Marketakis about 2 years ago

Hi @luca.frosini@isti.cnr.it,

I replied to #24816 (please check).
Of course, we can discuss this (and any other issue) in a dedicated call.

Actions #11

Updated by Luca Frosini about 2 years ago

I just had a call with @aureliano.gentile@fao.org and we agreed that:

  • the generated Resources are not only useless but also create confusion;
  • we need to create one metadata entry for each database source instead of concatenating them into a single metadata entry;
  • each metadata entry must contain the longest version of the source name.

So we can keep the same structure and add the long_name as follows:

"database_sources" : [ {
    "name" : "FishSource",
    "long_name" : "FishSource",
    "description" : "FishSource is an online information resource about the status of stocks and fisheries, that compiles and summarizes all the information that is needed from analysts to evaluate the sustainability.",
    "url" : "http://www.fishsource.com"
  }, {
    "name" : "FIRMS",
    "long_name" : "Fisheries and Resources Monitoring System (FIRMS)",
    "description" : "Fisheries and Resources Monitoring System aims to provide access to a wide range of high-quality information on the global monitoring and management of fishery marine resources",
    "url" : "http://firms.fao.org/firms/en"
  }, {
    "name" : "RAM",
    "long_name" : "RAM Legacy Stock Assessment Database",
    "description" : "RAM Legacy Stock Assessment Database is a compilation of stock assessment results for commercially exploited marine populations from around the world.",
    "url" : "http://ramlegacy.org"
  } ]

or have a simple array like this:

"database_sources": [
  "FishSource",
  "Fisheries and Resources Monitoring System (FIRMS)",
  "RAM Legacy Stock Assessment Database"
]

I prefer the latter if possible
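Either shape can feed the agreed behaviour (one metadata entry per source, carrying the long form of the name). A minimal Python sketch follows; the helper name database_source_metadata and the ("Database Source", value) pair representation are assumptions made for illustration, not the actual service code.

```python
def database_source_metadata(database_sources):
    """Return one ("Database Source", value) pair per source.

    Accepts plain strings or objects; for objects the proposed long_name
    is preferred and name is used as a fallback.
    """
    entries = []
    for source in database_sources:
        if isinstance(source, str):
            value = source
        else:
            value = source.get("long_name") or source["name"]
        entries.append(("Database Source", value))
    return entries


simple = [
    "FishSource",
    "Fisheries and Resources Monitoring System (FIRMS)",
    "RAM Legacy Stock Assessment Database",
]
structured = [
    {"name": "FishSource", "long_name": "FishSource"},
    {"name": "FIRMS",
     "long_name": "Fisheries and Resources Monitoring System (FIRMS)"},
    {"name": "RAM", "long_name": "RAM Legacy Stock Assessment Database"},
]

for key, value in database_source_metadata(simple):
    print(f"{key}: {value}")
```

Both inputs produce the same three entries, so the simple array loses nothing that the Database Source metadata needs.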

Actions #12

Updated by Yannis Marketakis about 2 years ago

OK, let's keep the second approach, which is much simpler and less verbose:

"database_sources": [
  "FishSource",
  "Fisheries and Resources Monitoring System (FIRMS)",
  "RAM Legacy Stock Assessment Database"
]
Actions #13

Updated by Aureliano Gentile about 2 years ago

Thank you, all fine with me. The above examples do not cover the 4th source of information (FAO SDG 14.4.1 Questionnaire) but I trust this will be part of the implementation.
Aureliano

Actions #14

Updated by Luca Frosini about 2 years ago

  • Tracker changed from Support to Task
  • Status changed from In Progress to Paused
  • Target version set to GRSF

Aureliano Gentile wrote in #note-13:

Thank you, all fine with me. The above examples do not cover the 4th source of information (FAO SDG 14.4.1 Questionnaire) but I trust this will be part of the implementation.
Aureliano

Yes, that is only an example.

As agreed with @marketak@ics.forth.gr I'm going to pause this ticket.
I'll ask @marketak@ics.forth.gr to change the produced JSON as soon as I'm ready with the new service.

Actions #15

Updated by Luca Frosini about 2 years ago

  • Subject changed from About 'database_sources' to Change the content of 'database_sources'
Actions #16

Updated by Luca Frosini about 2 years ago

The complete list is:

"database_sources": [
  "FishSource",
  "Fisheries and Resources Monitoring System (FIRMS)",
  "RAM Legacy Stock Assessment Database",
  "FAO SDG 14.4.1 Questionnaire"
]
Actions #17

Updated by Luca Frosini about 2 years ago

  • Parent task changed from #24611 to #25008