Task #24817
openTask #24611: Create the new GRSF Publisher service
Task #25008: Requested fields changes
Change the content of 'database_sources'
80%
Description
Stock and Fishery records contain the field database_sources.
As an example, please consider the following Stock Record (Assessment Unit) from GRSF with GRSF UUID cb468fb0-fed5-370f-9bc6-08fa99c54f3c.
https://data.d4science.org/ctlg/GRSF_Admin/cb468fb0-fed5-370f-9bc6-08fa99c54f3c
Please find attached the JSON submitted by the KB in June 2021 (it may not be the latest version, but that does not matter for presenting the situation):
"database_sources" : [ {
"name" : "FishSource",
"description" : "FishSource is an online information resource about the status of stocks and fisheries, that compiles and summarizes all the information that is needed from analysts to evaluate the sustainability.",
"url" : "http://www.fishsource.com"
}, {
"name" : "FIRMS",
"description" : "Fisheries and Resources Monitoring System aims to provide access to a wide range of high-quality information on the global monitoring and management of fishery marine resources",
"url" : "http://firms.fao.org/firms/en"
}, {
"name" : "RAM",
"description" : "RAM Legacy Stock Assessment Database is a compilation of stock assessment results for commercially exploited marine populations from around the world.",
"url" : "http://ramlegacy.org"
} ]
Currently, the grsf-publisher-ws service uses database_sources to:
- Generate a simple field with the content "RAM FishSource FIRMS".
I'm wondering whether we could simplify the process and have something like:
"database_sources" : "RAM FishSource FIRMS"
Please note that, if we agree on this, the change must be made when the new service is available and not for the current implementation.
Updated by Luca Frosini about 2 years ago
- Related to Feature #24604: Fix 'Database Source' field added
Updated by Luca Frosini about 2 years ago
@marketak@ics.forth.gr I just remembered that @aureliano.gentile@fao.org asked me for the requirements described in #24604 during the meeting in Pisa.
To satisfy both requirements, we could have the following JSON:
"database_sources" : [ "Fisheries and Resources Monitoring System (FIRMS)", "FAO SDG 14.4.1 Questionnaire" ]
Moreover, I would generate the Database Source metadata multiple times (once for each element of the array).
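A minimal Python sketch of generating the metadata once per array element. The function name and the (key, value) tuple representation are hypothetical and are not taken from the grsf-publisher-ws code:

```python
def database_source_metadata(database_sources):
    """Return one ("Database Source", value) pair per array element."""
    return [("Database Source", source) for source in database_sources]


# Values copied from the JSON proposed above.
database_sources = [
    "Fisheries and Resources Monitoring System (FIRMS)",
    "FAO SDG 14.4.1 Questionnaire",
]

for key, value in database_source_metadata(database_sources):
    print(f"{key}: {value}")
```

Each source ends up in its own Database Source entry instead of being concatenated into a single value.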
Updated by Yannis Marketakis about 2 years ago
- File clipboard-202303221129-va37k.png added
- File clipboard-202303221130-8ms3t.png added
We could do that, but I see that the extra information in the JSON object for database sources (i.e. description and URL) is indeed used in both the GRSF Admin VRE and the GRSF Public VRE.
More specifically, in the GRSF VRE the URL and description accompany the database source (for example FIRMS in the case of https://data.d4science.org/ctlg/GRSF/d9e59e52-4440-39a1-88c1-71d8f1f246d7).
The same is shown in GRSF Admin (at least for legacy records). See https://data.d4science.org/ctlg/GRSF_Admin/9a236015-1fee-3d08-ad48-25782e49032d for an example.
So overall, I suspect that if we change this, all these resources will be gone.
To avoid lookups on your side, I could provide you with a list of database source names complementary to the database source objects, using an additional field (e.g. database_source_names). What do you think?
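As an illustration only, such a complementary field could be derived from the existing objects as in the Python sketch below. The helper name is hypothetical, and the descriptions are omitted from the sample record for brevity:

```python
def add_source_names(record):
    """Return a copy of the record with a complementary list of source names."""
    enriched = dict(record)  # shallow copy, the input record is left untouched
    enriched["database_source_names"] = [
        source["name"] for source in record["database_sources"]
    ]
    return enriched


record = {
    "database_sources": [
        {"name": "FishSource", "url": "http://www.fishsource.com"},
        {"name": "FIRMS", "url": "http://firms.fao.org/firms/en"},
        {"name": "RAM", "url": "http://ramlegacy.org"},
    ]
}

enriched = add_source_names(record)
print(enriched["database_source_names"])
```

The database source objects stay as they are, so the description and URL remain available to the VREs.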
Updated by Luca Frosini about 2 years ago
- File Database Source Field.png added
- File resources_from_refers_to.png added
database_sources is just used to create the field Database Source, e.g.
Database Source: RAM FishSource FIRMS
Instead, the resources are generated from other properties, e.g. refers_to.
See https://data.d4science.org/ctlg/GRSF_Admin/cb468fb0-fed5-370f-9bc6-08fa99c54f3c
The resource named http://firms.fao.org/firms/resource/13728/en in the screenshot you provided instead comes from source_of_information:
"source_of_information" : [ {
"name" : "http://firms.fao.org/firms/resource/13728/en",
"description" : "",
"url" : "http://firms.fao.org/firms/resource/13728/en"
} ]
Updated by Yannis Marketakis about 2 years ago
I have added the image to point you to the resource of FIRMS, not to the FIRMS record (which clearly comes from refers_to, no objection about that).
See the resource in the red box. The description of the database source clearly comes from:
"database_sources" : [ {
"name" : "FIRMS",
"description" : "Fisheries and Resources Monitoring System aims to provide access to a wide range of high-quality information on the global monitoring and management of fishery marine resources",
"url" : "http://firms.fao.org/firms/en"
}
Updated by Luca Frosini about 2 years ago
@marketak@ics.forth.gr I just realised that you are right. Thanks for highlighting this.
database_sources should also create resources, but in the case of the record
https://data.d4science.org/ctlg/GRSF_Admin/cb468fb0-fed5-370f-9bc6-08fa99c54f3c
they were not created.
Maybe they were overwritten by the resources generated from refers_to due to a name clash.
"refers_to":
[
{
"id": "2109694f-17d1-3206-815d-179bc52e5ec7",
"url": "https://data.d4science.org/ctlg/GRSF_Admin/2109694f-17d1-3206-815d-179bc52e5ec7"
},
{
"id": "c0d7d44f-71cb-3b28-9b65-ddf72c57d44f",
"url": "https://data.d4science.org/ctlg/GRSF_Admin/c0d7d44f-71cb-3b28-9b65-ddf72c57d44f"
},
{
"id": "1b3ce0de-1ed1-3ba8-8a80-f866b287617e",
"url": "https://data.d4science.org/ctlg/GRSF_Admin/1b3ce0de-1ed1-3ba8-8a80-f866b287617e"
}
],
In fact, as explained in ticket #24816, the names of the refers_to resources are derived from a lookup of the referenced resources, and we discussed having a more meaningful name.
Can we talk about this in a call?
Updated by Luca Frosini about 2 years ago
In any case, I've just realised that the field database_sources is fine as it is.
Instead, I have to change the behaviour for the name in refers_to to avoid name clashes.
I can use the id you already provide or a more meaningful name, but we will discuss this in ticket #24816.
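A Python sketch of how such a name clash could occur and how keying on the id avoids it. It assumes that resources are indexed by a single key, which this ticket does not confirm. The looked-up name "FIRMS" for the refers_to entry and the id given to the database source are hypothetical:

```python
def index_resources(resources, key):
    """Index resources by the given key; a repeated key replaces the earlier entry."""
    indexed = {}
    for resource in resources:
        indexed[resource[key]] = resource
    return indexed


# Resource generated from database_sources (name and URL from the ticket).
database_source = {
    "id": "database_source:FIRMS",  # hypothetical id, only for this sketch
    "name": "FIRMS",
    "url": "http://firms.fao.org/firms/en",
}
# Resource generated from refers_to; the looked-up name is an assumption.
referred_record = {
    "id": "2109694f-17d1-3206-815d-179bc52e5ec7",
    "name": "FIRMS",
    "url": "https://data.d4science.org/ctlg/GRSF_Admin/2109694f-17d1-3206-815d-179bc52e5ec7",
}

resources = [database_source, referred_record]
by_name = index_resources(resources, "name")  # the later entry replaces the earlier one
by_id = index_resources(resources, "id")      # both entries are kept
print(len(by_name), len(by_id))
```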
To comply with the new request of @aureliano.gentile@fao.org in #24604, what if we add another property to each database_sources element, e.g. long_name?
For example:
"database_sources" : [ {
"name" : "FishSource",
"long_name" : "FishSource",
"description" : "FishSource is an online information resource about the status of stocks and fisheries, that compiles and summarizes all the information that is needed from analysts to evaluate the sustainability.",
"url" : "http://www.fishsource.com"
}, {
"name" : "FIRMS",
"long_name" : "Fisheries and Resources Monitoring System (FIRMS)",
"description" : "Fisheries and Resources Monitoring System aims to provide access to a wide range of high-quality information on the global monitoring and management of fishery marine resources",
"url" : "http://firms.fao.org/firms/en"
}, {
"name" : "RAM",
"long_name" : "RAM Legacy Stock Assessment Database",
"description" : "RAM Legacy Stock Assessment Database is a compilation of stock assessment results for commercially exploited marine populations from around the world.",
"url" : "http://ramlegacy.org"
} ]
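A small Python sketch of how a consumer might read this structure, preferring long_name and falling back to name for elements that do not carry it yet. The function name is hypothetical and the fallback rule is an assumption, not something agreed in this ticket:

```python
def display_name(source):
    """Prefer 'long_name'; fall back to 'name' when it is missing or empty."""
    return source.get("long_name") or source["name"]


sources = [
    {"name": "FIRMS", "long_name": "Fisheries and Resources Monitoring System (FIRMS)"},
    {"name": "RAM"},  # element without long_name, e.g. from an older JSON
]

names = [display_name(source) for source in sources]
print(names)
```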
Updated by Luca Frosini about 2 years ago
- Status changed from New to In Progress
- % Done changed from 0 to 80
Updated by Luca Frosini about 2 years ago
@aureliano.gentile@fao.org is the creation of resources pointing to the database sources, which are well known, an added value?
Updated by Yannis Marketakis about 2 years ago
Hi @luca.frosini@isti.cnr.it,
I replied to #24816 (please check).
Of course, we can discuss this (and any other issue) in a dedicated call.
Updated by Luca Frosini about 2 years ago
I just had a call with @aureliano.gentile@fao.org and we agreed that:
- the generated Resources are not only useless but also create confusion;
- we need to create one metadata field for each database source instead of concatenating them into a single one;
- each metadata field must contain the longest version of the source name.
So we can keep the same structure and add the long_name as follows:
"database_sources" : [ {
"name" : "FishSource",
"long_name" : "FishSource",
"description" : "FishSource is an online information resource about the status of stocks and fisheries, that compiles and summarizes all the information that is needed from analysts to evaluate the sustainability.",
"url" : "http://www.fishsource.com"
}, {
"name" : "FIRMS",
"long_name" : "Fisheries and Resources Monitoring System (FIRMS)",
"description" : "Fisheries and Resources Monitoring System aims to provide access to a wide range of high-quality information on the global monitoring and management of fishery marine resources",
"url" : "http://firms.fao.org/firms/en"
}, {
"name" : "RAM",
"long_name" : "RAM Legacy Stock Assessment Database",
"description" : "RAM Legacy Stock Assessment Database is a compilation of stock assessment results for commercially exploited marine populations from around the world.",
"url" : "http://ramlegacy.org"
} ]
or have a simple array like this:
"database_sources": [
"FishSource",
"Fisheries and Resources Monitoring System (FIRMS)",
"RAM Legacy Stock Assessment Database"
]
I prefer the latter, if possible.
Updated by Yannis Marketakis about 2 years ago
OK, let's go with the second approach, which is much simpler and less verbose:
"database_sources": [
"FishSource",
"Fisheries and Resources Monitoring System (FIRMS)",
"RAM Legacy Stock Assessment Database"
]
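For illustration, converting the previous object structure into this array could be sketched in Python as follows. The mapping contains only the short and long names that appear in this ticket; the function name is hypothetical, and passing unknown names through unchanged is an assumption:

```python
# Short name -> long name, as listed in the examples of this ticket.
LONG_NAMES = {
    "FishSource": "FishSource",
    "FIRMS": "Fisheries and Resources Monitoring System (FIRMS)",
    "RAM": "RAM Legacy Stock Assessment Database",
}


def to_simple_array(legacy_sources):
    """Convert a list of source objects into a plain list of long names."""
    return [
        LONG_NAMES.get(source["name"], source["name"])  # unknown names pass through
        for source in legacy_sources
    ]


legacy = [
    {"name": "FishSource", "url": "http://www.fishsource.com"},
    {"name": "FIRMS", "url": "http://firms.fao.org/firms/en"},
    {"name": "RAM", "url": "http://ramlegacy.org"},
]

simple = to_simple_array(legacy)
print(simple)
```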
Updated by Aureliano Gentile about 2 years ago
Thank you, all fine with me. The above examples do not cover the 4th source of information (FAO SDG 14.4.1 Questionnaire) but I trust this will be part of the implementation.
Aureliano
Updated by Luca Frosini about 2 years ago
- Tracker changed from Support to Task
- Status changed from In Progress to Paused
- Target version set to GRSF
Yes, that is only an example.
As agreed with @marketak@ics.forth.gr, I'm going to pause this ticket.
I'll ask @marketak@ics.forth.gr to change the produced JSON as soon as I am ready with the new service.
Updated by Luca Frosini about 2 years ago
- Subject changed from About 'database_sources' to Change the content of 'database_sources'
Updated by Luca Frosini about 2 years ago
The complete list is:
"database_sources": [
"FishSource",
"Fisheries and Resources Monitoring System (FIRMS)",
"RAM Legacy Stock Assessment Database",
"FAO SDG 14.4.1 Questionnaire"
]
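If this list is treated as a closed set (an assumption; the ticket does not say that other values must be rejected), a record could be checked against it with a Python sketch like the following. The function and constant names are hypothetical:

```python
# The four values listed above.
KNOWN_DATABASE_SOURCES = frozenset({
    "FishSource",
    "Fisheries and Resources Monitoring System (FIRMS)",
    "RAM Legacy Stock Assessment Database",
    "FAO SDG 14.4.1 Questionnaire",
})


def unknown_database_sources(record):
    """Return the database_sources values that are not in the known list."""
    return [
        source
        for source in record.get("database_sources", [])
        if source not in KNOWN_DATABASE_SOURCES
    ]


record = {"database_sources": ["FishSource", "FIRMS"]}
print(unknown_database_sources(record))
```

Here the short name "FIRMS" is reported because only the long form is part of the list.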
Updated by Luca Frosini about 2 years ago
- Parent task changed from #24611 to #25008