Bug #12417
Closed
Associated data are not returned
100%
Description
Several GRSF records do not present associated data although time series are available in the source files.
For example, in the record "Pseudotolithus typus Pseudotolithus senegalensis Pseudotolithus elongatus Western Gulf of Guinea" (http://data.d4science.org/ctlg/GRSF_Admin/9d81aa77-1fba-3804-b3f9-d4514d6705dc), the box "Data and Resources" does not display any data, while the box "Stock Data" displays the most recent catches.
Updated by Aureliano Gentile almost 7 years ago
See also the legacy source file with associated data http://data.d4science.org/ctlg/GRSF_Admin/d4b2dff5-4106-3e7e-9e7e-98c505aaebe1
Updated by Luca Frosini almost 7 years ago
- Assignee changed from Luca Frosini to Yannis Marketakis
Updated by Luca Frosini almost 7 years ago
- Assignee changed from Yannis Marketakis to Francesco Mangiacrapa
Updated by Luca Frosini almost 7 years ago
- Assignee changed from Francesco Mangiacrapa to Yannis Marketakis
Sorry, @marketak@ics.forth.gr I assigned the ticket to @francesco.mangiacrapa@isti.cnr.it by mistake.
Updated by Yannis Marketakis almost 7 years ago
- File 9d81aa77-1fba-3804-b3f9-d4514d6705dc.json 9d81aa77-1fba-3804-b3f9-d4514d6705dc.json added
- Assignee changed from Yannis Marketakis to Luca Frosini
@luca.frosini@isti.cnr.it the issue has to do with the timeseries resources as they appear in the catalogue.
If you look at the record http://data.d4science.org/ctlg/GRSF_Admin/9d81aa77-1fba-3804-b3f9-d4514d6705dc, it does contain the corresponding timeseries (under the Stock Data section). However, they do not appear as resources (under the Data and Resources section).
I am also attaching the JSON serialization of this object (you'll see that all the timeseries are there).
Updated by Luca Frosini almost 7 years ago
Looking at the attached JSON file, I found only one resource (see the refers_to field), which is consistent with what we can see in the record in the catalogue.
If some resources are missing, there are two possible causes:
- The resources were not published at all (it is an additional step, see https://wiki.gcube-system.org/gcube/GRSF-services#grsf-services-updater "Updates the list of connected exploiting resources").
- There was an error in the GRSF web service while updating the list of connected resources.
@marketak@ics.forth.gr am I missing something?
Updated by Yannis Marketakis almost 7 years ago
The source record is OK. The issue is not there.
The issue is that although the GRSF record has timeseries associated with it (i.e. catches, abundance level, etc.) and the most recent timeseries appear correctly under the section Stock Data, they do not appear as resources (under the Data and Resources section.)
Updated by Luca Frosini almost 7 years ago
@marketak@ics.forth.gr the most recent time series are metadata attached to the record, whereas the resources are published by using the refers_to field.
This is because the record has to display the most recent time series as fields in the record page and the whole time series as a resource (even when the most recent time series coincide with the whole time series).
Updated by Francesco Mangiacrapa almost 7 years ago
Yannis Marketakis wrote:
The source record is OK. The issue is not there.
The issue is that although the GRSF record has timeseries associated with it (i.e. catches, abundance level, etc.) and the most recent timeseries appear correctly under the section Stock Data, they do not appear as resources (under the Data and Resources section.)
Hi Yannis,
I'm missing something...
I) About refers_to:
In your attached JSON there is only one refers_to entry, which was added as the resource "FIRMS" in the record (under the Data and Resources section), and that's OK because refers_to is:
A list of objects of the format {"url": "http://", "id": "..."} that allows the aggregated GRSF records to point to their source records already published in the catalogue.
and the source of the submitted record is only FIRMS (is it right?)
II) About catches or landings, abundance level, etc.:
They should also be resources...
e.g. "catches_or_landings" are resources: [{"unit" : "...", "value": "...", "year": "..."}, ...] A time series of value, unit and date
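To make the shape of such a time series concrete, here is a minimal sketch in Python. The `TimeSeriesEntry` class and the `parse_series` helper are hypothetical names introduced for illustration; only the `unit`, `value` and `year` keys come from the description above. The sample numbers are invented, and the sketch assumes that values are numeric strings and years are plain integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeSeriesEntry:
    """One point of a GRSF time series (hypothetical helper type)."""
    year: int
    value: float
    unit: str


def parse_series(raw):
    """Convert [{"unit", "value", "year"}, ...] into entries sorted by year."""
    entries = [
        TimeSeriesEntry(
            year=int(item["year"]),
            value=float(item["value"]),
            unit=item["unit"],
        )
        for item in raw
    ]
    return sorted(entries, key=lambda entry: entry.year)


# Invented sample values, for illustration only.
sample = [
    {"unit": "t", "value": "120.5", "year": "2015"},
    {"unit": "t", "value": "98", "year": "2013"},
]
series = parse_series(sample)
print([entry.year for entry in series])
```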
The problem is that the most recent timeseries appear correctly under the Stock Data section, but the full timeseries (as "files") do not appear as resources. In this particular case it is not an issue of the Catalogue GUI. The Catalogue GUI behaves correctly: it shows only that resource under the Data and Resources section, as a result of the following JSON in the catalogue:
"resources": [
  {
    "mimetype": null,
    "cache_url": null,
    "hash": "",
    "description": "",
    "name": "FIRMS",
    "format": "",
    "url": "http://data.d4science.org/ctlg/GRSF_Admin/d4b2dff5-4106-3e7e-9e7e-98c505aaebe1",
    "datastore_active": false,
    "cache_last_updated": null,
    "package_id": "0dd3f88a-843c-4feb-9431-cdb3ffd1ff32",
    "created": "2018-07-31T13:17:30.240860",
    "state": "active",
    "mimetype_inner": null,
    "last_modified": null,
    "position": 0,
    "revision_id": "623c27cf-0028-4804-917e-9ee85793bf6e",
    "url_type": null,
    "id": "4fa762a0-1ec1-4ebd-9ecf-49cd09eb46a8",
    "resource_type": null,
    "size": null
  }
],
As you can see only one resource is stored (that is FIRMS).
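The same check can be done programmatically. The following is only a sketch: the helper names `resource_names` and `csv_resources` are hypothetical, and the embedded JSON is a trimmed copy of the "resources" array quoted above (most null fields omitted).

```python
import json

# Trimmed copy of the "resources" array quoted in this comment.
package_json = """
{
  "resources": [
    {
      "name": "FIRMS",
      "format": "",
      "url": "http://data.d4science.org/ctlg/GRSF_Admin/d4b2dff5-4106-3e7e-9e7e-98c505aaebe1",
      "url_type": null,
      "datastore_active": false
    }
  ]
}
"""


def resource_names(package):
    """Names of all resources attached to the record."""
    return [resource["name"] for resource in package.get("resources", [])]


def csv_resources(package):
    """Resources whose format is CSV, i.e. candidate full time series."""
    return [
        resource
        for resource in package.get("resources", [])
        if (resource.get("format") or "").upper() == "CSV"
    ]


package = json.loads(package_json)
print(resource_names(package))      # ['FIRMS']
print(len(csv_resources(package)))  # 0
```

For this record the only resource is the link to the FIRMS source record, and no CSV resource is present.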
My question is: when you publish a record, who is responsible for adding the resources (i.e. the full timeseries)?
For example, the record http://data.d4science.org/ctlg/GRSF_Admin/60b1cc9a-a83d-3370-8fb1-7ea213d9276b contains "Abundance Level" as a resource holding a full timeseries, and the Catalogue GUI displays it, because it is stored in the resources of the record:
"resources": [
  {
    "mimetype": null,
    "cache_url": null,
    "hash": "",
    "description": "",
    "name": "FishSource",
    "format": "",
    "url": "http://data.d4science.org/ctlg/GRSF_Admin/987b0e70-2b83-339a-bbdb-95594a8134fe",
    "datastore_active": false,
    "cache_last_updated": null,
    "package_id": "a67f51e8-e9a6-4b92-9aa9-6abc16fe8c50",
    "created": "2018-07-31T12:33:06.273934",
    "state": "active",
    "mimetype_inner": null,
    "last_modified": null,
    "position": 0,
    "revision_id": "a411bfa6-2ea2-45c4-97fe-c84ea69bfa84",
    "url_type": null,
    "id": "b66b7390-4a1d-4b70-83c7-877f8637591c",
    "resource_type": null,
    "size": null
  },
  {
    "mimetype": "text/csv",
    "cache_url": null,
    "hash": "",
    "description": "Strangomera bentincki Chilean region VII Chilean region VIII Chilean region VI Chilean region X Chilean region V Chilean region IX Pacific, Southeast",
    "name": "Abundance Level",
    "format": "CSV",
    "url": "https://ckan-grsf-admin2.d4science.org/dataset/a67f51e8-e9a6-4b92-9aa9-6abc16fe8c50/resource/5b38a6ad-d445-4775-bcbf-56a8a7873b9a/download/abundance-level",
    "datastore_active": true,
    "cache_last_updated": null,
    "package_id": "a67f51e8-e9a6-4b92-9aa9-6abc16fe8c50",
    "created": "2018-07-31T12:33:32.452215",
    "state": "active",
    "mimetype_inner": null,
    "last_modified": "2018-07-31T12:33:32.326001",
    "position": 1,
    "revision_id": "b74de2ad-c720-4c98-ace7-4fee51324412",
    "url_type": "upload",
    "id": "5b38a6ad-d445-4775-bcbf-56a8a7873b9a",
    "resource_type": null,
    "size": null
  },
  {
    "mimetype": "text/csv",
    "cache_url": null,
    "hash": "",
    "description": "Strangomera bentincki Chilean region VII Chilean region VIII Chilean region VI Chilean region X Chilean region V Chilean region IX Pacific, Southeast",
    "name": "Fishing Pressure",
    "format": "CSV",
    "url": "https://ckan-grsf-admin2.d4science.org/dataset/a67f51e8-e9a6-4b92-9aa9-6abc16fe8c50/resource/f6cf3985-d988-4ded-95f0-1a5ebafbc85d/download/fishing-pressure",
    "datastore_active": true,
    "cache_last_updated": null,
    "package_id": "a67f51e8-e9a6-4b92-9aa9-6abc16fe8c50",
    "created": "2018-07-31T12:33:47.508115",
    "state": "active",
    "mimetype_inner": null,
    "last_modified": "2018-07-31T12:33:47.350197",
    "position": 2,
    "revision_id": "4b58eeed-bb95-411e-87a3-eb35b180ff96",
    "url_type": "upload",
    "id": "f6cf3985-d988-4ded-95f0-1a5ebafbc85d",
    "resource_type": null,
    "size": null
  },
  {
    "mimetype": "text/csv",
    "cache_url": null,
    "hash": "",
    "description": "Strangomera bentincki Chilean region VII Chilean region VIII Chilean region VI Chilean region X Chilean region V Chilean region IX Pacific, Southeast",
    "name": "State and Trend",
    "format": "CSV",
    "url": "https://ckan-grsf-admin2.d4science.org/dataset/a67f51e8-e9a6-4b92-9aa9-6abc16fe8c50/resource/4c3440b6-122d-47c4-9fce-13bf2dacfa41/download/state-and-trend",
    "datastore_active": true,
    "cache_last_updated": null,
    "package_id": "a67f51e8-e9a6-4b92-9aa9-6abc16fe8c50",
    "created": "2018-07-31T12:34:04.226874",
    "state": "active",
    "mimetype_inner": null,
    "last_modified": "2018-07-31T12:34:04.083003",
    "position": 3,
    "revision_id": "4b58eeed-bb95-411e-87a3-eb35b180ff96",
    "url_type": "upload",
    "id": "4c3440b6-122d-47c4-9fce-13bf2dacfa41",
    "resource_type": null,
    "size": null
  }
],
Who should add those CSV files as resources, and when?
Maybe there was an error in the GRSF web service while updating the list of connected resources...
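A quick way to spot affected records would be to compare the time series present in the submitted JSON with the resource names stored in the catalogue. The sketch below is illustrative only: apart from "catches_or_landings" and the three resource names shown above, the keys, the "Catches or Landings" name and the key-to-name pairing in `EXPECTED_RESOURCE_NAMES` are assumptions, and `missing_timeseries_resources` is a hypothetical helper.

```python
# Hypothetical mapping from time-series keys in the submitted JSON to the
# resource names expected in the catalogue. The pairing is an assumption.
EXPECTED_RESOURCE_NAMES = {
    "abundance_level": "Abundance Level",
    "fishing_pressure": "Fishing Pressure",
    "state_and_trend": "State and Trend",
    "catches_or_landings": "Catches or Landings",
}


def missing_timeseries_resources(submitted, package):
    """Return expected resource names that have data but no catalogue resource."""
    published = {resource.get("name") for resource in package.get("resources", [])}
    missing = []
    for key, name in EXPECTED_RESOURCE_NAMES.items():
        has_data = bool(submitted.get(key))  # key absent or empty list: nothing to publish
        if has_data and name not in published:
            missing.append(name)
    return missing


# Invented sample: one time series submitted, only the source link published.
submitted = {"catches_or_landings": [{"unit": "t", "value": "10", "year": "2016"}]}
package = {"resources": [{"name": "FIRMS"}]}
print(missing_timeseries_resources(submitted, package))  # ['Catches or Landings']
```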
Updated by Yannis Marketakis almost 7 years ago
Francesco Mangiacrapa wrote:
I) About refers_to:
In your attached JSON there is only one refers_to entry, which was added as the resource "FIRMS" in the record (under the Data and Resources section), and that's OK because refers_to is:
A list of objects of the format {"url": "http://", "id": "..."} that allows the aggregated GRSF records to point to their source records already published in the catalogue.
and the source of the submitted record is only FIRMS (is it right?)
Exactly. The GRSF record has only one source record (the one from FIRMS).
Francesco Mangiacrapa wrote:
II) About catches or landings, abundance level, etc.:
They should also be resources...
Of course. That's the spirit of my previous comments.
Francesco Mangiacrapa wrote:
My question is: when you publish a record, who is responsible for adding the resources (i.e. the full timeseries)?
It should be a duty of the publisher service.
Francesco Mangiacrapa wrote:
Who should add those CSV files as resources, and when?
Maybe there was an error in the GRSF web service while updating the list of connected resources...
This should be done by the publisher. The contract is the following: I provide the timeseries (ordered) within the JSON. The 5 most recent entries are added as metadata fields, and the rest of it is built as a CSV resource by the publisher and added to the workspace.
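The contract described above can be sketched as follows. This is not the publisher's actual code: `split_for_publication` is a hypothetical function, the CSV column layout is an assumption, and the sketch writes the whole series to the CSV (as suggested earlier in the thread); if "the rest" means only the older entries, the rows to write would be sliced accordingly. Sorting is defensive, since the series is said to arrive already ordered.

```python
import csv
import io

RECENT_COUNT = 5  # "5 most recent", per the comment above


def split_for_publication(series, recent_count=RECENT_COUNT):
    """Return (metadata_entries, csv_text) for one time series.

    `series` is a list of {"unit", "value", "year"} dicts.
    """
    ordered = sorted(series, key=lambda entry: int(entry["year"]), reverse=True)
    recent = ordered[:recent_count]  # newest first, shown as metadata fields

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["year", "value", "unit"], lineterminator="\n"
    )
    writer.writeheader()
    # Assumption: the CSV resource holds the whole series, oldest first.
    for entry in reversed(ordered):
        writer.writerow(
            {"year": entry["year"], "value": entry["value"], "unit": entry["unit"]}
        )
    return recent, buffer.getvalue()


# Invented sample covering 2010 to 2016.
series = [{"unit": "t", "value": str(100 + i), "year": str(2010 + i)} for i in range(7)]
recent, csv_text = split_for_publication(series)
print([entry["year"] for entry in recent])
print(csv_text)
```

In this sketch the metadata fields and the CSV are derived from the same input, so a record that shows recent values under Stock Data but has no CSV resource points to a failure in the second step only.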
Updated by Francesco Mangiacrapa almost 7 years ago
Hi Yannis,
thank you for clarifying the situation.
Probably the problem (the missing resources) was caused by a bug in the catalogue-publisher service. @luca.frosini@isti.cnr.it can you check the issue asap? Then, after the fix, I think we need to republish the GRSF records. @marketak@ics.forth.gr is this feasible?
Updated by Yannis Marketakis almost 7 years ago
@francesco.mangiacrapa@isti.cnr.it of course!
Updated by Luca Frosini almost 7 years ago
@marketak@ics.forth.gr sorry for the misunderstanding. I just tried to publish the record in dev and the resources are correctly created.
Maybe it was a temporary problem. What do you think if I try to update the resource using the JSON you attached? Or is it better if you republish it via the KB? Both solutions are OK for me. Let me know, and sorry again.
Updated by Yannis Marketakis almost 7 years ago
No worries @luca.frosini@isti.cnr.it
I will re-publish it right away.
Updated by Yannis Marketakis almost 7 years ago
- Status changed from New to In Progress
I just tried to republish the record, but it seems that the resources are still not available. I tried (1) updating the record and (2) removing it and publishing it again, but the problem persists.
Updated by Luca Frosini almost 7 years ago
Thanks, @marketak@ics.forth.gr for your feedback. I'm going to investigate the logs.
Updated by Luca Frosini almost 7 years ago
I have found the issue. I opened bug #12476 to solve it.
Updated by Luca Frosini almost 7 years ago
- Status changed from In Progress to Closed
- % Done changed from 0 to 100
The item has been successfully recreated, sorry for the inconvenience caused.
Updated by Luca Frosini almost 7 years ago
Of course, you can find the record here http://data.d4science.org/ctlg/GRSF_Admin/9d81aa77-1fba-3804-b3f9-d4514d6705dc