Bug #12417

closed

Associated data are not returned

Added by Aureliano Gentile almost 7 years ago. Updated almost 7 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Target version:
Start date:
Sep 07, 2018
Due date:
% Done:

100%

Estimated time:

Description

Several GRSF records do not display associated data, although time series are available in the source files.

For example, in the record "Pseudotolithus typus Pseudotolithus senegalensis Pseudotolithus elongatus Western Gulf of Guinea" http://data.d4science.org/ctlg/GRSF_Admin/9d81aa77-1fba-3804-b3f9-d4514d6705dc the box "Data and Resources" does not display any data, while the box "Stock Data" displays the most recent catches.


Files

misssingassociateddata.JPG (26.8 KB): csv icons should be displayed. Aureliano Gentile, Sep 07, 2018 04:28 PM
misssingassociateddata2.JPG (13.4 KB): no associated data displayed. Aureliano Gentile, Sep 07, 2018 04:29 PM
misssingassociateddata3.JPG (51.1 KB). Aureliano Gentile, Sep 07, 2018 04:29 PM
9d81aa77-1fba-3804-b3f9-d4514d6705dc.json (8.43 KB). Yannis Marketakis, Sep 11, 2018 08:54 AM
Actions #1

Updated by Aureliano Gentile almost 7 years ago

See also the legacy source file with associated data http://data.d4science.org/ctlg/GRSF_Admin/d4b2dff5-4106-3e7e-9e7e-98c505aaebe1

Actions #2

Updated by Luca Frosini almost 7 years ago

  • Assignee changed from Luca Frosini to Yannis Marketakis
Actions #3

Updated by Luca Frosini almost 7 years ago

  • Assignee changed from Yannis Marketakis to Francesco Mangiacrapa
Actions #4

Updated by Luca Frosini almost 7 years ago

  • Assignee changed from Francesco Mangiacrapa to Yannis Marketakis

Sorry, @marketak@ics.forth.gr I assigned the ticket to @francesco.mangiacrapa@isti.cnr.it by mistake.

Actions #5

Updated by Yannis Marketakis almost 7 years ago

@luca.frosini@isti.cnr.it the issue has to do with the timeseries resources as they appear in the catalogue.

If you look at the record http://data.d4science.org/ctlg/GRSF_Admin/9d81aa77-1fba-3804-b3f9-d4514d6705dc you will see that it does contain the corresponding timeseries (under the Stock Data section). However, they do not appear as resources (under the Data and Resources section).

I am also attaching the JSON serialization of this object (you'll see that all the timeseries are there).

Actions #6

Updated by Luca Frosini almost 7 years ago

Looking at the attached JSON file, I found only one resource (see the refers_to field), which is consistent with what we see in the record in the catalogue.

If some resources are missing, there are two possible causes:

  1. The resources were not published at all (this is an additional step; see https://wiki.gcube-system.org/gcube/GRSF-services#grsf-services-updater "Updates the list of connected exploiting resources.")
  2. There was an error in the GRSF web service while updating the list of connected resources.

@marketak@ics.forth.gr am I missing something?

Actions #7

Updated by Yannis Marketakis almost 7 years ago

The source record is OK. The issue is not there.

The issue is that although the GRSF record has timeseries associated with it (i.e. catches, abundance level, etc.) and the most recent timeseries appear correctly under the section Stock Data, they do not appear as resources (under the Data and Resources section.)

Actions #8

Updated by Luca Frosini almost 7 years ago

@marketak@ics.forth.gr the most recent time series are metadata attached to the record, whereas the resources are published using the refers_to field.
This is because the record has to display the most recent timeseries as fields on the record page and the whole time series as a resource (even when the most recent timeseries coincide with the whole time series).

Actions #9

Updated by Francesco Mangiacrapa almost 7 years ago

Yannis Marketakis wrote:

The source record is OK. The issue is not there.

The issue is that although the GRSF record has timeseries associated with it (i.e. catches, abundance level, etc.) and the most recent timeseries appear correctly under the section Stock Data, they do not appear as resources (under the Data and Resources section.)

Hi Yannis,

I'm missing something...

I) About refers_to:

In your attached JSON there is only one refers_to field, which was added as the resource "FIRMS" in the record (under the Data and Resources section), and that's ok because refers_to is:

A list of objects of the format {"url": "http://", "id": "..."} that allows the aggregated GRSF records to point to their source records already published in the catalogue.

and the only source of the submitted record is FIRMS (is that right?)

II) About catches or landings, abundance level, etc.:

They should be also resources...

e.g. "catches_or_landings" should be a resource: [{"unit" : "...", "value": "...", "year": "..."}, ...] (a time series of value, unit and date)

The problem is that the most recent timeseries appear correctly under the section Stock Data, but the full timeseries (like a "file") do not appear as resources. In this particular case it is not an issue with the Catalogue GUI. The Catalogue GUI behaves correctly: it shows only that one resource under the Data and Resources section because the record holds the following JSON in the catalogue:

"resources": [

    {
        "mimetype": null,
        "cache_url": null,
        "hash": "",
        "description": "",
        "name": "FIRMS",
        "format": "",
        "url": "http://data.d4science.org/ctlg/GRSF_Admin/d4b2dff5-4106-3e7e-9e7e-98c505aaebe1",
        "datastore_active": false,
        "cache_last_updated": null,
        "package_id": "0dd3f88a-843c-4feb-9431-cdb3ffd1ff32",
        "created": "2018-07-31T13:17:30.240860",
        "state": "active",
        "mimetype_inner": null,
        "last_modified": null,
        "position": 0,
        "revision_id": "623c27cf-0028-4804-917e-9ee85793bf6e",
        "url_type": null,
        "id": "4fa762a0-1ec1-4ebd-9ecf-49cd09eb46a8",
        "resource_type": null,
        "size": null
    }

],

As you can see only one resource is stored (that is FIRMS).
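To make the check above repeatable, here is a minimal sketch that summarizes the "resources" array of a record's JSON and lists the CSV ones. The function names are hypothetical, and the embedded sample reproduces only a few fields of the FIRMS resource shown above; it is not the catalogue's own tooling.

```python
import json


def summarize_resources(record):
    """Return (name, format, url_type) for each resource of a record."""
    return [
        (res.get("name"), res.get("format") or "", res.get("url_type"))
        for res in record.get("resources", [])
    ]


def csv_resource_names(record):
    """Return the names of the resources whose format is CSV."""
    return [
        res.get("name")
        for res in record.get("resources", [])
        if (res.get("format") or "").upper() == "CSV"
    ]


# Trimmed copy of the single resource stored for the problematic record.
record = json.loads("""
{
  "resources": [
    {
      "name": "FIRMS",
      "format": "",
      "url_type": null,
      "url": "http://data.d4science.org/ctlg/GRSF_Admin/d4b2dff5-4106-3e7e-9e7e-98c505aaebe1"
    }
  ]
}
""")

print(summarize_resources(record))  # [('FIRMS', '', None)]
print(csv_resource_names(record))   # []
```

An empty list from csv_resource_names means no full timeseries was attached to the record.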

My question is... when you publish a record who is responsible for adding resources (alias full timeseries)?

For example, the following record http://data.d4science.org/ctlg/GRSF_Admin/60b1cc9a-a83d-3370-8fb1-7ea213d9276b contains "Abundance Level" as a resource with the full timeseries (and the Catalogue GUI displays it) because it is stored in the resources of the record:

"resources": [

    {
        "mimetype": null,
        "cache_url": null,
        "hash": "",
        "description": "",
        "name": "FishSource",
        "format": "",
        "url": "http://data.d4science.org/ctlg/GRSF_Admin/987b0e70-2b83-339a-bbdb-95594a8134fe",
        "datastore_active": false,
        "cache_last_updated": null,
        "package_id": "a67f51e8-e9a6-4b92-9aa9-6abc16fe8c50",
        "created": "2018-07-31T12:33:06.273934",
        "state": "active",
        "mimetype_inner": null,
        "last_modified": null,
        "position": 0,
        "revision_id": "a411bfa6-2ea2-45c4-97fe-c84ea69bfa84",
        "url_type": null,
        "id": "b66b7390-4a1d-4b70-83c7-877f8637591c",
        "resource_type": null,
        "size": null
    },
    {
        "mimetype": "text/csv",
        "cache_url": null,
        "hash": "",
        "description": "Strangomera bentincki Chilean region VII Chilean region VIII Chilean region VI Chilean region X Chilean region V Chilean region IX Pacific, Southeast",
        "name": "Abundance Level",
        "format": "CSV",
        "url": "https://ckan-grsf-admin2.d4science.org/dataset/a67f51e8-e9a6-4b92-9aa9-6abc16fe8c50/resource/5b38a6ad-d445-4775-bcbf-56a8a7873b9a/download/abundance-level",
        "datastore_active": true,
        "cache_last_updated": null,
        "package_id": "a67f51e8-e9a6-4b92-9aa9-6abc16fe8c50",
        "created": "2018-07-31T12:33:32.452215",
        "state": "active",
        "mimetype_inner": null,
        "last_modified": "2018-07-31T12:33:32.326001",
        "position": 1,
        "revision_id": "b74de2ad-c720-4c98-ace7-4fee51324412",
        "url_type": "upload",
        "id": "5b38a6ad-d445-4775-bcbf-56a8a7873b9a",
        "resource_type": null,
        "size": null
    },
    {
        "mimetype": "text/csv",
        "cache_url": null,
        "hash": "",
        "description": "Strangomera bentincki Chilean region VII Chilean region VIII Chilean region VI Chilean region X Chilean region V Chilean region IX Pacific, Southeast",
        "name": "Fishing Pressure",
        "format": "CSV",
        "url": "https://ckan-grsf-admin2.d4science.org/dataset/a67f51e8-e9a6-4b92-9aa9-6abc16fe8c50/resource/f6cf3985-d988-4ded-95f0-1a5ebafbc85d/download/fishing-pressure",
        "datastore_active": true,
        "cache_last_updated": null,
        "package_id": "a67f51e8-e9a6-4b92-9aa9-6abc16fe8c50",
        "created": "2018-07-31T12:33:47.508115",
        "state": "active",
        "mimetype_inner": null,
        "last_modified": "2018-07-31T12:33:47.350197",
        "position": 2,
        "revision_id": "4b58eeed-bb95-411e-87a3-eb35b180ff96",
        "url_type": "upload",
        "id": "f6cf3985-d988-4ded-95f0-1a5ebafbc85d",
        "resource_type": null,
        "size": null
    },
    {
        "mimetype": "text/csv",
        "cache_url": null,
        "hash": "",
        "description": "Strangomera bentincki Chilean region VII Chilean region VIII Chilean region VI Chilean region X Chilean region V Chilean region IX Pacific, Southeast",
        "name": "State and Trend",
        "format": "CSV",
        "url": "https://ckan-grsf-admin2.d4science.org/dataset/a67f51e8-e9a6-4b92-9aa9-6abc16fe8c50/resource/4c3440b6-122d-47c4-9fce-13bf2dacfa41/download/state-and-trend",
        "datastore_active": true,
        "cache_last_updated": null,
        "package_id": "a67f51e8-e9a6-4b92-9aa9-6abc16fe8c50",
        "created": "2018-07-31T12:34:04.226874",
        "state": "active",
        "mimetype_inner": null,
        "last_modified": "2018-07-31T12:34:04.083003",
        "position": 3,
        "revision_id": "4b58eeed-bb95-411e-87a3-eb35b180ff96",
        "url_type": "upload",
        "id": "4c3440b6-122d-47c4-9fce-13bf2dacfa41",
        "resource_type": null,
        "size": null
    }

],

Who should add those CSVs as resources, and when?
Maybe there was an error in the GRSF web service while updating the list of connected resources...
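A rough sketch of how the gap could be detected automatically: compare the timeseries fields present in the submitted JSON with the resource names stored in the catalogue record. The function name and the mapping from field keys to resource names are assumptions for illustration; only "catches_or_landings" and the resource name "Abundance Level" appear in this ticket.

```python
def missing_timeseries_resources(submitted, published, series_to_resource):
    """List the resource names expected from the submitted timeseries
    that are absent from the published record's resources."""
    published_names = {res.get("name") for res in published.get("resources", [])}
    return [
        resource_name
        for field, resource_name in series_to_resource.items()
        if submitted.get(field) and resource_name not in published_names
    ]


# Hypothetical mapping: field key -> expected resource name.
SERIES_TO_RESOURCE = {
    "catches_or_landings": "Catches or Landings",  # resource name is a guess
    "abundance_level": "Abundance Level",          # field key is a guess
}

submitted = {
    "catches_or_landings": [{"unit": "...", "value": "...", "year": "..."}],
}
published = {"resources": [{"name": "FIRMS", "format": ""}]}

print(missing_timeseries_resources(submitted, published, SERIES_TO_RESOURCE))
# ['Catches or Landings']
```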

Actions #10

Updated by Yannis Marketakis almost 7 years ago

Francesco Mangiacrapa wrote:

I) About refers_to:

In your attached JSON there is only one refers_to field, which was added as the resource "FIRMS" in the record (under the Data and Resources section), and that's ok because refers_to is:

A list of objects of the format {"url": "http://", "id": "..."} that allows the aggregated GRSF records to point to their source records already published in the catalogue.

and the only source of the submitted record is FIRMS (is that right?)

Exactly. The GRSF record has only one record (the one from FIRMS).

Francesco Mangiacrapa wrote:

II) About catches or landings, abundance level, etc.:

They should be also resources...

Of course. That's the spirit of my previous comments.

Francesco Mangiacrapa wrote:

My question is... when you publish a record who is responsible for adding resources (alias full timeseries)?

It should be a duty of the publisher service.

Francesco Mangiacrapa wrote:

Who should add those CSVs as resources, and when?
Maybe there was an error in the GRSF web service while updating the list of connected resources...

This should be done by the publisher. The contract is the following: I provide the timeseries within the JSON (ordered). The most recent entries (the 5 most recent) are added as metadata fields, and the rest is built into a CSV resource (by the publisher) and added to the workspace.
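A minimal sketch of that contract, not the publisher service's actual code. It assumes each entry has the "unit", "value" and "year" keys quoted earlier in this thread, that "year" is numeric, and that the CSV holds the whole series (as stated in comment #8). The function name and the CSV column order are hypothetical.

```python
import csv
import io

RECENT_COUNT = 5  # "the 5 most recent" entries become metadata fields


def split_timeseries(entries, recent_count=RECENT_COUNT):
    """Return (recent_entries, csv_text) for one timeseries.

    recent_entries: the newest entries, newest first, for the metadata fields.
    csv_text: the whole series, oldest first, to be attached as a CSV resource.
    """
    by_year = sorted(entries, key=lambda e: int(e["year"]))
    recent = by_year[::-1][:recent_count]

    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["year", "value", "unit"], lineterminator="\n"
    )
    writer.writeheader()
    for entry in by_year:
        writer.writerow(
            {"year": entry["year"], "value": entry["value"], "unit": entry["unit"]}
        )
    return recent, buf.getvalue()


# Made-up sample data, for illustration only.
sample = [{"year": str(y), "value": str(y - 2000), "unit": "t"} for y in range(2010, 2017)]
recent, csv_text = split_timeseries(sample)
print([e["year"] for e in recent])  # ['2016', '2015', '2014', '2013', '2012']
print(csv_text.splitlines()[0])     # year,value,unit
```

If the resulting CSV is never attached to the record, the symptom is exactly the one reported here: Stock Data is populated but Data and Resources shows no timeseries.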

Actions #11

Updated by Francesco Mangiacrapa almost 7 years ago

Hi Yannis,

thank you for clarifying the situation.
The problem (the missing resources) was probably caused by a bug in the catalogue-publisher service. @luca.frosini@isti.cnr.it can you check the issue as soon as possible? I think that after the fix we will need to republish the GRSF records. @marketak@ics.forth.gr is this feasible?

Actions #12

Updated by Yannis Marketakis almost 7 years ago

@francesco.mangiacrapa@isti.cnr.it of course!

Actions #13

Updated by Luca Frosini almost 7 years ago

@marketak@ics.forth.gr sorry for the misunderstanding. I just tried to publish the record in dev and the Resources are correctly created.

https://next.d4science.org/group/nextnext/catalogue?path=/dataset/9d81aa77-1fba-3804-b3f9-d4514d6705dc

Maybe it was a temporary problem. Shall I try to update the resource using the JSON you attached, or would you rather republish it via the KB? Both solutions are fine with me. Let me know, and sorry again.

Actions #14

Updated by Yannis Marketakis almost 7 years ago

No worries @luca.frosini@isti.cnr.it

I will re-publish it right away.

Actions #15

Updated by Yannis Marketakis almost 7 years ago

  • Status changed from New to In Progress

I just tried to republish the record but it seems that the resources are not yet available. I tried: (1) updating the record and (2) removing and publishing again but the problem persists.

Actions #16

Updated by Luca Frosini almost 7 years ago

Thanks, @marketak@ics.forth.gr for your feedback. I'm going to investigate the logs.

Actions #18

Updated by Luca Frosini almost 7 years ago

I have found the issue. I opened bug #12476 to solve it.

Actions #19

Updated by Luca Frosini almost 7 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

The item has been successfully recreated, sorry for the inconvenience caused.
