GRSF Validation plan

Principles identified during the TWG3 on GRSF:

1) a record that is not similar to any other record and is compliant with GRSF standards/mappings is accepted directly (no manual validation)
2) a record that matches other records and is compliant with GRSF standards/mappings is merged with those records, but the merge needs to be approved through final validation
3) a record that is similar but not identical to other records (e.g. same species but different areas: partially overlapping or adjacent) should go through validation by experts and data providers (e.g. RFBs).
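
The three principles above can be read as a small triage rule. The following is only a minimal sketch: the similarity labels ("none", "match", "partial"), the `Outcome` names and the handling of non-compliant records are assumptions made for illustration, not part of the GRSF system.

```python
from enum import Enum


class Outcome(Enum):
    ACCEPT = "accepted directly, no manual validation"
    MERGE = "merged, pending final validation"
    EXPERT_REVIEW = "validation by experts and data providers"
    NON_COMPLIANT = "not covered by the three principles"


def triage(compliant: bool, similarity: str) -> Outcome:
    """Map a record to an outcome following principles 1) to 3).

    `similarity` is a hypothetical label: "none", "match" or "partial".
    """
    # Principle 3: similar but not identical records go to experts.
    # (Assumption: this applies regardless of compliance.)
    if similarity == "partial":
        return Outcome.EXPERT_REVIEW
    # The principles do not say what happens to non-compliant records;
    # they are flagged separately here as an assumption.
    if not compliant:
        return Outcome.NON_COMPLIANT
    if similarity == "none":
        return Outcome.ACCEPT  # principle 1
    if similarity == "match":
        return Outcome.MERGE   # principle 2
    raise ValueError(f"unknown similarity label: {similarity!r}")
```

How "similar" and "matching" are actually determined (species plus area comparison) is left out of this sketch.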

Implementation:

GRSF Assessment Unit (2139)

  • A first round with an initial assessment with selected manual approvals for unique records: i) without similar records, and ii) with similar records.

Stock - RAM (1155) (e.g. checks by specific area codes in the mapping file - with/without similarities - ... )
Stock - FIRMS (560)
Stock - FishSource (1046)

Manual approvals for all merged records:

Stock - FIRMS RAM (73)
Stock - FIRMS FishSource RAM (37)
Stock - FIRMS FishSource (31)
Stock - FishSource RAM (25)

Once the above checks and validations have been run successfully (with some lessons learned), then we can implement automated rules.

GRSF Marine Resource (789)

GRSF Fishing Unit (6596)

GRSF other fishery (242)

Validating and approving RAM records

A. Validation Steps for the GRSF VRE:

  1. verify proper matching of GRSF record within RAM db (identity of the assessment unit - species + area codes)
  2. verify time series are properly retrieved (data attached to the GRSF record)

B. Criteria for bulk approval of sub-sets for GRSF Admin VRE:

  • Identify species not covered by the other sources (FIRMS, FishSource) AND/OR
  • Identify areas not covered by the other sources (e.g. FIRMS does not have much on national waters, southwest Atlantic, Pacific)
  • Identify TAGs, GROUPs and/or combinations of them for bulk approvals (e.g. all GRSF records originated uniquely by RAM for this species in this area code/group of area codes)
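
As an illustration of the bulk-approval criteria above, the sketch below selects records from one source whose species or area does not appear in any other source. The `Record` fields and the exact string comparison of area codes are assumptions; a real check would need to compare areas spatially rather than by code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source: str   # "RAM", "FIRMS" or "FishSource"
    species: str  # species code as given by the source
    area: str     # area code as given by the source


def bulk_approval_candidates(records, source="RAM"):
    """Return records from `source` whose species or area is not covered elsewhere."""
    others = [r for r in records if r.source != source]
    other_species = {r.species for r in others}
    other_areas = {r.area for r in others}
    return [
        r for r in records
        if r.source == source
        # "AND/OR" in the criteria is read here as a logical OR
        and (r.species not in other_species or r.area not in other_areas)
    ]
```

The resulting subset could then be tagged or grouped for approval in one step instead of record by record.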

Feedback

It can be hard to find a balance between accuracy and keeping to rules-based sorting. It's desirable to minimize the number of exceptions that have to be made, while at the same time undesirable to have to re-define standards or add new variables. I'll put this out as a suggestion that tries to find a balance, using that Sand eel SA 5 stock we discussed as an example, which has the RAM areaid "multinational-ICES-SA5":

1) in cases where a stock's area of distribution is considered to be non-standard, a suffix (something like "nonstd", "ns", or "subarea") is added to the GRSF area code. This would be a one-time, manual change based on the results from the approval comparisons we've already done. For example, the area code for this stock would temporarily change from "fao:21.4.a" to "fao:21.4.a_nonstd_".

2) if the string "nonstd" is detected in a GRSF area code:

  a) the FIRMS record drops to lowest precedence in the priority ranking for this stock

  b) the area code of the record of next-highest precedence (either RAM or FishSource) is automatically appended to the GRSF area code. In this case, the full GRSF area code would become either "fao:21.4.a_nonstd_SA5" or "fao:21.4.a_nonstd_multinational-ICES-SA5" (depending on whether the full RAM areaid is used or only the portion after the final "-"). This string would also appear in the GRSF semantic identifier.

  c) for linking to data, the user would be brought to the page for this non-standard record (RAM or FishSource) as first priority instead of the page for the corresponding FIRMS record.

This might help to clarify that stock SA5 is still nested within FAO area 21.4.a, while also making it clear that it doesn't perfectly overlap the area and/or that other stocks (such as "fao:21.4.a_nonstd_SA7") might also share the same area 21.4.a. There are probably several ways to tweak this approach to make it more compatible with the existing database, but it's one possible approach that might help to resolve the issue.
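
To make the suggestion in 1) and 2) concrete, here is a minimal sketch of how the "nonstd" rule could work. The function names, the initial precedence order and the guard against appending twice are assumptions for illustration; nothing here reflects how the GRSF database is actually implemented.

```python
NONSTD_MARKER = "_nonstd_"


def short_label(areaid: str) -> str:
    """Return the portion of an areaid after the final "-"."""
    return areaid.rsplit("-", 1)[-1]


def apply_nonstd_rule(grsf_area_code, source_areaids, precedence, use_short_label=True):
    """Return (area_code, precedence) after applying steps 2a and 2b.

    `source_areaids` maps a source name to its area code for this stock;
    `precedence` lists source names from highest to lowest priority.
    """
    if "nonstd" not in grsf_area_code:
        return grsf_area_code, list(precedence)
    # 2a) FIRMS drops to the lowest precedence for this stock
    ranked = [s for s in precedence if s != "FIRMS"]
    ranked += [s for s in precedence if s == "FIRMS"]
    # 2b) take the area code of the next-highest-precedence source
    donor = next((s for s in ranked if s != "FIRMS" and s in source_areaids), None)
    # Assumption: only append while the code still ends with the bare marker,
    # so that running the rule twice does not append the suffix twice.
    if donor is None or not grsf_area_code.endswith(NONSTD_MARKER):
        return grsf_area_code, ranked
    suffix = source_areaids[donor]
    if use_short_label:
        suffix = short_label(suffix)
    return grsf_area_code + suffix, ranked


code, ranked = apply_nonstd_rule(
    "fao:21.4.a_nonstd_",
    {"RAM": "multinational-ICES-SA5"},
    ["FIRMS", "RAM", "FishSource"],  # assumed default precedence
)
# code == "fao:21.4.a_nonstd_SA5"; ranked == ["RAM", "FishSource", "FIRMS"]
```

Step 2c (linking the user to the non-standard record first) would follow from the reordered precedence list and is not modelled here.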

Different topic: we have RAM version 4.48 out this week. Should we send Yannis the details for that? There are a few new stocks added compared to our previous version, which may result in more cases in which there is no corresponding stock currently in the GRSF.

Attached is my recheck of the GRSF time series data. It specifically addresses the issue in column F, "GRSF has more records than RAM." My response to the recheck is listed in column H.

Attached is the result of our check for those RAM records that are “without similar records.” The second row shows the various status, group, and tag selections that were made. The rows below show the records that should be rejected and why. The remaining records within this Group 1 check (281) can be a bulk approval.

Once corrections are made to the 69 individually described rejected records, then those records could also be accepted. Below is a summary of these rejections.

  • Two records need a 3-letter species code
  • One record needs to have “Northwest” spelled correctly
  • Sixty-one records require a capitalization correction
  • For five records, we suggest that spelling be changed to “Tyrrhenian” so that the full and short stock names are spelled consistently.

Also, would you be willing to provide us a list of all the RAM stocks that have been approved? Ideally, this list would include the 281 stocks we approve from this Group 1 check, or if possible, all 350 records once the 69 records are corrected.

As promised, attached you will find the results of the random checks for 25 of the 50 stocks that were checked for a name match between GRSF and RAM.

In addition, below are a few things we noticed that you may find helpful.

  1. I was unable to download the CSV files easily and so had to do these checks through the GRSF web browser. This made the task more time consuming. Perhaps you will want to provide the ability for users to download the CSV files.
  2. For RAM, there may be a year associated with an assessid that does not have a value. For GRSF, I noticed this is not the case: if there is no value, the year is not shown. Note that when I did year range checks, I considered the year range a match if the years and values matched up (i.e. I did not note it as an issue if RAM had a year with no value and GRSF did not show the year). Row 11 (Haddock NAFO-5Y) provides an example of this situation.
  3. I noticed that in the GRSF web browser, to do the assessid checks, I needed to search on SISIMP2016 or the year range; I was unable to search on SIMP.
  4. When I did the name check, I incorrectly listed NEPHFU6 (six) as NEPHFU8 (eight). The name check matched for both stocks. For this time series exercise, I have the stock correctly listed as NEPHFU6 (six).
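
The comparison rule described in point 2 (a RAM year with no value is not treated as a mismatch when GRSF omits that year) could be automated along these lines. This is a sketch only: it assumes each time series is available as a year-to-value mapping with `None` marking a missing value, which is a hypothetical representation and not the actual export format.

```python
def drop_missing(series: dict) -> dict:
    """Keep only the years that carry a value (None marks a missing value)."""
    return {year: value for year, value in series.items() if value is not None}


def series_match(ram_series: dict, grsf_series: dict) -> bool:
    """True when both series agree on every year that has a value."""
    return drop_missing(ram_series) == drop_missing(grsf_series)
```

With this rule, a RAM series with an empty year compares equal to a GRSF series that simply leaves that year out, while a genuinely missing or different value is still reported.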

Mike and I have made headway on checking the RAM information in the public side of GRSF. Attached are our comments for the name checks. Please note that columns A and B show GRSF data and that column G shows problems that were encountered. This list has been filtered on column I, so if you would like to see the entire list of species we checked you can select all of the data from this column. Next week we will follow this up with the results of the time series checks.

Out of 50 RAM stocks checked, we found spelling errors for 9 stocks (mainly capitalization), a partial spatial mismatch for 1 stock, and an incorrect match for 1 stock. Regarding the latter, Mike has the following comment:

One major mismatch was identified, for the RAM stock:
stockid = NEPHFU33 ; areaid = multinational-ICES-FU33 ; stocklong = Norway lobster Off Horn Reef (FU 33)
The GRSF name that was assigned to this stock was:
title = Nephrops norvegicus Southern North Sea (Division 27.4.c) Central North Sea (Division 27.4.b); shortname = Norway lobster - North Sea (Botney Gut, Off Horn Reef )
It is correct that the stock NEPHFU33 occurs in Divisions 4b,c, but other Nephrops stocks also occur in these same divisions. Presumably the Nephrops stock from Botney Gut, NEPHFU5, would have the exact same title and short name in the GRSF. Because these are distinct biological stocks, they should be distinct in the GRSF and should not have duplicate names. Information about the Nephrops management areas (FU_) could be added to the title and short name fields of the GRSF which would allow these stocks to be differentiated (for example, Division 27.4.c.FU33). The same problem will arise for other stocks that have multiple actual stocks occurring within the same statistical area of convenience, for example: other Nephrops stocks, North Sea sandeel stocks, Canadian scallop fishing areas (SFA), Canadian shrimp management areas (SMA), Canadian lobster fishing areas (LFA), Canadian salmon areas (A), and similar classifications in the U.S. This is expected to occur any time the resolution of multiple distinct stocks is smaller than the resolution of defined statistical areas. A simple solution could be to just allow for finer differentiation of those statistical areas, e.g. Division 27.4.c.FU33.
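
A check for this kind of name collision, and the suggested finer area label, might look like the sketch below. The function names and the input shape (pairs of stock id and GRSF title) are assumptions for illustration, and the titles in the usage lines are placeholders rather than real GRSF titles.

```python
from collections import defaultdict


def duplicate_titles(records):
    """Group stock ids by GRSF title and return titles shared by 2+ stocks.

    `records` is an iterable of (stockid, title) pairs.
    """
    by_title = defaultdict(list)
    for stockid, title in records:
        by_title[title].append(stockid)
    return {title: ids for title, ids in by_title.items() if len(ids) > 1}


def with_management_unit(area_code: str, unit: str) -> str:
    """Append a finer unit label to a statistical area code."""
    return f"{area_code}.{unit}"


# Placeholder titles; NEPHFU5 sharing a title with NEPHFU33 is the presumed case above.
clashes = duplicate_titles([
    ("NEPHFU33", "same title"),
    ("NEPHFU5", "same title"),
    ("OTHER", "another title"),
])
# clashes == {"same title": ["NEPHFU33", "NEPHFU5"]}
label = with_management_unit("27.4.c", "FU33")
# label == "27.4.c.FU33"
```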

Validating and approving FishSource records

Validation Steps for the GRSF VRE:

A. Assessment Unit/Marine Resource

  1. verify proper matching of GRSF Stock record within FishSource db (identity of the assessment unit - species + area codes)
  2. verify time series are properly retrieved (data attached to the GRSF record)

B. Criteria for bulk approval of sub-sets for GRSF Admin VRE:

  • Select Fishing Unit with Traceability Flag = true (Group: GRSF Traceability Flag)
  • Identify species not covered by the other sources (FIRMS, RAM) AND/OR
  • Identify areas not covered by the other sources (e.g. FIRMS does not have much on national waters, southwest Atlantic, Pacific)
  • Identify TAGs, GROUPs and/or combinations of them for bulk approvals (e.g. all GRSF records originated uniquely by FishSource for this species in this area code/group of area codes)

C. Fishing Unit

  1. verify proper matching of GRSF Stock record within FishSource db (identity of the fishing unit: species + area(s) + management entity + flag state + gear)
  2. verify/activate flag for traceability if all fields are filled and compliant with GRSF standards
  3. connect fishing units with assessment unit when applicable
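
Step 2 above can be sketched as a simple completeness check over the five identity fields of a fishing unit. The field names below are hypothetical, and the check only tests that each field is filled; compliance with GRSF standards (valid codes, accepted coding systems) would need a separate validation that is not modelled here.

```python
# Hypothetical field names for the five identity components of a fishing unit
REQUIRED_FIELDS = ("species", "areas", "management_entity", "flag_state", "gear")


def missing_fields(fishing_unit: dict) -> list:
    """List the identity fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not fishing_unit.get(name)]


def traceability_flag(fishing_unit: dict) -> bool:
    """True only when every identity field is filled."""
    return not missing_fields(fishing_unit)
```

Reporting the missing fields alongside the flag makes it easier to tell a data provider what has to be completed before the flag can be activated.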

Regarding Fishery names, these can be edited through the Management Panel. As per GRSF requirements, FIRMS naming conventions should be considered; see e.g. "Annex 1.4 - Fisheries naming conventions", page 39: http://www.fao.org/3/a-ax530e.pdf
However, the system should already have built the English names from the standard codes, as it does for stock names. This has to be checked with FORTH.

Discussions

Merging RAM records with FIRMS records - issues on assessment units definitions and area codes

From: Michael Melnychuk <mmel@u.washington.edu>

Sent: 04 November 2019 8:00 PM
To: Gentile, Aureliano (FIAS) Aureliano.Gentile@fao.org; Charmane E Ashbrook charmane@uw.edu
Cc: VanNiekerk, Bracken (FIAS) Bracken.VanNiekerk@fao.org; Taconet, Marc (FIAS) Marc.Taconet@fao.org; Ellenbroek, Anton (FIAS) Anton.Ellenbroek@fao.org
Subject: Re: Mapping RAM area IDs

hi Aureliano,

I would actually suggest that we do not generalize to rules such as "the RAM area code Multinational-ISC-NPAC is equivalent to the FIRMS code NPO North Pacific Ocean". For this albacore stock, that is indeed the case, so merging these two records should be OK. But for other stocks in RAM, possibly including tuna stocks of other species, the label "NPAC" or the areaid "Multinational-ISC-NPAC" will not necessarily represent the same area of distribution as it does for this albacore stock.

Here is a hypothetical example:

  • This albacore assessment covers those areas 1-5 below, and the assessment calls this "Albacore in the North Pacific Ocean", so we adopt "NPAC" in our areaid.
  • A stock of a different tuna species might cover only areas 3 and 5 below (maybe there are other "central" stocks that cover tropical waters), and the assessment might also call this "...North Pacific Ocean". In this case we would also adopt "NPAC" as the areaid. In both cases we would use "NPAC" as the label, and the areaid would be "Multinational-ISC-NPAC", but these would represent different areas. Similar examples occur off the East Coast of the U.S. For some species, assessments for "... Atlantic coast" cover a set of states along the coast, but for other species, assessments for "... Atlantic coast" are specific to a different set of states. In RAM, we would use something like "ATLC" as the label even though this means different things for different assessed species.

We do not intend our areaid codes to be any kind of a standard coding system; they just reflect the general areas stated in the title or description of the stock assessment documents. If two assessments cover different actual areas but refer to them with the same description such as "North Pacific" or "Atlantic coast", that inconsistency propagates into RAM. For this reason, I think we should assume that a common areaid label in RAM does not necessarily mean the same thing for 2+ different stocks (different species). I don't think this is a problem, though, as all the manual inspections of stock areas that we have done for suggested merging with other records have focused on the actual areas of distribution as described in stock assessments; they don't just rely on the areaid codes.

I agree that it is probably the case that for any given stock, the area of distribution might change somewhat from assessment to assessment, but the core area will generally remain constant. (I think this often occurs in ICES assessments.) I also agree that if a stock assessment area changed substantially from one assessment to the next, it would receive a different areaid label. But even without any such substantial changes, there can still be cases where stocks of 2+ different species have different areas of distribution but share the same areaid in RAM because that's how they were (generally) described in the title of the stock assessment document.

Mike

On 2019-11-02 2:32 AM, Gentile, Aureliano (FIAS) wrote:

Thanks Mike,

I think the two records* can be merged, with the understanding that the RAM area code Multinational-ISC-NPAC is equivalent to the FIRMS code NPO North Pacific Ocean (IATTC coding system, Pacific Tuna and Tuna-like Reporting areas).

Yes, I think it is correct to say that our area codes stem from the data reports (i.e. IATTC) and not from the original assessment document, which I would say belongs to a previous step in the reporting flow under the IATTC work.

In this discussion I would like to highlight the difference between the area definitions, which can change from one assessment to another, and the overall identification of the underlying assessment unit, which probably does not change regardless of the (minor) variations of area borders in different assessments.
In other words, in the next assessment (year) the area code MULTINATIONAL-ISC-NPAC might be bigger or smaller, but in any case it would still map to the same GRSF assessment unit record Albacore - Northern Pacific.
If the assessment area becomes so different that it refers to another assessment unit, then the RAM area code should change as well.
Do you concur on that?

Let me copy our colleagues for their info and for any further opinions.

Best,
Aureliano

*The two records to be merged:
FIRMS
Short Name: Albacore - Northern Pacific
GRSF Semantic identifier: asfis:ALB+pac_tun:NPO
Record URL: http://data.d4science.org/ctlg/GRSF_Admin/74e5a6ca-6a25-383d-ba0e-3953c010f717

to be merged with

RAM
Short Name: Albacore tuna North Pacific
GRSF Semantic identifier: asfis:ALB+unk:MULTINATIONAL-ISC-NPAC
Record URL: http://data.d4science.org/ctlg/GRSF_Admin/7917aade-3c83-3215-9809-541ce35e57d7

provided that this RAM area code Multinational-ISC-NPAC is equivalent to NPO North Pacific Ocean (IATTC coding system, Pacific Tuna and Tuna-like Reporting areas).

From: Michael Melnychuk <mmel@u.washington.edu>

Sent: 30 October 2019 6:38 PM
To: Gentile, Aureliano (FIAS) Aureliano.Gentile@fao.org; Charmane E Ashbrook charmane@uw.edu
Cc: VanNiekerk, Bracken (FIAS) Bracken.VanNiekerk@fao.org
Subject: Re: Mapping RAM area IDs

hi Aureliano,

Here is the area for the assessed stock:

It's not an exact overlap with the area in FIRMS, but it's very close: the area in FIRMS includes a bit more of the eastern Pacific off central America, a bit more of the western Pacific off southeast Asia, and a bit more of the northeast Pacific off Alaska.

Would you say that these are close enough in order to approve a merge? (Charmane can correct me if I'm wrong, but I suspect that during the round of approvals, for cases like this we would have said these are close enough and approved a merge.)

Just a minor point - I think the link you gave below is actually to a data report, not to the stock assessment (i.e. that area in FIRMS might not come from the source you cited). The 2017 assessment is available from either of these sites:
http://isc.fra.go.jp/reports/stock_assessments.html
https://www.wcpfc.int/node/29522

In general, I would say that if you know that the area in a record in FIRMS comes from a stock assessment definition, then it will probably align with the corresponding area in RAM, because all areas in RAM come straight out of assessments.

Mike

On 2019-10-30 9:45 AM, Gentile, Aureliano (FIAS) wrote:

Thank you Mike,

The FIRMS report on Albacore - Northern Pacific is provided by IATTC http://firms.fao.org/firms/resource/5/en , the area reference is "NPO North Pacific Ocean" (see map in the fact sheet or attached) and indeed I think that is the spatial reference for the assessment. At the bottom of the fact sheet you will find also the source of information:
Inter-American Tropical Tuna Commission (IATTC). "Tunas and billfishes in the eastern Pacific Ocean in 2017. Inter-American Tropical Tuna Commission." Fishery Status Report. IATTC 2018. https://www.iattc.org/PDFFiles/FisheryStatusReports/_English/No-16-2018_Tunas%20billfishes%20and%20other%20pelagic%20species%20in%20the%20eastern%20Pacific%20Ocean%20in%202017.pdf

So, I understand the other "NPAC" labels cannot be the same area, but likely for albacore we are referring to the same extent. What do you think?

Thanks
Aureliano

From: Michael Melnychuk <mmel@u.washington.edu>

Sent: 30 October 2019 4:35 PM
To: Gentile, Aureliano (FIAS) Aureliano.Gentile@fao.org; Charmane E Ashbrook charmane@uw.edu
Cc: VanNiekerk, Bracken (FIAS) Bracken.VanNiekerk@fao.org
Subject: Re: Mapping RAM area IDs

hi Aureliano and Bracken,

Unfortunately those 'NPAC' labels imply different areas in each of those different countries/regions, and the stocks within each of these regions that use these labels probably also differ in their areas of distribution. I haven't checked, but I would guess that the bounding boxes vary among the stocks and among the regions that use these 'NPAC' labels, reflecting the differences in defined stock distribution areas.

It might be the case that for albacore, the stock's distribution area aligns 1:1 with the two North Pacific FAO areas, but this is by no means certain. It may also include the two central Pacific FAO areas or, even more likely, the stock's defined area of distribution may not correspond with FAO area boundaries. Even within tuna reporting areas, stocks of two different tuna species may have different areas of distribution even though they are both called "NPO" by IATTC and thus have 'NPAC' as their areaid in RAM. (Our original intention with the areaid codes was not to align with FAO areas, but rather just to reflect the general area in which the stock is located; the distributions of stocks in the same general area are often different.)

I'm not sure what the FIRMS albacore stock represents in terms of spatial definitions. If that is based on the stock assessment, then it's probably true that these records can be merged, with the caveat that spatial stock definitions can change from one assessment to the next, so that would need to be verified before merging. If the FIRMS stock instead aligns with FAO areas in the North Pacific, then it shouldn't be merged with the RAM stock because their areas probably won't align.

hope that helps,

Mike

On 2019-10-30 7:51 AM, Gentile, Aureliano (FIAS) wrote:

Dear Charmane and Mike,

We are revising some FIRMS records and we noticed some potential merges.

For example:

FIRMS
Short Name: Albacore - Northern Pacific
GRSF Semantic identifier: asfis:ALB+pac_tun:NPO
Record URL: http://data.d4science.org/ctlg/GRSF_Admin/74e5a6ca-6a25-383d-ba0e-3953c010f717

to be merged with

RAM
Short Name: Albacore tuna North Pacific
GRSF Semantic identifier: asfis:ALB+unk:MULTINATIONAL-ISC-NPAC
Record URL: http://data.d4science.org/ctlg/GRSF_Admin/7917aade-3c83-3215-9809-541ce35e57d7

provided that this RAM area code Multinational-ISC-NPAC is equivalent to NPO North Pacific Ocean (IATTC coding system, Pacific Tuna and Tuna-like Reporting areas).

If this is correct, would the same logic also apply to the following codes?

multinational-IPHC-NPAC
USA-NMFS-NPAC
Japan-FAJ-NPAC

Thanks in advance for your feedback.

Best,
Bracken & Aureliano

Resources

GRSF VREs
