Support #238

closed

The nagios check 'gcube search' often goes into a critical state

Added by Andrea Dell'Amico almost 10 years ago. Updated almost 10 years ago.

Status: Rejected
Priority: Normal
Category: System Application
Start date: Jun 08, 2015
Due date:
% Done: 100%
Estimated time:
Infrastructure: Production

Description

The "Gcube search" nagios check on portal.i-marine.d4science.org is often in a critical state. The script source code is:

FILE_TMP=/tmp/xml-aquamaps

# wget http://$1/images/report/xml -O $FILE_TMP > /dev/null 2>&1 # service.d4science.org
wget "http://portal.i-marine.d4science.org/aslHttpInformationRetrieval/GenericSearch?responseType=xml&searchTerms=%22tuna%22&allFields=false&count=1" -O $FILE_TMP > /dev/null 2>&1

nimages=`xmllint --xpath '//speciesCount/text()' $FILE_TMP`

rm $FILE_TMP

if [ "$nimages" -gt "0" ]
       then
               echo "Images cached : $nimages"
               exit $STATE_OK
       else
               echo "0 Images cached!"
               exit $STATE_CRITICAL
fi

There was a proposal to dismiss the discovery service, but it is still active. If the service cannot be dismissed, a fix is needed.
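
A possible starting point for such a fix, as a minimal sketch rather than the script actually deployed: validate the value extracted by xmllint before comparing it, so that an empty or malformed response is reported separately from a genuine count of zero. The helper name evaluate_count and the message texts are assumptions for illustration; the numeric states are the standard Nagios plugin exit codes.

```shell
#!/bin/sh
# Standard Nagios plugin exit codes.
STATE_OK=0
STATE_CRITICAL=2
STATE_UNKNOWN=3

# evaluate_count VALUE  (hypothetical helper, not part of the original script)
# Prints a Nagios-style message and returns the matching state.
evaluate_count() {
    case $1 in
        ''|*[!0-9]*)
            # Empty or non-numeric: the response could not be parsed.
            echo "UNKNOWN - no numeric count in the response"
            return $STATE_UNKNOWN
            ;;
    esac
    if [ "$1" -gt 0 ]; then
        echo "OK - count: $1"
        return $STATE_OK
    fi
    echo "CRITICAL - count is 0"
    return $STATE_CRITICAL
}
```

In the check, the final if/else block could then be replaced by: evaluate_count "$nimages"; exit $?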


Files

check_search.sh (571 Bytes): nagios check for search on i-marine. Andrea Dell'Amico, Jun 10, 2015 12:58 PM
check_images.sh (433 Bytes): nagios check on service.d4science.org. Andrea Dell'Amico, Jun 10, 2015 12:58 PM
gcube_search_trends.png (8.55 KB): nagios trends of the last 30 days. Andrea Dell'Amico, Jun 10, 2015 04:06 PM
Actions #1

Updated by Massimiliano Assante almost 10 years ago

I think this check is messy: it seems to mix the check on the Images Servlet serving the iOS and Android app AppliFish (in the first part) with the gCube Search check, where you indicate the URL portal.i-marine....

I'll try to explain in the following:

hope this helps

Actions #2

Updated by Massimiliano Assante almost 10 years ago

  • Assignee changed from Massimiliano Assante to Andrea Dell'Amico

If my analysis is right, you may need to correct the Nagios check.

Actions #3

Updated by Andrea Dell'Amico almost 10 years ago

Sorry, I just noticed that the formatting dropped part of the script. The line

wget http://$1/images/report/xml -O $FILE_TMP > /dev/null 2>&1

is only used in the check dedicated to service.d4science.org.
The i-marine check uses the second URL only.

Actions #4

Updated by Andrea Dell'Amico almost 10 years ago

I attach the two scripts to clarify.

Actions #5

Updated by Massimiliano Assante almost 10 years ago

Okay, clear now.
Is the second check (search) failing now? It should not be, because if I click on http://portal.i-marine.d4science.org/aslHttpInformationRetrieval/GenericSearch?responseType=xml&searchTerms=%22tuna%22&allFields=false&count=1 I get an answer in about 10 seconds.

If it is failing now, then there is something wrong in the search check. If it is not failing, then I'm not sure what we can do, as the problem lies in the QoS of the search implemented by NKUOA.

Actions #6

Updated by Andrea Dell'Amico almost 10 years ago

It has been working correctly since the start of the month, but it flipped a lot in May.

When it fails it's always because the request times out, and the actual timeout is 240 seconds.
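
To make that failure mode visible in the Nagios output, the request could be wrapped in the coreutils timeout command, which exits with status 124 when the time limit is hit. This is only a sketch: run_limited is a hypothetical helper, mapping every other failure to UNKNOWN is an assumption, and GNU coreutils is assumed to be installed on the Nagios host.

```shell
#!/bin/sh
# Standard Nagios plugin exit codes.
STATE_OK=0
STATE_CRITICAL=2
STATE_UNKNOWN=3

# run_limited SECONDS COMMAND...  (hypothetical helper)
# Runs COMMAND under coreutils `timeout`, prints a Nagios-style message
# and returns the matching Nagios state.
run_limited() {
    limit=$1
    shift
    rc=0
    timeout "$limit" "$@" > /dev/null 2>&1 || rc=$?
    case $rc in
        0)
            echo "OK - request completed within ${limit}s"
            return $STATE_OK
            ;;
        124)
            # 124 is what `timeout` returns when the limit was reached.
            echo "CRITICAL - request timed out after ${limit}s"
            return $STATE_CRITICAL
            ;;
        *)
            echo "UNKNOWN - request failed with exit status $rc"
            return $STATE_UNKNOWN
            ;;
    esac
}
```

With the 240 second limit mentioned above, the download step would become something like: run_limited 240 wget -q -O "$FILE_TMP" "$URL", where URL holds the search URL used by the check.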

Actions #7

Updated by Massimiliano Assante almost 10 years ago

Then, as far as I'm concerned, the ticket can be closed.

Actions #8

Updated by Andrea Dell'Amico almost 10 years ago

  • Status changed from New to Rejected