Task #850
Closed
Investigate a new way to check the smartgears container with nagios
100%
Description
At the moment we have a simple nagios check on the tomcat port, but this check is often not enough.
If a container, for some reason, is no longer registered on the infrastructure for some time, the nagios check on the tomcat port doesn't detect the problem.
A possible way to enhance the nagios check could be the following:
Every Smartgears node has an enabling service: Whn-Manager. This service could be queried via HTTP to verify the container status.
For example, this URL (related to node2-d-d4s.d4science.org): http://node2-d-d4s.d4science.org:8080/whn-manager/gcube/resource/ responds with a "resource is active" string.
What happens if this container, for some reason, is no longer registered to the Infrastructure? What answer will this URL provide?
If the answer is "the resource is not active", we have found a more specific nagios check.
Is there a way to check this behavior?
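The proposed check could be sketched as a small nagios plugin like the one below. This is only a minimal sketch, not an existing plugin: the function names and messages are hypothetical, and it assumes that a healthy node answers with HTTP 200 and a body containing the "resource is active" string quoted above. How an unregistered container actually answers is still the open question of this ticket, so anything else is simply treated as CRITICAL.

```python
"""Sketch of a nagios-style check for the whn-manager resource URL."""
import sys
import urllib.error
import urllib.request

# Standard nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# String quoted in the ticket; the exact response body is an assumption.
EXPECTED = "resource is active"


def evaluate(status, body):
    """Map an HTTP status and response body to (exit_code, message)."""
    if status != 200:
        return CRITICAL, f"CRITICAL - HTTP {status}"
    if EXPECTED in body:
        return OK, "OK - resource is active"
    return CRITICAL, "CRITICAL - expected string not found in response"


def check(url, timeout=10):
    """Fetch the URL and evaluate the response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", "replace")
            return evaluate(resp.status, body)
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return evaluate(exc.code, "")
    except (urllib.error.URLError, OSError) as exc:
        # Connection refused, DNS failure, timeout, and similar.
        return CRITICAL, f"CRITICAL - cannot reach {url}: {exc}"


def main(argv):
    """Print a one-line status message and return the nagios exit code."""
    if len(argv) != 2:
        print("UNKNOWN - usage: check_whn_manager.py URL")
        return UNKNOWN
    code, message = check(argv[1])
    print(message)
    return code

# To use this as a plugin, end the script with: sys.exit(main(sys.argv))
```

Called with http://node2-d-d4s.d4science.org:8080/whn-manager/gcube/resource/ as its only argument, such a script would report OK only when the expected string is present, which is stricter than checking that the tomcat port is open.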
Updated by Roberto Cirillo over 9 years ago
- Tracker changed from Support to Task
- Start date changed from Oct 01, 2015 to Apr 01, 2016
Updated by Roberto Cirillo about 9 years ago
- Target version changed from System Configuration to improve nagios checks
Updated by Roberto Cirillo about 9 years ago
- Related to Task #3140: Improve nagios checks for gCore container added
Updated by Roberto Cirillo about 9 years ago
When the container is down, this URL returns an error. So I think this URL would be a better check than the standard check on the port.
In my opinion, we could replace the standard check with this one.
What do you think about this, @lucio.lelii@isti.cnr.it, @andrea.dellamico@isti.cnr.it?
Updated by Lucio Lelii about 9 years ago
I agree with you, this is the right way to check the smartgears container.
Updated by Andrea Dell'Amico about 9 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
- Infrastructure Pre-Production, Production added
Updated by Andrea Dell'Amico about 9 years ago
- Related to Task #3157: Improve the nagios check for the Smartgears (not the smart executor ones) nodes added
Updated by Andrea Dell'Amico about 9 years ago
- Status changed from Closed to In Progress
@lucio.lelii@isti.cnr.it Can you tell me when the /whn-manager/gcube/resource/ URL was added to the whn-manager? There are some smartgears installations that fail the nagios check. Most of them are Greek VMs that I cannot access. But on node31.p.d4science.research-infrastructures.eu and node13.p.d4science.research-infrastructures.eu, where the check is also failing, the whn-manager version is 1.0.0-3.1.0, installed in April 2014, so two years old.
Updated by Andrea Dell'Amico about 9 years ago
- Status changed from In Progress to Closed
Never mind. The war installation name was the problem, see #3159