Task #8103
closed
a nagios script must be scheduled every 24 h to add a list of resources to all scopes
100%
Updated by Lucio Lelii about 8 years ago
- Subject changed from a nagios script must be sheduled every 24 h to add a list of resources to all scopes to a nagios script must be scheduled every 24 h to add a list of resources to all scopes
Updated by Andrea Dell'Amico about 8 years ago
- Assignee changed from Andrea Dell'Amico to _InfraScience Systems Engineer
Can you give us less information? With so many details we lose the joy of discovering new ways to mess up our systems.
Updated by Pasquale Pagano about 8 years ago
- Priority changed from Low to Urgent
Updated by Andrea Dell'Amico about 8 years ago
Why a nagios script anyway? Is it something that can be used as a check too?
Updated by Pasquale Pagano about 8 years ago
Nagios was my suggestion for the following reasons:
- it notifies us when something happens
- it keeps track of the history, so we can understand how frequently we are losing resources
- it makes it easy to schedule a check
Clearly, any other solution can be chosen.
Updated by Lucio Lelii about 8 years ago
But this script only adds all the scopes to the selected resources. The check for scope loss in a resource should be done with a smartexecutor plugin, as reported in ticket #7800.
Updated by Andrea Dell'Amico about 8 years ago
And is the plugin meant to run the script that restores the scopes too? Or do we run it as a side effect of the nagios check?
From what I read on the other related tasks, a nagios check could run on the host where the smart executor plugin lives. The plugin could write its results into a file (list of missing scopes, if any. 'none' if it's all OK). The nagios check will read that file, report the result and, if there are missing scopes, run the script that fixes them. So the scopes would be added again in a matter of minutes and we will have a nagios report.
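The flow proposed above (the plugin writes a status file, the nagios check reads it, reports, and triggers the fix) could look roughly like the sketch below. It is an illustration in Python only, not the check that was eventually deployed: the whitespace-separated file format, the function names and the `fix_command` argument are all assumptions, and the fix command is only assembled, never executed.

```python
from pathlib import Path

# Standard Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def read_missing(status_file):
    """Return the missing entries listed in the file; [] when it says 'none'.

    Assumed format (not specified in this ticket): either the single word
    'none' or a whitespace-separated list of entries.
    """
    text = Path(status_file).read_text().strip()
    if not text:
        raise ValueError("status file is empty")
    if text == "none":
        return []
    return text.split()


def plan_check(status_file, fix_command):
    """Return (exit_code, message, command); command is None if nothing to fix.

    fix_command is a hypothetical argv prefix for the script that restores
    the scopes; the missing entries are appended to it as arguments.
    """
    try:
        missing = read_missing(status_file)
    except (OSError, ValueError) as exc:
        # Unreadable or empty file: report UNKNOWN rather than guessing.
        return UNKNOWN, f"UNKNOWN - {exc}", None
    if not missing:
        return OK, "OK - no missing scopes", None
    message = "CRITICAL - missing: " + " ".join(missing)
    return CRITICAL, message, list(fix_command) + missing
```

A wrapper would print the message, run the returned command (for example with `subprocess.run`) when it is not `None`, and exit with the returned code.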
Updated by Costantino Perciante about 8 years ago
@lucio.lelii@isti.cnr.it told me that his script just needs the list of missing ids (no matter where the resources were missing, because the script adds them everywhere again).
@andrea.dellamico@isti.cnr.it Please, just let me know the path in which nagios will look for that file
Updated by Andrea Dell'Amico about 8 years ago
We need a place writeable by the gcube user. Anything under /home/gcube is fine with me, better a subdirectory. /home/gcube/scopes_data/scopes_status maybe, where scopes_status is the filename?
Updated by Costantino Perciante about 8 years ago
Andrea Dell'Amico wrote:
We need a place writeable by the gcube user. Anything under /home/gcube is fine with me, better a subdirectory. /home/gcube/scopes_data/scopes_status maybe, where scopes_status is the filename?
Since we are dealing with identifiers of resources, I would say that /home/gcube/missing_resources/identifiers is ok too, isn't it? ("identifiers" is the file name)
Moreover, is there a place in dev in which we can test both the smart-executor plugin and the script?
Updated by Andrea Dell'Amico about 8 years ago
Costantino Perciante wrote:
Andrea Dell'Amico wrote:
We need a place writeable by the gcube user. Anything under /home/gcube is fine with me, better a subdirectory. /home/gcube/scopes_data/scopes_status maybe, where scopes_status is the filename?
Since we are dealing with identifiers of resources, I would say that /home/gcube/missing_resources/identifiers is ok too, isn't it? ("identifiers" is the file name)
Yes, no problem.
Moreover, is there a place in dev in which we can test both the smart-executor plugin and the script?
It's a question for @lucio.lelii@isti.cnr.it I guess.
Updated by Lucio Lelii about 8 years ago
- File AddResourcesToAllScopes.java added
I have just attached the script to add all scopes to the selected resource ids.
Run it with the command:
java AddResourcesToAllScopes id1 id2 ... idn
with the smartgears classpath.
Updated by Pasquale Pagano about 8 years ago
- Priority changed from Urgent to Immediate
We had another issue in production and another ticket from the user. It is essential to implement this workaround now.
Updated by Andrea Dell'Amico about 8 years ago
Lucio Lelii wrote:
I have just attached the script to add all scopes to the selected resource ids.
Run it with the command:
java AddResourcesToAllScopes id1 id2 ... idn
with the smartgears classpath.
So I need to run it on a smartgears node, the one that collects the missing IDs. I still don't know which node that is.
The java source should live on subversion, btw.
Updated by Costantino Perciante about 8 years ago
The smart executor plugin will run every hour, starting from now, on the node resource-checker-d-d4s.d4science.org. It will write a file with the missing resources' identifiers at /home/gcube/missing_resources/identifiers (it will contain "none" if nothing is wrong). Please carry out the remaining steps so that Lucio's code can work properly.
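Since a monitoring check will read this file while the plugin rewrites it every hour, it may help to replace the file atomically so the reader never sees a half-written list. The sketch below shows one way to do that in Python; the real plugin is not written in Python, and the one-identifier-per-line layout, the `write_status` name and the file permissions are assumptions made for illustration.

```python
import os
import tempfile


def write_status(status_file, missing_ids):
    """Write the identifiers (one per line, assumed layout) or 'none'.

    The content goes to a temporary file in the same directory first and is
    then moved into place, so a concurrent reader sees either the old file
    or the new one, never a partial write.
    """
    body = "\n".join(missing_ids) if missing_ids else "none"
    directory = os.path.dirname(status_file) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".identifiers-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(body + "\n")
        # mkstemp creates the file with mode 0600; relax it so that a check
        # running as a different user can read it (assumption).
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, status_file)  # atomic on POSIX, same filesystem
    except BaseException:
        os.unlink(tmp_path)
        raise
```

For example, `write_status("/home/gcube/missing_resources/identifiers", [])` would leave a file containing only `none`.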
Updated by Andrea Dell'Amico about 8 years ago
@lucio.lelii@isti.cnr.it the java source has a hard-coded list of production VOs, so I cannot use it to test in dev. Can you provide a binary, btw?
Updated by Lucio Lelii about 8 years ago
Yes, the list of VOs is hard-coded for production. I can change the script in one of two ways:
- the script takes the list of VOs as an argument
- the script has a fixed list of production VOs and a fixed list of development VOs, and you select the environment by passing a special argument
Tell me which one you prefer.
Updated by Costantino Perciante about 8 years ago
What if we use the smart-executor plugin for this task too?
I mean, no other external file/scripts/whatever, just the plugin.
Updated by Andrea Dell'Amico about 8 years ago
Costantino Perciante wrote:
What if we use the smart-executor plugin for this task too?
I mean, no other external file/scripts/whatever, just the plugin.
It sounds better to me. So the nagios check should report the status change only. Correct?
Updated by Costantino Perciante about 8 years ago
Andrea Dell'Amico wrote:
Costantino Perciante wrote:
What if we use the smart-executor plugin for this task too?
I mean, no other external file/scripts/whatever, just the plugin.
It sounds better to me. So the nagios check should report the status change only. Correct?
ok!
@lucio.lelii@isti.cnr.it the plugin already evaluates the list of VOs (taking into account the infrastructure in which it is running, of course).
I guess I can easily import your code into the resource-checker plugin.
Updated by Costantino Perciante about 8 years ago
The updated script is running on that node. It also takes care of re-adding any missing resource to a context.
Updated by Andrea Dell'Amico almost 8 years ago
- Blocked by VM Creation #8486: Provide a new SmartGears node for resource-checker plugin added
Updated by Andrea Dell'Amico almost 8 years ago
- File deleted (AddResourcesToAllScopes.java)
Updated by Pasquale Pagano almost 8 years ago
This activity was urgent one month ago. Please try to complete it asap.
Updated by Andrea Dell'Amico almost 8 years ago
Pasquale Pagano wrote:
This activity was urgent one month ago. Please try to complete it asap.
As soon as the production service works.
Updated by Andrea Dell'Amico almost 8 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
- Infrastructure deleted (Development)
Done. The check goes to CRITICAL if the /home/gcube/missing_resources/identifiers file contains anything other than the string none.
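The rule stated in the closing comment can be sketched as a small Nagios-style check in Python. This is not the deployed check, whose source is not attached to this ticket: the function names, the message wording and the choice to report UNKNOWN when the file cannot be read are assumptions; only the file path and the "anything other than none is CRITICAL" rule come from the thread.

```python
from pathlib import Path

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Path agreed in this ticket.
STATUS_FILE = "/home/gcube/missing_resources/identifiers"


def check_identifiers(status_file=STATUS_FILE):
    """Return (exit_code, message) for the status file written by the plugin."""
    try:
        content = Path(status_file).read_text().strip()
    except OSError as exc:
        # Assumption: an unreadable file is reported as UNKNOWN.
        return UNKNOWN, f"UNKNOWN - cannot read {status_file}: {exc}"
    if content == "none":
        return OK, "OK - no missing resources"
    # Anything else, including an empty file, is treated as CRITICAL.
    summary = " ".join(content.split()) or "<empty>"
    return CRITICAL, f"CRITICAL - {status_file} contains: {summary}"


def main():
    """Print the one-line status Nagios expects and return the exit code."""
    code, message = check_identifiers()
    print(message)
    return code
```

To use it as a plugin, the script would end by exiting with the value returned by `main()`, for instance `raise SystemExit(main())`.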