Task #169
closed
Verification of the smartgears nodes list
100%
Description
Ganglia is showing more smartgears workers than expected: http://monitoring.research-infrastructures.eu/ganglia/?c=D4science%20Smartgears%20cluster&m=load_one&r=day&s=by%20name&hc=4&mc=2
Gianpaolo says that we are listing as worker nodes some hosts that aren't workers, even though they are part of the smartgears cluster.
I think we need to verify the ghn hosts list that lives in the ansible-playbooks git repo, here: d4science-ghn-cluster/inventory/hosts.production.
Note that the new smartgears nodes are listed in a separate file: d4science-ghn-cluster/inventory/hosts.smartgears. The two lists will be merged once all the hosts are completely provisioned via ansible.
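To make the check concrete, here is a minimal Python sketch of how the two inventory files could be read and merged per group for review. It assumes both files use the plain INI-style Ansible inventory format; the function names and every example.org host are hypothetical, and the sample contents below are illustrative only, not the real inventory.

```python
def parse_inventory(text):
    """Return {group: set(hosts)} for a simple INI-style inventory.

    Sketch only: [group:vars] and [group:children] sections are skipped,
    and host ranges or ports are not expanded.
    """
    groups = {}
    current = "ungrouped"
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            if ":" not in current:
                # register the group even if it turns out to be empty
                groups.setdefault(current, set())
            continue
        if ":" in current:
            continue  # body of a :vars or :children section
        # the first token on a host line is the host name
        groups.setdefault(current, set()).add(line.split()[0])
    return groups


def merge_inventories(*inventories):
    """Union the hosts of each group across several parsed inventories."""
    merged = {}
    for inventory in inventories:
        for group, hosts in inventory.items():
            merged.setdefault(group, set()).update(hosts)
    return merged


# Illustrative contents only; the example.org hosts are hypothetical.
PRODUCTION = """
[ghn_workers_prod]

[ghn_smartgears_prod_old]
old-worker1.example.org
tabulardata.d4science.org
"""

SMARTGEARS = """
[ghn_workers_prod]
new-worker1.example.org ansible_user=root  # hypothetical host and variable
"""

merged = merge_inventories(parse_inventory(PRODUCTION), parse_inventory(SMARTGEARS))
for group in sorted(merged):
    print(group, sorted(merged[group]))
```

Listing each group this way makes an empty group, or a non-worker host sitting in a workers group, easy to spot by eye.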
Updated by Roberto Cirillo almost 10 years ago
- Status changed from New to In Progress
The list of new workers defined in the "hosts.smartgears" file is correct.
But the list of old workers defined in "hosts.production" is not:
I see in ganglia all the nodes of the group "ghn_smartgears_prod_old", but not all the nodes present in this list are workers. For example, "tabulardata.d4science.org" is a smartgears-based node, but it is not a worker.
I see that the group "ghn_workers_prod" is empty in "hosts.production". Why? Maybe all the workers should be defined in this group.
In addition, Ganglia is showing workers that are not present in any lists:
node13.p.d4science.research-infrastructures.eu
node31.p.d4science.research-infrastructures.eu
node51.p.d4science.research-infrastructures.eu
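The comparison described above can be sketched in Python as two set differences: hosts that Ganglia reports but no inventory group contains, and hosts that are listed but do not belong to a workers group. The function name, the choice of "ghn_workers_prod" as the only workers group, and the example.org host are assumptions made for illustration; the real group contents live in the inventory files.

```python
def audit_ganglia_hosts(ganglia_hosts, inventory_groups, worker_groups):
    """Compare the hosts Ganglia reports with the inventory groups.

    inventory_groups maps a group name to a set of host names;
    worker_groups names the groups whose members count as workers.
    """
    reported = set(ganglia_hosts)
    known = set().union(*inventory_groups.values())
    workers = set().union(
        *(inventory_groups.get(group, set()) for group in worker_groups)
    )
    return {
        # reported by Ganglia but absent from every inventory group
        "unlisted": sorted(reported - known),
        # present in the inventory but not in any workers group
        "listed_not_worker": sorted((reported & known) - workers),
    }


# Illustrative data: group membership here is assumed, not the real inventory.
inventory_groups = {
    "ghn_workers_prod": {"new-worker1.example.org"},  # hypothetical host
    "ghn_smartgears_prod_old": {"tabulardata.d4science.org"},
}
ganglia_hosts = [
    "new-worker1.example.org",
    "tabulardata.d4science.org",
    "node13.p.d4science.research-infrastructures.eu",
    "node31.p.d4science.research-infrastructures.eu",
    "node51.p.d4science.research-infrastructures.eu",
]

report = audit_ganglia_hosts(ganglia_hosts, inventory_groups, ["ghn_workers_prod"])
for category, hosts in report.items():
    print(category, hosts)
```

The host list on the Ganglia side is passed in by hand here; how it is exported from Ganglia is left out of the sketch.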
Updated by Andrea Dell'Amico almost 10 years ago
Roberto Cirillo wrote:
The list of new workers defined in the "hosts.smartgears" file is correct.
But the list of old workers defined in "hosts.production" is not:
I see in ganglia all the nodes of the group "ghn_smartgears_prod_old", but not all the nodes present in this list are workers. For example, "tabulardata.d4science.org" is a smartgears-based node, but it is not a worker.
Ok.
I see the group "ghn_workers_prod" empty in "hosts.production". Why? Maybe all the workers have to be defined in this group.
That group is used inside hosts.smartgears and contains the ansible-provisioned hosts. The other ones are under ghn_smartgears_prod_old. When we are ready to also provision the old workers, they will all be gathered under ghn_smartgears_prod and the inventory file hosts.smartgears will disappear.
In addition, Ganglia is showing workers that are not present in any lists:
node13.p.d4science.research-infrastructures.eu
node31.p.d4science.research-infrastructures.eu
node51.p.d4science.research-infrastructures.eu
Ah. I'll remove them too.
Updated by Andrea Dell'Amico almost 10 years ago
- % Done changed from 0 to 80
tabulardata.d4science.org should appear in the right ganglia cluster soon.
I've removed ganglia from the other nodes. Is it possible that they were manually cloned from a smartgears worker node? It seems that they never appeared in the inventory hosts list.
Updated by Andrea Dell'Amico almost 10 years ago
- Status changed from In Progress to Closed
- % Done changed from 80 to 100