Task #169

Verification of the smartgears nodes list

Added by Andrea Dell'Amico about 10 years ago. Updated almost 10 years ago.

Status: Closed
Priority: Normal
Category: System Application
Target version:
Start date: May 27, 2015
Due date:
% Done: 100%
Estimated time:
Infrastructure: Production

Description

Ganglia is showing more smartgears workers than expected: http://monitoring.research-infrastructures.eu/ganglia/?c=D4science%20Smartgears%20cluster&m=load_one&r=day&s=by%20name&hc=4&mc=2

Gianpaolo says that we are listing hosts as worker nodes that are not workers, even though they are part of the smartgears cluster.

I think that we need to verify the ghn hosts list that lives in the ansible-playbooks git repo, here: d4science-ghn-cluster/inventory/hosts.production.
Note that the new smartgears nodes are listed in a separate file: d4science-ghn-cluster/inventory/hosts.smartgears. The two lists will be merged once all the hosts are fully provisioned via ansible.
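
To make the two lists easier to check, the inventories can be read into a group-to-hosts mapping. The following is a minimal sketch, assuming both files use the plain INI-style inventory layout (a `[group]` header followed by one host per line). The function name `parse_inventory` and the host `worker1.example.org` are hypothetical placeholders, and `[group:children]` / `[group:vars]` sections are simply skipped rather than resolved.

```python
def parse_inventory(text):
    """Return {group: set of hostnames} for a simple INI-style inventory.

    Assumption: one host per line, optionally followed by host variables.
    Sections such as [group:children] or [group:vars] are ignored.
    """
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith(";"):
            continue
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]
            if ":" in name:
                current = None  # skip :children and :vars sections
            else:
                current = name
                groups.setdefault(current, set())  # keep empty groups visible
            continue
        if current is not None:
            groups[current].add(line.split()[0])  # first token is the hostname
    return groups


# Placeholder content; the real files live under d4science-ghn-cluster/inventory/.
sample = """
[ghn_workers_prod]

[ghn_smartgears_prod_old]
worker1.example.org   # hypothetical host
tabulardata.d4science.org
"""

inventory = parse_inventory(sample)
for group, hosts in sorted(inventory.items()):
    print(group, sorted(hosts))
```

Keeping empty groups in the result makes a case like an empty "ghn_workers_prod" show up explicitly instead of disappearing from the output.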

Actions #1

Updated by Roberto Cirillo almost 10 years ago

  • Status changed from New to In Progress

The list of new workers defined in the "hosts.smartgears" file is correct, but the list of old workers defined in "hosts.production" is not:

I see in ganglia all the nodes of the group "ghn_smartgears_prod_old", but not all the nodes present in this list are workers. For example, "tabulardata.d4science.org" is a smartgears-based node, but it is not a worker.
The group "ghn_workers_prod" is empty in "hosts.production". Why? Maybe all the workers should be defined in this group.

In addition, Ganglia is showing workers that are not present in any list:
node13.p.d4science.research-infrastructures.eu
node31.p.d4science.research-infrastructures.eu
node51.p.d4science.research-infrastructures.eu
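
This kind of mismatch can be found with a plain set comparison. The sketch below assumes the hostnames shown by Ganglia have already been collected into a list by some other means (it does not query Ganglia itself) and that the inventory is available as a group-to-hosts mapping. The function name `audit_cluster` and the host `worker1.example.org` are hypothetical.

```python
def audit_cluster(ganglia_hosts, inventory_groups):
    """Compare the hosts reported by Ganglia with the inventory.

    Returns the hosts Ganglia shows that are in no inventory group
    ("unlisted") and the inventory hosts Ganglia does not show
    ("unreported").
    """
    listed = set().union(*inventory_groups.values())
    reported = set(ganglia_hosts)
    return {
        "unlisted": sorted(reported - listed),
        "unreported": sorted(listed - reported),
    }


# Illustrative data: worker1.example.org is a placeholder, the other names
# come from this ticket.
inventory_groups = {
    "ghn_workers_prod": {"worker1.example.org"},
    "ghn_smartgears_prod_old": {"tabulardata.d4science.org"},
}
ganglia_hosts = [
    "worker1.example.org",
    "tabulardata.d4science.org",
    "node13.p.d4science.research-infrastructures.eu",
    "node31.p.d4science.research-infrastructures.eu",
    "node51.p.d4science.research-infrastructures.eu",
]

report = audit_cluster(ganglia_hosts, inventory_groups)
print(report["unlisted"])
print(report["unreported"])
```

With this data the three nodeNN hosts end up under "unlisted". Note that the comparison only checks membership in the inventory; it cannot tell whether a listed host such as tabulardata.d4science.org is actually a worker.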

Actions #2

Updated by Andrea Dell'Amico almost 10 years ago

Roberto Cirillo wrote:

> The list of new workers defined in the "hosts.smartgears" file is correct, but the list of old workers defined in "hosts.production" is not:
>
> I see in ganglia all the nodes of the group "ghn_smartgears_prod_old", but not all the nodes present in this list are workers. For example, "tabulardata.d4science.org" is a smartgears-based node, but it is not a worker.

Ok.

> The group "ghn_workers_prod" is empty in "hosts.production". Why? Maybe all the workers should be defined in this group.

That group is used inside hosts.smartgears and contains the ansible-provisioned hosts. The other ones are under ghn_smartgears_prod_old. When we are ready to also provision the old workers, they will all be gathered under ghn_smartgears_prod and the inventory file hosts.smartgears will disappear.

> In addition, Ganglia is showing workers that are not present in any list:
> node13.p.d4science.research-infrastructures.eu
> node31.p.d4science.research-infrastructures.eu
> node51.p.d4science.research-infrastructures.eu

Ah. I'll remove them too.

Actions #3

Updated by Andrea Dell'Amico almost 10 years ago

  • % Done changed from 0 to 80

tabulardata.d4science.org should appear in the right ganglia cluster soon.

I've removed ganglia from the other nodes. Is it possible that they were manually cloned from a smartgears worker node? It seems that they never appeared in the inventory host list.

Actions #4

Updated by Andrea Dell'Amico almost 10 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 80 to 100