Task #10460

closed

Upgrade resources for worker[1-3]-hadoop-test.d4science.org

Added by Sandro La Bruzzo over 7 years ago. Updated over 7 years ago.

Status: Closed
Priority: Normal
Assignee: _InfraScience Systems Engineer
Category: -
Target version:
Start date: Nov 27, 2017
Due date:
% Done: 100%
Estimated time:
Infrastructure: Development

Description

These workers need the same hardware resources as the other workers in the cluster.


Related issues

Blocks D4Science Infrastructure - Task #10669: Reconfigure the openaire dev solr nodes - Closed - Andrea Dell'Amico - Dec 12, 2017

Actions #1

Updated by Pasquale Pagano over 7 years ago

  • Tracker changed from Support to Task
Actions #2

Updated by Andrea Dell'Amico over 7 years ago

  • Status changed from New to In Progress

The involved VMs are:

ambari-hadoop-test.d4science.org (dlib34x) ambari-hadoop.d4science.org
rm1-hadoop-test.d4science.org (dlib18x) -> dlib25x rm1-hadoop.d4science.org
rm2-hadoop-test.d4science.org (dlib20x) -> dlib34x rm2-hadoop.d4science.org
rm3-hadoop-test.d4science.org (dlib22x) rm3-hadoop.d4science.org
rm4-hadoop-test.d4science.org (dlib21x) -> dlib35x rm4-hadoop.d4science.org
worker1-hadoop-test.d4science.org (dlib25x) -> dlib32x worker1-hadoop.d4science.org
worker2-hadoop-test.d4science.org (dlib28x) -> dlib33x worker2-hadoop.d4science.org
worker3-hadoop-test.d4science.org (dlib29x) -> dlib34x worker3-hadoop.d4science.org
worker4-hadoop.d4science.org (dlib26x)
worker5-hadoop.d4science.org (dlib22x)
worker6-hadoop.d4science.org (dlib18x) -> dlib35x
worker7-hadoop.d4science.org (dlib26x)
worker8-hadoop.d4science.org (dlib20x)
worker9-hadoop.d4science.org (dlib22x)
worker10-hadoop.d4science.org (dlib23x)
worker11-hadoop.d4science.org (dlib23x)

Actions #3

Updated by Andrea Dell'Amico over 7 years ago

  • % Done changed from 0 to 60

The VMs have been renamed, moved and reinstalled where needed. All the data disk volumes have been renamed to match the VM hostnames.
A new VM, db-hadoop.d4science.org, has been created. It will host the Ambari, Oozie, Hive (and possibly some other) databases.

The Ambari VM has been reinstalled and its old SSH keys copied to the new VM. The whole HDP distribution must be installed from scratch.

Actions #4

Updated by Andrea Dell'Amico over 7 years ago

  • % Done changed from 60 to 70

The DB server has been configured.

Actions #5

Updated by Andrea Dell'Amico over 7 years ago

The whole cluster has been reconfigured and HDP 2.6.3 has been installed.
There's a problem on worker8-hadoop.d4science.org: the storage disk keeps detaching itself.

Actions #6

Updated by Andrea Dell'Amico over 7 years ago

Erasing the nodes without reinstalling them wasn't a good choice. We will reinstall them from scratch, and then reconfigure.

Actions #7

Updated by Andrea Dell'Amico over 7 years ago

Reinstallation started. The up-to-date node mapping:

ambari-hadoop.d4science.org (dlib34x)
rm1-hadoop.d4science.org (dlib25x)
rm2-hadoop.d4science.org (dlib34x)
rm3-hadoop.d4science.org (dlib22x)
rm4-hadoop.d4science.org (dlib35x)
worker1-hadoop.d4science.org (dlib32x)
worker2-hadoop.d4science.org (dlib33x)
worker3-hadoop.d4science.org (dlib34x)
worker4-hadoop.d4science.org (dlib26x)
worker5-hadoop.d4science.org (dlib22x)
worker6-hadoop.d4science.org (dlib35x)
worker7-hadoop.d4science.org (dlib26x)
worker8-hadoop.d4science.org (dlib20x)
worker9-hadoop.d4science.org (dlib28x)
worker10-hadoop.d4science.org (dlib23x)
worker11-hadoop.d4science.org (dlib23x)
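
As an illustration only (not part of the original ticket workflow), the mapping above can be grouped by hypervisor to see how the VMs are spread across the hosts. This is a minimal sketch that assumes each line has the form "hostname (hypervisor)"; the names `MAPPING` and `group_by_hypervisor` are hypothetical.

```python
import re
from collections import defaultdict

# Node mapping copied verbatim from the list above: "hostname (hypervisor)".
MAPPING = """\
ambari-hadoop.d4science.org (dlib34x)
rm1-hadoop.d4science.org (dlib25x)
rm2-hadoop.d4science.org (dlib34x)
rm3-hadoop.d4science.org (dlib22x)
rm4-hadoop.d4science.org (dlib35x)
worker1-hadoop.d4science.org (dlib32x)
worker2-hadoop.d4science.org (dlib33x)
worker3-hadoop.d4science.org (dlib34x)
worker4-hadoop.d4science.org (dlib26x)
worker5-hadoop.d4science.org (dlib22x)
worker6-hadoop.d4science.org (dlib35x)
worker7-hadoop.d4science.org (dlib26x)
worker8-hadoop.d4science.org (dlib20x)
worker9-hadoop.d4science.org (dlib28x)
worker10-hadoop.d4science.org (dlib23x)
worker11-hadoop.d4science.org (dlib23x)
"""


def group_by_hypervisor(text):
    """Return {hypervisor: [hostnames]} parsed from 'hostname (hypervisor)' lines."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = re.fullmatch(r"\s*(\S+)\s+\((\S+)\)\s*", line)
        if match:  # silently skip lines that do not follow the expected form
            host, hypervisor = match.groups()
            groups[hypervisor].append(host)
    return dict(groups)


if __name__ == "__main__":
    for hypervisor, hosts in sorted(group_by_hypervisor(MAPPING).items()):
        print(f"{hypervisor}: {len(hosts)} VM(s): {', '.join(hosts)}")
```

With the mapping as listed, dlib34x carries three VMs (ambari, rm2 and worker3), while worker8 is the only VM on dlib20x.
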
Actions #8

Updated by Andrea Dell'Amico over 7 years ago

worker8-hadoop.d4science.org still creates problems. Maybe it should be installed on a better hypervisor.

Actions #9

Updated by Andrea Dell'Amico over 7 years ago

  • % Done changed from 70 to 90

The cluster is up and running. worker8-hadoop.d4science.org has been removed for the time being.

Actions #10

Updated by Andrea Dell'Amico over 7 years ago

  • Blocks Task #10669: Reconfigure the openaire dev solr nodes added
Actions #11

Updated by Andrea Dell'Amico over 7 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 90 to 100

The cluster is working.
