Support #12737

Social-ISTI- mongoR4-si heartbeat fails from mongoR2-si

Added by Roberto Cirillo over 6 years ago. Updated over 6 years ago.

Status: Closed
Priority: High
Assignee: _InfraScience Systems Engineer
Category: System Application
Start date: Oct 19, 2018
Due date:
% Done: 100%
Estimated time:
Infrastructure: Production

Description

The MongoDB server mongoR4-si.isti.cnr.it restarts frequently because of a communication problem with the primary node, mongoR2-si.isti.cnr.it.

The error reported in the mongoR4 log is the following:

Wed Oct 17 11:48:02.981 [conn6735] end connection 146.48.85.144:36635 (3 connections now open)
Wed Oct 17 11:48:03.110 [rsBackgroundSync] replSet sync source problem: 10278 dbclient error communicating with server: mongoR2-si.isti.cnr.it:27017
Wed Oct 17 11:48:03.121 [signalProcessingThread] shutdown: closing all files...
Wed Oct 17 11:48:03.136 [signalProcessingThread] closeAllFiles() finished
Wed Oct 17 11:48:03.136 [signalProcessingThread] journalCleanup...
Wed Oct 17 11:48:03.136 [signalProcessingThread] removeJournalFiles
Wed Oct 17 11:48:03.236 [signalProcessingThread] shutdown: removing fs lock...
Wed Oct 17 11:48:03.236 dbexit: really exiting now


***** SERVER RESTARTED ***** 

while the error reported on mongoR2-si.isti.cnr.it is the following:

Wed Oct 17 11:48:04.917 [rsHealthPoll] replset info mongoR4-si.isti.cnr.it:27017 heartbeat failed, retrying
Wed Oct 17 11:48:04.919 [rsHealthPoll] replSet info mongoR4-si.isti.cnr.it:27017 is down (or slow to respond):
Wed Oct 17 11:48:04.919 [rsHealthPoll] replSet member mongoR4-si.isti.cnr.it:27017 is now in state DOWN
Wed Oct 17 11:48:06.069 [conn16451613] end connection 146.48.85.140:50491 (7 connections now open)
Wed Oct 17 11:48:06.922 [rsHealthPoll] replset info mongoR4-si.isti.cnr.it:27017 heartbeat failed, retrying
Wed Oct 17 11:48:08.924 [rsHealthPoll] replset info mongoR4-si.isti.cnr.it:27017 heartbeat failed, retrying
Wed Oct 17 11:48:10.925 [rsHealthPoll] replset info mongoR4-si.isti.cnr.it:27017 heartbeat failed, retrying 

In the recent past I have also noticed other problems between mongoR4-si.isti.cnr.it and the primary node at the time, including when the primary was mongoR3.
I have seen this kind of problem only on the mongoR4 server: the other replica set members have not been restarted in recent months, while mongoR4 restarts more than once a day.
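
To put a number on the restart frequency, the restart markers in the mongod log can be counted per day. The following is only a minimal sketch, assuming the log keeps the same timestamp prefix and the same "***** SERVER RESTARTED *****" marker as the excerpts above. The function name `count_restarts_per_day` and the sample lines are illustrative and are not part of any MongoDB tooling.

```python
import re
from collections import Counter

# Timestamp prefix as it appears in the quoted log lines,
# e.g. "Wed Oct 17 11:48:03.236 ". Assumed to be stable across the log.
TIMESTAMP = re.compile(r"^(\w{3} \w{3} [ \d]\d) \d{2}:\d{2}:\d{2}\.\d{3} ")
RESTART_MARKER = "***** SERVER RESTARTED *****"


def count_restarts_per_day(lines):
    """Count restart markers, attributing each one to the day of the
    last timestamped line seen before it (hypothetical helper)."""
    restarts = Counter()
    last_day = "unknown"
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            last_day = match.group(1)
        elif RESTART_MARKER in line:
            restarts[last_day] += 1
    return restarts


# Sample taken from the mongoR4 log excerpt in this ticket.
sample = [
    "Wed Oct 17 11:48:03.236 [signalProcessingThread] shutdown: removing fs lock...",
    "Wed Oct 17 11:48:03.236 dbexit: really exiting now",
    "",
    "***** SERVER RESTARTED ***** ",
]

for day, count in count_restarts_per_day(sample).items():
    print(day, count)
```

Running the same function over the full log of each replica set member (for example by passing an open file object as `lines`) would show whether restarts are confined to mongoR4.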
