Task #8787

closed

mongo2-d-d4s: is not synchronized with other cluster members

Added by Roberto Cirillo almost 8 years ago. Updated almost 8 years ago.

Status: Closed
Priority: Normal
Category: System Application
Target version:
Start date: May 30, 2017
Due date:
% Done: 100%
Estimated time:
Infrastructure: Development

Description

mongo2-d-d4s has been down since 26 May because of an OutOfMemory problem.
A full resync is now needed.

Actions #1

Updated by Roberto Cirillo almost 8 years ago

  • Status changed from New to In Progress
  • Assignee changed from Roberto Cirillo to Tommaso Piccioli

I've removed mongo2 from the development cluster, and @tommaso.piccioli@isti.cnr.it is now performing a resync. Once the resync is complete, I'll re-add the node to the cluster. I'm assigning this ticket to @tommaso.piccioli@isti.cnr.it.
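The remove and re-add steps can be sketched in terms of the replica set configuration document. This is only a minimal sketch, assuming a configuration shaped like the one returned by `replSetGetConfig` (`_id`, `version`, `members`). The `node-a.example` and `node-b.example` hosts, the version number, and the function names are hypothetical; the functions only build the new document and never contact a server.

```python
import copy

# Hypothetical configuration: only the set name "storagedev", the host
# mongo2-d-d4s and port 27017 come from the ticket; the rest is made up.
EXAMPLE_CONFIG = {
    "_id": "storagedev",
    "version": 5,
    "members": [
        {"_id": 0, "host": "node-a.example:27017"},
        {"_id": 1, "host": "mongo2-d-d4s:27017"},
        {"_id": 2, "host": "node-b.example:27017"},
    ],
}


def without_member(config, host):
    """Return a copy of `config` without `host`, with the version incremented."""
    new_config = copy.deepcopy(config)
    new_config["members"] = [m for m in new_config["members"] if m["host"] != host]
    if len(new_config["members"]) == len(config["members"]):
        raise ValueError(f"{host} is not a member of {config['_id']}")
    # A reconfiguration is expected to carry a higher version than the current one.
    new_config["version"] = config["version"] + 1
    return new_config


def with_member(config, host):
    """Return a copy of `config` with `host` added under an unused member _id."""
    if any(m["host"] == host for m in config["members"]):
        raise ValueError(f"{host} is already a member of {config['_id']}")
    new_config = copy.deepcopy(config)
    next_id = max((m["_id"] for m in config["members"]), default=-1) + 1
    new_config["members"].append({"_id": next_id, "host": host})
    new_config["version"] = config["version"] + 1
    return new_config
```

On a live cluster the resulting document would be submitted with the `replSetReconfig` admin command, or the `rs.remove()` and `rs.add()` shell helpers would be used instead.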

Actions #3

Updated by Roberto Cirillo almost 8 years ago

Unfortunately, after the rsync we get an ERROR in the mongodb log:

2017-05-30T17:27:00.203+0200 I CONTROL  ***** SERVER RESTARTED *****
2017-05-30T17:27:00.268+0200 I CONTROL  [initandlisten] MongoDB starting : pid=29659 port=27017 dbpath=/data/mongo_home 64-bit host=mongo2-d-d4s
2017-05-30T17:27:00.268+0200 I CONTROL  [initandlisten] db version v3.0.8
2017-05-30T17:27:00.268+0200 I CONTROL  [initandlisten] git version: 83d8cc25e00e42856924d84e220fbe4a839e605d
2017-05-30T17:27:00.268+0200 I CONTROL  [initandlisten] build info: Linux ip-10-187-89-126 3.13.0-24-generic #46-Ubuntu SMP Thu Apr 10 19:11:08 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
2017-05-30T17:27:00.268+0200 I CONTROL  [initandlisten] allocator: tcmalloc
2017-05-30T17:27:00.268+0200 I CONTROL  [initandlisten] options: { config: "/etc/mongod.conf", net: { port: 27017 }, replication: { oplogSizeMB: 7000, replSetName: "storagedev" }, security: { keyFile: "/data/mongo_home/dev-d4science-keyfile" }, storage: { dbPath: "/data/mongo_home", engine: "wiredTiger", journal: { enabled: true } }, systemLog: { destination: "file", logAppend: true, path: "/data/mongo_log.txt" } }
2017-05-30T17:27:00.298+0200 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=1G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2017-05-30T17:27:00.643+0200 I STORAGE  [initandlisten] Starting WiredTigerRecordStoreThread local.oplog.rs
2017-05-30T17:27:00.651+0200 E STORAGE  [initandlisten] WiredTiger (0) [1496158020:651140][29659:0x7f704b89fbc0], file:collection-21--6954882954492924240.wt, session.open_cursor: read checksum error for 241664B block at offset 350993436672: block header checksum of 2627437933 doesn't match expected checksum of 167318316
2017-05-30T17:27:00.651+0200 E STORAGE  [initandlisten] WiredTiger (0) [1496158020:651259][29659:0x7f704b89fbc0], file:collection-21--6954882954492924240.wt, session.open_cursor: collection-21--6954882954492924240.wt: encountered an illegal file format or internal value
2017-05-30T17:27:00.651+0200 E STORAGE  [initandlisten] WiredTiger (-31804) [1496158020:651306][29659:0x7f704b89fbc0], file:collection-21--6954882954492924240.wt, session.open_cursor: the process must exit and restart: WT_PANIC: WiredTiger library panic
2017-05-30T17:27:00.651+0200 I -        [initandlisten] Fatal Assertion 28558

This error seems to be fixed in version 3.2.10: https://jira.mongodb.org/browse/SERVER-26237
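The two facts used above (the running version and the file that failed its checksum) can be pulled out of the log mechanically. The following is a rough sketch, assuming each log entry sits on a single line in the `timestamp severity component [context] message` layout seen in the excerpt; the function names are made up for illustration, and the sample lines are copied from the log in this ticket.

```python
import re

# timestamp, one-letter severity, component, optional [context], message
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<severity>[A-Z])\s+(?P<component>\S+)\s+"
    r"(?:\[(?P<context>[^\]]+)\]\s+)?(?P<message>.*)$"
)

SAMPLE_LINES = [
    "2017-05-30T17:27:00.203+0200 I CONTROL  ***** SERVER RESTARTED *****",
    "2017-05-30T17:27:00.268+0200 I CONTROL  [initandlisten] db version v3.0.8",
    "2017-05-30T17:27:00.651+0200 E STORAGE  [initandlisten] WiredTiger (0) "
    "[1496158020:651140][29659:0x7f704b89fbc0], "
    "file:collection-21--6954882954492924240.wt, session.open_cursor: "
    "read checksum error for 241664B block at offset 350993436672: block header "
    "checksum of 2627437933 doesn't match expected checksum of 167318316",
    "2017-05-30T17:27:00.651+0200 I -        [initandlisten] Fatal Assertion 28558",
]


def db_version(lines):
    """Return the version from the first 'db version vX.Y.Z' entry as a tuple, or None."""
    for line in lines:
        entry = LOG_LINE.match(line)
        if not entry:
            continue
        found = re.fullmatch(r"db version v(\d+(?:\.\d+)*)", entry["message"])
        if found:
            return tuple(int(part) for part in found.group(1).split("."))
    return None


def checksum_error_files(lines):
    """Return the .wt files named in error-level checksum messages, in log order."""
    files = []
    for line in lines:
        entry = LOG_LINE.match(line)
        if not entry or entry["severity"] != "E":
            continue
        if "checksum error" not in entry["message"]:
            continue
        found = re.search(r"file:(\S+?\.wt)\b", entry["message"])
        if found and found.group(1) not in files:
            files.append(found.group(1))
    return files
```

Comparing the parsed tuple against `(3, 2, 10)` then tells whether the node runs a release older than the one the linked issue points to.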

I've now started a resync from scratch on mongo2-d. This way all the payloads will be compacted and some space will be recovered, but the cluster will not be usable for a few hours.

Actions #4

Updated by Roberto Cirillo almost 8 years ago

Roberto Cirillo wrote:

I've now started a resync from scratch on mongo2-d. This way all the payloads will be compacted and some space will be recovered, but the cluster will not be usable for a few hours.

I've checked: the cluster remains usable while the resync is running, because the mongo2-d node has moved to another state: RECOVERING.
With the node in this state, the cluster still allows writes and reads.
Sorry for my previous comment.
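A check like this can be expressed against the per-member states that `replSetGetStatus` reports (`name`, `stateStr`, `health`). The following is a minimal sketch, not the procedure actually used here: the status document is a hand-written illustration, the `node-a.example` and `node-b.example` hosts are hypothetical, and the helper names are invented.

```python
# Hand-written illustration of the relevant part of a replSetGetStatus reply.
# Only the set name, mongo2-d-d4s and the RECOVERING state come from the ticket.
SAMPLE_STATUS = {
    "set": "storagedev",
    "members": [
        {"name": "node-a.example:27017", "stateStr": "PRIMARY", "health": 1},
        {"name": "mongo2-d-d4s:27017", "stateStr": "RECOVERING", "health": 1},
        {"name": "node-b.example:27017", "stateStr": "SECONDARY", "health": 1},
    ],
}


def members_by_state(status):
    """Group member names by their reported state string."""
    grouped = {}
    for member in status["members"]:
        grouped.setdefault(member["stateStr"], []).append(member["name"])
    return grouped


def accepts_writes(status):
    """True when some healthy member is PRIMARY, so the set can take writes."""
    return any(
        member["stateStr"] == "PRIMARY" and member.get("health", 0) == 1
        for member in status["members"]
    )
```

With the sample above, `accepts_writes(SAMPLE_STATUS)` is true even though mongo2-d-d4s is listed under RECOVERING, which matches the observation that the rest of the cluster keeps serving requests during the resync.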

Actions #5

Updated by Roberto Cirillo almost 8 years ago

  • Tracker changed from Incident to Task
  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

mongo2-d-d4s is now aligned with the cluster.
