Task #10729

closed

Broken disk on dlib33x.dom0.research-infrastructures.eu

Added by Andrea Dell'Amico over 7 years ago. Updated over 7 years ago.

Status: Closed
Priority: Urgent
Category: Other
Target version:
Start date: Dec 18, 2017
Due date:
% Done: 100%
Estimated time:
Infrastructure: Development, Pre-Production, Production

Description

From the nagios check:

CRITICAL: mdstat:[md3(931.39 GiB raid1):F:sdg1:_U, md2(931.39 GiB raid1):UU, md1(931.15 GiB raid1):UU, md0(242.81 MiB raid1):UU]
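
For reference, a minimal Python sketch of how a status line like the one above could be parsed to list the degraded arrays. The regular expression is inferred from this single line only (in particular, reading `:F:<device>` as "failed member" is an assumption), and `degraded_arrays` is a hypothetical helper, not part of the nagios plugin.

```python
import re

# Status line copied from the nagios check in this ticket.
STATUS = ("CRITICAL: mdstat:[md3(931.39 GiB raid1):F:sdg1:_U, "
          "md2(931.39 GiB raid1):UU, md1(931.15 GiB raid1):UU, "
          "md0(242.81 MiB raid1):UU]")

# One array entry: name(size level), an optional ":F:<failed device>",
# then the member map, where "U" is an active member and "_" a missing one.
# This layout is an assumption based on the line above.
ENTRY = re.compile(
    r"(?P<name>md\d+)\((?P<size>[^)]*?) (?P<level>raid\d+)\)"
    r"(?::F:(?P<failed>[^:]+))?:(?P<members>[U_]+)"
)

def degraded_arrays(status):
    """Return {array: failed device or None} for arrays with a missing member."""
    return {
        m["name"]: m["failed"]
        for m in ENTRY.finditer(status)
        if "_" in m["members"]
    }

print(degraded_arrays(STATUS))  # {'md3': 'sdg1'}
```

Only md3 is reported, because it is the only array whose member map contains `_`.
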
Actions #1

Updated by Tommaso Piccioli over 7 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 50

From the dmesg:

[Mon Dec 18 01:57:03 2017] sd 0:0:6:0: [sdg] Device not ready
...
[Mon Dec 18 01:57:03 2017] md/raid1:md3: Disk failure on sdg1, disabling device.
...
[Mon Dec 18 01:58:23 2017] sd 0:0:6:0: Attached scsi generic sg6 type 0
[Mon Dec 18 01:58:23 2017] sd 0:0:6:0: [sdi] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
...
[Mon Dec 18 01:58:23 2017] sdi: sdi1

The disk disappeared from the OS for about a minute and then reappeared as a new device (sdi instead of sdg).

I added a different disk (sdf1) to the raid, which is now in recovery/resync. In the meantime I will check the old sdg (now sdi).
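
While the resync runs, its progress can be read from `/proc/mdstat`. The following is a rough Python sketch of how that could be extracted. The `SAMPLE` text is a hypothetical excerpt written for illustration (device names, block counts and speeds are not taken from dlib33x), and `rebuild_progress` is a made-up helper name.

```python
import re

# Hypothetical /proc/mdstat excerpt, for illustration only.
SAMPLE = """\
md3 : active raid1 sdf1[2] sdh1[1]
      976630336 blocks super 1.2 [2/1] [_U]
      [=>...................]  recovery =  8.5% (83013632/976630336) finish=112.3min speed=132500K/sec

md2 : active raid1 sdc1[0] sdd1[1]
      976630336 blocks super 1.2 [2/2] [UU]
"""

# Matches the progress line the kernel prints during a recovery or resync.
PROGRESS = re.compile(r"(recovery|resync)\s*=\s*([\d.]+)%.*?finish=([\d.]+)min")

def rebuild_progress(mdstat_text):
    """Return {array: (operation, percent done, minutes to finish)}."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        head = re.match(r"(md\d+)\s*:", line)
        if head:
            # A new array section starts; remember its name.
            current = head.group(1)
            continue
        m = PROGRESS.search(line)
        if m and current:
            result[current] = (m.group(1), float(m.group(2)), float(m.group(3)))
    return result

print(rebuild_progress(SAMPLE))  # {'md3': ('recovery', 8.5, 112.3)}
```

On the real host the input would come from reading `/proc/mdstat` instead of `SAMPLE`; arrays that are not rebuilding simply do not appear in the result.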

Actions #2

Updated by Tommaso Piccioli over 7 years ago

From the idrac log:

2017-12-18T01:55:05-0600  PDR3   Disk 6 in Backplane 1 of Integrated RAID Controller 1 is not functioning correctly.
2017-12-18T01:55:05-0600  PDR87  Disk 6 in Backplane 1 of Integrated RAID Controller 1 was reset.
2017-12-18T01:55:05-0600  PDR5   Disk 6 in Backplane 1 of Integrated RAID Controller 1 is removed.
2017-12-18T01:55:40-0600  PDR8   Disk 6 in Backplane 1 of Integrated RAID Controller 1 is inserted.

Still checking Disk 6 from the OS.
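
As a side note, the time the controller saw the disk as removed can be worked out from the iDRAC entries above. A small Python sketch, assuming the timestamps and message IDs are transcribed correctly from the log; `outage_seconds` is a hypothetical helper and the short event labels are abbreviations of the log messages.

```python
from datetime import datetime

# (timestamp, message ID, short label) transcribed from the iDRAC log above.
EVENTS = [
    ("2017-12-18T01:55:05-0600", "PDR3", "not functioning correctly"),
    ("2017-12-18T01:55:05-0600", "PDR87", "reset"),
    ("2017-12-18T01:55:05-0600", "PDR5", "removed"),
    ("2017-12-18T01:55:40-0600", "PDR8", "inserted"),
]

def outage_seconds(events, removed_id="PDR5", inserted_id="PDR8"):
    """Seconds between the 'removed' and 'inserted' events."""
    fmt = "%Y-%m-%dT%H:%M:%S%z"
    times = {msg_id: datetime.strptime(ts, fmt) for ts, msg_id, _ in events}
    return (times[inserted_id] - times[removed_id]).total_seconds()

print(outage_seconds(EVENTS))  # 35.0
```

According to the iDRAC log, the disk was therefore seen as removed for 35 seconds.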

Actions #3

Updated by Andrea Dell'Amico over 7 years ago

That's worrisome: the omsa tools should have reported these events to nagios. Or maybe the problem was too short-lived to be reported?

Actions #4

Updated by Tommaso Piccioli over 7 years ago

Resync done. The disk sdi seems to be OK (2/3 tested).

Actions #5

Updated by Tommaso Piccioli over 7 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 50 to 100