2011-03-13 01:42 PM
Disk Failure Detected...
I've recently purchased a ReadyNAS Ultra 6 along with six 2 TB Seagate ST2000DL003 disks (which are on the HCL).
I've set up the NAS in a dual-redundancy X-RAID2 configuration and have started copying all the data over the network from my old ReadyNAS NV to the new Ultra 6...
About halfway through copying (on 6th March), I got a disk failure detected (on channel 4). I powered down the NAS, took the disk out and reinserted it, assuming there might be some kind of connection problem... When I powered back up it detected the disk, tested it and started to resync (which takes about 24 hours)... I left it alone while it did that and then it seemed to be OK, so I started copying the rest of my data across. There is nothing in the SMART+ log for disk 4 which would indicate that there was ever a problem with that disk.
A few minutes ago, I just got another disk failure (this time on channel 2). Exactly the same story... powered down and then back up again, the disk comes back to life and the NAS starts testing it and resyncing it... again, there is nothing in the SMART+ log for disk 2 which indicates (to me at least) that there was ever a problem.
After both occasions, I've downloaded the system logs from the NAS, but I'm not sure what to do with them. Is there something in the log which would show what exactly failed?
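(In case it helps anyone digging through the same kind of log bundle: the kernel lines are the ones to look for. A grep along these lines pulls out the SATA link resets and I/O errors — `system.log` here is a placeholder path, point it at the extracted log file.)

```shell
# Pull SATA link-reset and I/O-error lines out of a downloaded system.log.
# "system.log" is a placeholder path; point it at the extracted log file.
grep -E 'ata[0-9]+(\.[0-9]+)?: (exception|failed command|COMRESET|hard resetting|reset failed|disabled)|end_request: I/O error' system.log
```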
Any idea what's going on here? Have I got a couple of dud disks which need to be sent back, or is there something else going on? If they are dud, I'd need to be able to prove to the retailer that they were... the only indication I have of a problem is that the ReadyNAS ultra 6 _said_ that they had failed... but they both seem to be working fine now.
Thanks,
Ash.
P.S. Here's the SMART+ report from disk 2:
SMART Information for Disk 2
Model: ST2000DL003-9VT166
Serial: 5YD2196G
Firmware: CC32
SMART Attribute / Value
Spin Up Time 0
Start Stop Count 12
Reallocated Sector Count 0
Power On Hours 151
Spin Retry Count 0
Power Cycle Count 12
Reported Uncorrect 0
High Fly Writes 0
Airflow Temperature Cel 42
G-Sense Error Rate 0
Power-Off Retract Count 6
Load Cycle Count 12
Temperature Celsius 42
Current Pending Sector 0
Offline Uncorrectable 0
UDMA CRC Error Count 0
Head Flying Hours 221474283585687
ATA Error Count 0
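(A side note for anyone eyeballing dumps like the one above: the failure-predicting counters are the ones that should sit at zero. A quick awk filter — `smart.txt` is a placeholder for the pasted attribute table saved to a file — prints only the worrying rows:)

```shell
# Flag the "should stay zero" counters in a SMART+ dump saved to a file.
# "smart.txt" is a placeholder for the pasted attribute table.
awk '$NF+0 > 0 && /Reallocated Sector Count|Current Pending Sector|Offline Uncorrectable|UDMA CRC Error Count|Reported Uncorrect|ATA Error Count/' smart.txt
```

No output means all of the watched counters are zero, as in the dump above.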
This looks like the appropriate section of system.log for the failure which occurred today:
Mar 13 20:00:09 ultranas ntpdate[11162]: step time server 194.238.48.3 offset 0.310812 sec
Mar 13 20:16:27 ultranas kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Mar 13 20:16:27 ultranas kernel: ata2.00: failed command: FLUSH CACHE EXT
Mar 13 20:16:27 ultranas kernel: ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Mar 13 20:16:27 ultranas kernel: res 40/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
Mar 13 20:16:27 ultranas kernel: ata2.00: status: { DRDY }
Mar 13 20:16:27 ultranas kernel: ata2: hard resetting link
Mar 13 20:16:33 ultranas kernel: ata2: link is slow to respond, please be patient (ready=0)
Mar 13 20:16:37 ultranas kernel: ata2: COMRESET failed (errno=-16)
Mar 13 20:16:37 ultranas kernel: ata2: hard resetting link
Mar 13 20:16:43 ultranas kernel: ata2: link is slow to respond, please be patient (ready=0)
Mar 13 20:16:47 ultranas kernel: ata2: COMRESET failed (errno=-16)
Mar 13 20:16:47 ultranas kernel: ata2: hard resetting link
Mar 13 20:16:53 ultranas kernel: ata2: link is slow to respond, please be patient (ready=0)
Mar 13 20:17:23 ultranas kernel: ata2: COMRESET failed (errno=-16)
Mar 13 20:17:23 ultranas kernel: ata2: limiting SATA link speed to 1.5 Gbps
Mar 13 20:17:23 ultranas kernel: ata2: hard resetting link
Mar 13 20:17:28 ultranas kernel: ata2: COMRESET failed (errno=-16)
Mar 13 20:17:28 ultranas kernel: ata2: reset failed, giving up
Mar 13 20:17:28 ultranas kernel: ata2.00: disabled
Mar 13 20:17:28 ultranas kernel: ata2.00: device reported invalid CHS sector 0
Mar 13 20:17:28 ultranas kernel: ata2: EH complete
Mar 13 20:17:28 ultranas kernel: end_request: I/O error, dev sdb, sector 0
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 00 90 00 50 00 00 02 00
Mar 13 20:17:28 ultranas kernel: end_request: I/O error, dev sdb, sector 9437264
Mar 13 20:17:28 ultranas kernel: end_request: I/O error, dev sdb, sector 9437264
Mar 13 20:17:28 ultranas kernel: **************** super written barrier kludge on md2: error==IO 0xfffffffb
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 00 00 00 48 00 00 02 00
Mar 13 20:17:28 ultranas kernel: end_request: I/O error, dev sdb, sector 72
Mar 13 20:17:28 ultranas kernel: end_request: I/O error, dev sdb, sector 72
Mar 13 20:17:28 ultranas kernel: **************** super written barrier kludge on md0: error==IO 0xfffffffb
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 51 8f 30 00 00 28 00
Mar 13 20:17:28 ultranas kernel: end_request: I/O error, dev sdb, sector 5345072
Mar 13 20:17:28 ultranas kernel: raid1: sdb1: rescheduling sector 5342960
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 00 90 00 50 00 00 02 00
Mar 13 20:17:28 ultranas kernel: end_request: I/O error, dev sdb, sector 9437264
Mar 13 20:17:28 ultranas kernel: md: super_written gets error=-5, uptodate=0
Mar 13 20:17:28 ultranas kernel: raid5: Disk failure on sdb5, disabling device.
Mar 13 20:17:28 ultranas kernel: raid5: Operation continuing on 5 devices.
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 00 00 00 48 00 00 02 00
Mar 13 20:17:28 ultranas kernel: end_request: I/O error, dev sdb, sector 72
Mar 13 20:17:28 ultranas kernel: md: super_written gets error=-5, uptodate=0
Mar 13 20:17:28 ultranas kernel: raid1: Disk failure on sdb1, disabling device.
Mar 13 20:17:28 ultranas kernel: raid1: Operation continuing on 5 devices.
Mar 13 20:17:28 ultranas kernel: RAID5 conf printout:
Mar 13 20:17:28 ultranas kernel: --- rd:6 wd:5
Mar 13 20:17:28 ultranas kernel: disk 0, o:1, dev:sda5
Mar 13 20:17:28 ultranas kernel: disk 1, o:0, dev:sdb5
Mar 13 20:17:28 ultranas kernel: disk 2, o:1, dev:sdc5
Mar 13 20:17:28 ultranas kernel: disk 3, o:1, dev:sdd5
Mar 13 20:17:28 ultranas kernel: disk 4, o:1, dev:sde5
Mar 13 20:17:28 ultranas kernel: disk 5, o:1, dev:sdf5
Mar 13 20:17:28 ultranas kernel: RAID5 conf printout:
Mar 13 20:17:28 ultranas kernel: --- rd:6 wd:5
Mar 13 20:17:28 ultranas kernel: disk 0, o:1, dev:sda5
Mar 13 20:17:28 ultranas kernel: disk 2, o:1, dev:sdc5
Mar 13 20:17:28 ultranas kernel: disk 3, o:1, dev:sdd5
Mar 13 20:17:28 ultranas kernel: disk 4, o:1, dev:sde5
Mar 13 20:17:28 ultranas kernel: disk 5, o:1, dev:sdf5
Mar 13 20:17:28 ultranas kernel: RAID1 conf printout:
Mar 13 20:17:28 ultranas kernel: --- wd:5 rd:6
Mar 13 20:17:28 ultranas kernel: disk 0, wo:0, o:1, dev:sda1
Mar 13 20:17:28 ultranas kernel: disk 1, wo:1, o:0, dev:sdb1
Mar 13 20:17:28 ultranas kernel: disk 2, wo:0, o:1, dev:sdc1
Mar 13 20:17:28 ultranas kernel: disk 3, wo:0, o:1, dev:sdd1
Mar 13 20:17:28 ultranas kernel: disk 4, wo:0, o:1, dev:sde1
Mar 13 20:17:28 ultranas kernel: disk 5, wo:0, o:1, dev:sdf1
Mar 13 20:17:28 ultranas kernel: RAID1 conf printout:
Mar 13 20:17:28 ultranas kernel: --- wd:5 rd:6
Mar 13 20:17:28 ultranas kernel: disk 0, wo:0, o:1, dev:sda1
Mar 13 20:17:28 ultranas kernel: disk 2, wo:0, o:1, dev:sdc1
Mar 13 20:17:28 ultranas kernel: disk 3, wo:0, o:1, dev:sdd1
Mar 13 20:17:28 ultranas kernel: disk 4, wo:0, o:1, dev:sde1
Mar 13 20:17:28 ultranas kernel: disk 5, wo:0, o:1, dev:sdf1
Mar 13 20:17:28 ultranas kernel: raid1: sdf1: redirecting sector 5342960 to another mirror
Mar 13 20:17:32 ultranas RAIDiator: Disk failure detected.\n\nIf the failed disk is used in a RAID level 1, 5, or X-RAID volume, please note that volume is now unprotected, and an additional disk failure may render that volume dead. If this disk is a part of a RAID 6 volume, your volume is still protected if this is your first failure. A 2nd disk failure will make your volume unprotected. It is recommended that you replace the failed disk as soon as possible to maintain optimal protection of your volume.\n\n[Sun Mar 13 20:17:29 WET 2011]
Mar 13 20:20:24 ultranas kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
and here is what looks like the relevant part of the log from the failure on 6th March:
Mar 6 16:00:07 nas-EA-A6-42 ntpdate[12452]: step time server 62.84.188.34 offset -0.103568 sec
Mar 6 18:48:21 nas-EA-A6-42 kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Mar 6 18:48:22 nas-EA-A6-42 kernel: ata4.00: failed command: FLUSH CACHE EXT
Mar 6 18:48:22 nas-EA-A6-42 kernel: ata4.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Mar 6 18:48:22 nas-EA-A6-42 kernel: res 40/00:00:b8:f7:0e/00:00:00:00:00/40 Emask 0x4 (timeout)
Mar 6 18:48:22 nas-EA-A6-42 kernel: ata4.00: status: { DRDY }
Mar 6 18:48:22 nas-EA-A6-42 kernel: ata4: hard resetting link
Mar 6 18:48:27 nas-EA-A6-42 kernel: ata4: link is slow to respond, please be patient (ready=0)
Mar 6 18:48:32 nas-EA-A6-42 kernel: ata4: COMRESET failed (errno=-16)
Mar 6 18:48:32 nas-EA-A6-42 kernel: ata4: hard resetting link
Mar 6 18:48:37 nas-EA-A6-42 kernel: ata4: link is slow to respond, please be patient (ready=0)
Mar 6 18:48:42 nas-EA-A6-42 kernel: ata4: COMRESET failed (errno=-16)
Mar 6 18:48:42 nas-EA-A6-42 kernel: ata4: hard resetting link
Mar 6 18:48:47 nas-EA-A6-42 kernel: ata4: link is slow to respond, please be patient (ready=0)
Mar 6 18:49:17 nas-EA-A6-42 kernel: ata4: COMRESET failed (errno=-16)
Mar 6 18:49:17 nas-EA-A6-42 kernel: ata4: limiting SATA link speed to 1.5 Gbps
Mar 6 18:49:17 nas-EA-A6-42 kernel: ata4: hard resetting link
Mar 6 18:49:22 nas-EA-A6-42 kernel: ata4: COMRESET failed (errno=-16)
Mar 6 18:49:22 nas-EA-A6-42 kernel: ata4: reset failed, giving up
Mar 6 18:49:22 nas-EA-A6-42 kernel: ata4.00: disabled
Mar 6 18:49:22 nas-EA-A6-42 kernel: ata4: EH complete
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 00 00 00 48 00 00 02 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 72
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 72
Mar 6 18:49:22 nas-EA-A6-42 kernel: **************** super written barrier kludge on md0: error==IO 0xfffffffb
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 00 93 9e 80 00 00 08 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 9674368
Mar 6 18:49:22 nas-EA-A6-42 kernel: raid5: Disk failure on sdd5, disabling device.
Mar 6 18:49:22 nas-EA-A6-42 kernel: raid5: Operation continuing on 5 devices.
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 34 c5 68 48 00 00 80 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 885352520
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 34 c6 f0 c8 00 00 50 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 885453000
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 00 91 28 c8 00 00 38 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 9513160
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 00 91 29 10 00 00 10 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 9513232
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 00 91 29 28 00 00 10 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 9513256
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 00 91 29 40 00 00 08 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 9513280
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 00 93 88 48 00 00 08 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 9668680
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 00 93 a1 90 00 00 10 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 9675152
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 34 c5 38 48 00 00 08 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 885340232
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 34 c5 64 48 00 00 80 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 885351496
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 34 c6 f1 18 00 00 30 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 885453080
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 00 80 00 48 00 00 02 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 8388680
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 8388680
Mar 6 18:49:22 nas-EA-A6-42 kernel: **************** super written barrier kludge on md1: error==IO 0xfffffffb
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 00 31 8d 58 00 00 28 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 3247448
Mar 6 18:49:22 nas-EA-A6-42 kernel: raid1: sdd1: rescheduling sector 3245336
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB:
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Write(10)Write(10): 2a 00 00 00 00 48 00 00 02 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 72
Mar 6 18:49:22 nas-EA-A6-42 kernel: :md: super_written gets error=-5, uptodate=0
Mar 6 18:49:22 nas-EA-A6-42 kernel: 2a
Mar 6 18:49:22 nas-EA-A6-42 kernel: raid1: Disk failure on sdd1, disabling device.
Mar 6 18:49:22 nas-EA-A6-42 kernel: raid1: Operation continuing on 5 devices.
Mar 6 18:49:22 nas-EA-A6-42 kernel: 00 00 80 00 48 00 00 02 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 8388680
Mar 6 18:49:22 nas-EA-A6-42 kernel: md: super_written gets error=-5, uptodate=0
Mar 6 18:49:22 nas-EA-A6-42 kernel: raid5: Disk failure on sdd2, disabling device.
Mar 6 18:49:22 nas-EA-A6-42 kernel: raid5: Operation continuing on 5 devices.
Mar 6 18:49:23 nas-EA-A6-42 kernel: RAID1 conf printout:
Mar 6 18:49:23 nas-EA-A6-42 kernel: --- wd:5 rd:6
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 0, wo:0, o:1, dev:sda1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 1, wo:0, o:1, dev:sdb1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 2, wo:0, o:1, dev:sdc1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 3, wo:1, o:0, dev:sdd1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 4, wo:0, o:1, dev:sde1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 5, wo:0, o:1, dev:sdf1
Mar 6 18:49:23 nas-EA-A6-42 kernel: RAID1 conf printout:
Mar 6 18:49:23 nas-EA-A6-42 kernel: --- wd:5 rd:6
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 0, wo:0, o:1, dev:sda1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 1, wo:0, o:1, dev:sdb1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 2, wo:0, o:1, dev:sdc1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 4, wo:0, o:1, dev:sde1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 5, wo:0, o:1, dev:sdf1
Mar 6 18:49:23 nas-EA-A6-42 kernel: RAID5 conf printout:
Mar 6 18:49:23 nas-EA-A6-42 kernel: --- rd:6 wd:5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 0, o:1, dev:sda5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 1, o:1, dev:sdb5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 2, o:1, dev:sdc5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 3, o:0, dev:sdd5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 4, o:1, dev:sde5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 5, o:1, dev:sdf5
Mar 6 18:49:23 nas-EA-A6-42 kernel: RAID5 conf printout:
Mar 6 18:49:23 nas-EA-A6-42 kernel: --- rd:6 wd:5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 0, o:1, dev:sda5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 1, o:1, dev:sdb5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 2, o:1, dev:sdc5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 4, o:1, dev:sde5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 5, o:1, dev:sdf5
Mar 6 18:49:23 nas-EA-A6-42 kernel: RAID5 conf printout:
Mar 6 18:49:23 nas-EA-A6-42 kernel: --- rd:6 wd:5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 0, o:1, dev:sda2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 1, o:1, dev:sdb2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 2, o:1, dev:sdc2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 3, o:0, dev:sdd2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 4, o:1, dev:sde2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 5, o:1, dev:sdf2
Mar 6 18:49:23 nas-EA-A6-42 kernel: raid1: sdb1: redirecting sector 3245336 to another mirror
Mar 6 18:49:23 nas-EA-A6-42 kernel: RAID5 conf printout:
Mar 6 18:49:23 nas-EA-A6-42 kernel: --- rd:6 wd:5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 0, o:1, dev:sda2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 1, o:1, dev:sdb2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 2, o:1, dev:sdc2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 4, o:1, dev:sde2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 5, o:1, dev:sdf2
Mar 6 18:49:53 nas-EA-A6-42 RAIDiator: Disk failure detected.\n\nIf the failed disk is used in a RAID level 1, 5, or X-RAID volume, please note that volume is now unprotected, and an additional disk failure may render that volume dead. If this disk is a part of a RAID 6 volume, your volume is still protected if this is your first failure. A 2nd disk failure will make your volume unprotected. It is recommended that you replace the failed disk as soon as possible to maintain optimal protection of your volume.\n\n[Sun Mar 6 18:49:51 WET 2011]
Message 1 of 145
2011-03-14 06:02 PM
Re: Disk Failure Detected...
It looks like disks 2 and 4 might be duds. I'd run a disk test on the NAS just to make sure (it'll run an extended SMART test on the drives, rather than the short test that's normally run).
Power off the NAS, boot with the reset button on the back held in until the LCD shows the boot menu, then release it. Press the backup button until "Disk Test" shows, then press the reset button one more time and it'll start the test.
Message 2 of 145
2011-03-16 08:30 AM
Re: Disk Failure Detected...
I ran the extended disk test as you suggested... it ran overnight (and then I guess it finished its resync)... it hasn't reported any problems. Can I see the results of the test in any of the logs?
Everything seems to be working fine now.
Should I ignore the problems I had with disk 2 and 4, or are there other tests I should run?
Message 3 of 145
2011-03-16 08:38 AM
Re: Disk Failure Detected...
bluewomble wrote: I ran the extended disk test as you suggested... it ran overnight (and then I guess it finished its resync)... it hasn't reported any problems. Can I see the results of the test in any of the logs?
Everything seems to be working fine now.
Should I ignore the problems I had with disk 2 and 4, or are there other tests I should run?
Out of interest, how did you run this extended disk test and is it available on the 1100 model?
Thanks.. Karl
Message 4 of 145
2011-03-16 08:51 AM
Re: Disk Failure Detected...
I just followed siigna's instructions from above:
Power off the NAS, boot with the reset button on the back held in until the LCD shows the boot menu, then release it. Press the backup button until "Disk Test" shows, then press the reset button one more time and it'll start the test.
I'm not sure what models it's available on...
Message 5 of 145
2011-03-16 10:08 AM
Re: Disk Failure Detected...
bluewomble wrote: I ran the extended disk test as you suggested... it ran overnight (and then I guess it finished its resync)... it hasn't reported any problems. Can I see the results of the test in any of the logs?
Everything seems to be working fine now.
Should I ignore the problems I had with disk 2 and 4, or are there other tests I should run?
If there were any major problems in the disk test, the NAS would have let you know on the LCD that the disk was bad. You should be able to pull the SMART logs under Status > Health and see if anything's out of the ordinary. Take a look at the reallocated sector counts, CRC errors and ATA errors; they should be at or close to 0. If everything looks OK, then keep an eye on the NAS for at least the next week or two and make sure nothing odd happens; after that I'd say you're good to go.
dataactive wrote: Out of interest, how did you run this extended disk test and is it available on the 1100 model?
Thanks.. Karl
Unfortunately the sparc-based units don't have this boot option. If you have SSH access to the NAS you can always run "smartctl -t long /dev/hdX": hdc is disk 1, hde is disk 2, hdg is disk 3, hdi is disk 4.
Otherwise you can test the disks using the manufacturer's testing tools (see the FAQ here).
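(Spelling that out as a script, for anyone scripting it over SSH: a hypothetical helper — not an official ReadyNAS tool — that encodes the slot-to-device mapping above and feeds it to smartctl. Run as root on the NAS.)

```shell
# Hypothetical helper for the sparc-based units: map a disk slot number to
# its IDE device node (disk 1 = hdc, 2 = hde, 3 = hdg, 4 = hdi).
slot_to_dev() {
  case "$1" in
    1) echo /dev/hdc ;;
    2) echo /dev/hde ;;
    3) echo /dev/hdg ;;
    4) echo /dev/hdi ;;
    *) echo "unknown slot: $1" >&2; return 1 ;;
  esac
}

# Usage, e.g. for disk 2 (run as root on the NAS):
#   smartctl -t long "$(slot_to_dev 2)"      # start the extended self-test
#   smartctl -l selftest "$(slot_to_dev 2)"  # later: read the self-test log
```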
Message 6 of 145
2011-03-17 06:09 AM
Re: Disk Failure Detected...
Thanks, I'll give it a go.
I'm also having problems with disk 4. When I try fdisk -l /dev/hdi nothing happens. Is there another command that I can use to list all of the devices? Perhaps it's not listed as hdi?
Message 7 of 145
2011-03-17 02:25 PM
Re: Disk Failure Detected...
dataactive wrote: I'm also having problems with disk 4. When I try fdisk -l /dev/hdi nothing happens. Is there another command that I can use to list all of the devices? Perhaps it's not listed as hdi?
Yes, just run fdisk -l with no arguments.
bluewomble wrote: Everything seems to be working fine now.
Should I ignore the problems I had with disk 2 and 4, or are there other tests I should run?
Have you experienced any more issues with this setup or have you been running smoothly since your initial two failures? My ReadyNAS Ultra 6 Plus with 6x ST2000DL003-9VT166 (CC32) experienced a similar failure. /dev/sdb disappeared on hour 42 of operation. Health listed the Status as Dead (not Failed) with the SMART+ button grayed out, and Volume Settings had a missing entry with no Locate button.
Pulling and reinserting the drive did nothing. Inserting a fresh drive did nothing. The disk interface was just dead. I cold rebooted the ReadyNAS, the disk happily resynced, and SMART reported no errors.
Message 8 of 145
2011-03-21 02:04 PM
Re: Disk Failure Detected...
Everything has been running fine since the 16th March... until today -- this time disk 5 was reported as failed.
Just like you, the health status was listed as 'dead' (not failed) with 'SMART+' greyed out. I have exactly the same disks (ST2000DL003-9VT166) and exactly the same firmware (CC32).
I powered down the NAS, pulled the disk out, switched it back on and reinserted the same disk, and the disk came back to life and started to resync. It's interesting that you found that even a hot swap of a new drive didn't work... which would (presumably) suggest a problem with the disk interface on the NAS rather than the disks?
Surely I can't have 3 failed disks out of 6? Once the volume has resynced, I'll do another extended disk test, but the first one came back fine after disks 2 and 4 had failed.
What's the official advice here? I can't really send back 3 disks which to all appearances work perfectly well and have no errors (and pass the extended test)... but I don't want to risk having multiple disks 'fail' in this manner and losing my data. Is there some kind of incompatibility between the CC32 firmware and the NAS? What version of the disk / firmware was tested when the disk was put onto the HCL?
Thanks,
Ash.
Message 9 of 145
2011-03-21 09:22 PM
Re: Disk Failure Detected...
I just hit this myself.
I've had my Pro Pioneer for a few weeks, and it's been running smoothly for the last of those (after all the initial bring-up etc.)
I've got 4 of those same Seagates and two Hitachi HDS722020ALA330s, but it was one of the ST2000DL003-9VT166s that reported as failed.
I did the shutdown, pulled the drive, pushed it back in and booted, and it was recognized and the ReadyNAS started resyncing. The SMART+ info showed no issues whatsoever...
Kevin
Message 10 of 145
2011-03-27 01:08 PM
Re: Disk Failure Detected...
Just had what sounds exactly like the problem the original poster describes happen twice here in the last two days. Two different drives listed as dead, then resyncing and acting fine.
- ReadyNAS Ultra 6 Plus, just set up 5 days ago.
- with 6 Seagate ST2000DL003-9VT166 drives (on HCL, ordered from Newegg) also with the same firmware - CC32.
- latest firmware: RAIDiator-x86 4.2.15
- dual redundancy
The NAS has been up and running for a couple days and on a UPS. I'd enabled Time Machine backup on my Macbook Pro (~200 GB worth of data) and let that get up to date over the course of a day or so. Everything seemed fine except a minor chirping noise coming from its fan.
Yesterday I began copying data from my old RAID to the Ultra 6+. Halfway through the file transfer via the AFP share, I received alerts that drive 5 was dead. As others mentioned, no SMART info available, just greyed-out "dead" listed for that drive. I canceled the file copy, shut down the RAID, pulled the drive and reinserted it (also suspecting a connection problem as drive 5 was the only one I had any difficulty seating in the enclosure initially). The RAID booted back up, ran a scan with no errors, then re-synced over the course of 8 or 10 hours. All seemed well.
Then, today, the same thing happened to drive 2. In the middle of trying to recopy the data (roughly half a terabyte), I was alerted that drive 2 was dead. My file copy froze, I noticed that Time Machine was attempting backup activity, I stopped Time Machine and the file transfer immediately restarted while the array was degraded. Again, I've shut down, pulled the drive, re-inserted, and it ran a check on drive 2 and is currently re-syncing.
SMART info for the drives doesn't show any errors.
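For anyone else in this position, one quick sanity check (just a sketch, not official Netgear advice) is to pull the raw values of a few key SMART attributes from `smartctl -A` output and confirm they are zero; the attribute names below are the standard ATA ones, and the sample output text is hypothetical, not from an actual drive in this thread:

```python
# Sketch: parse `smartctl -A /dev/sdX` output and flag non-zero raw values
# for the attributes that usually indicate real media trouble.
WATCH = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}

def suspect_attributes(smartctl_output):
    """Return {attribute_name: raw_value} for watched attributes with raw value > 0."""
    flagged = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows look like: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[1] in WATCH:
            raw = int(fields[9])
            if raw > 0:
                flagged[fields[1]] = raw
    return flagged

# Hypothetical sample output -- a healthy drive shows 0 for all three.
sample = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""
print(suspect_attributes(sample))  # -> {'Current_Pending_Sector': 8}
```

If all three come back zero on every drive (as they apparently do here), the "dead" status is more likely a link or controller event than failing media.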
Is there any input from Netgear on what to do here? Should I be submitting a report to Netgear support? I also don't want to have to worry about this happening every time I do a big file copy -- my data is too precious. I wonder if there are compatibility problems with these drives? It doesn't make sense for one to be listed as dead and then boot right back up, resync, and be off to the races.
Like Bluewombie, I downloaded the logs but I'm not sure what to do with them.
Logs from the Frontview:
Sun Mar 27 12:15:53 PDT 2011 If the failed disk is used in a RAID level 1, 5, or X-RAID volume, please note that volume is now unprotected, and an additional disk failure may render that volume dead. If this disk is a part of a RAID 6 volume, your volume is still protected if this is your first failure. A 2nd disk failure will make your volume unprotected. It is recommended that you replace the failed disk as soon as possible to maintain optimal protection of your volume.
Sun Mar 27 12:15:53 PDT 2011 Disk failure detected.
Sat Mar 26 23:53:59 PDT 2011 RAID sync finished on volume C.
Sat Mar 26 12:40:26 PDT 2011 System is up.
Sat Mar 26 12:40:10 PDT 2011 Volume scan found no errors.
Sat Mar 26 12:37:13 PDT 2011 Rebooting device...
Sat Mar 26 12:37:13 PDT 2011 Please close this browser session and use RAIDar to reconnect to the device. System rebooting...
Sat Mar 26 12:28:06 PDT 2011 If the failed disk is used in a RAID level 1, 5, or X-RAID volume, please note that volume is now unprotected, and an additional disk failure may render that volume dead. If this disk is a part of a RAID 6 volume, your volume is still protected if this is your first failure. A 2nd disk failure will make your volume unprotected. It is recommended that you replace the failed disk as soon as possible to maintain optimal protection of your volume.
Sat Mar 26 12:28:06 PDT 2011 Disk failure detected.
Fri Mar 25 16:54:19 PDT 2011 System is up.
System logs for the first dead drive (Disk 5):
Mar 26 12:00:06 harrison ntpdate[20094]: no server suitable for synchronization found
Mar 26 12:24:39 harrison cnid_dbd[20697]: Set syslog logging to level: LOG_NOTE
Mar 26 12:27:39 harrison kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Mar 26 12:27:39 harrison kernel: ata5.00: failed command: FLUSH CACHE EXT
Mar 26 12:27:39 harrison kernel: ata5.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Mar 26 12:27:39 harrison kernel: res 40/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
Mar 26 12:27:39 harrison kernel: ata5.00: status: { DRDY }
Mar 26 12:27:39 harrison kernel: ata5: hard resetting link
Mar 26 12:27:39 harrison kernel: ata5: link is slow to respond, please be patient (ready=0)
Mar 26 12:27:39 harrison kernel: ata5: COMRESET failed (errno=-16)
Mar 26 12:27:39 harrison kernel: ata5: hard resetting link
Mar 26 12:27:39 harrison kernel: ata5: link is slow to respond, please be patient (ready=0)
Mar 26 12:27:39 harrison kernel: ata5: COMRESET failed (errno=-16)
Mar 26 12:27:39 harrison kernel: ata5: hard resetting link
Mar 26 12:27:39 harrison kernel: ata5: link is slow to respond, please be patient (ready=0)
Mar 26 12:27:39 harrison cnid_dbd[15520]: error reading message header: Connection reset by peer
Mar 26 12:27:39 harrison kernel: ata5: COMRESET failed (errno=-16)
Mar 26 12:27:39 harrison kernel: ata5: limiting SATA link speed to 1.5 Gbps
Mar 26 12:27:39 harrison kernel: ata5: hard resetting link
Mar 26 12:27:39 harrison kernel: ata5: COMRESET failed (errno=-16)
Mar 26 12:27:39 harrison kernel: ata5: reset failed, giving up
Mar 26 12:27:39 harrison kernel: ata5.00: disabled
Mar 26 12:27:39 harrison kernel: ata5.00: device reported invalid CHS sector 0
Mar 26 12:27:39 harrison kernel: ata5: EH complete
Mar 26 12:27:39 harrison kernel: end_request: I/O error, dev sde, sector 0
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] Unhandled error code
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] CDB: Write(10): 2a 00 00 90 00 50 00 00 02 00
Mar 26 12:27:39 harrison kernel: end_request: I/O error, dev sde, sector 9437264
Mar 26 12:27:39 harrison kernel: end_request: I/O error, dev sde, sector 9437264
Mar 26 12:27:39 harrison kernel: **************** super written barrier kludge on md2: error==IO 0xfffffffb
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] Unhandled error code
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] CDB: Write(10): 2a 00 00 00 00 48 00 00 02 00
Mar 26 12:27:39 harrison kernel: end_request: I/O error, dev sde, sector 72
Mar 26 12:27:39 harrison kernel: end_request: I/O error, dev sde, sector 72
Mar 26 12:27:39 harrison kernel: **************** super written barrier kludge on md0: error==IO 0xfffffffb
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] Unhandled error code
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] CDB: Read(10): 28 00 00 3d 19 20 00 00 30 00
Mar 26 12:27:39 harrison kernel: end_request: I/O error, dev sde, sector 4004128
Mar 26 12:27:39 harrison kernel: raid1: sde1: rescheduling sector 4002016
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] Unhandled error code
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] CDB: Write(10): 2a 00 00 90 00 50 00 00 02 00
Mar 26 12:27:39 harrison kernel: end_request: I/O error, dev sde, sector 9437264
Mar 26 12:27:39 harrison kernel: md: super_written gets error=-5, uptodate=0
Mar 26 12:27:39 harrison kernel: raid5: Disk failure on sde5, disabling device.
Mar 26 12:27:39 harrison kernel: raid5: Operation continuing on 5 devices.
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] Unhandled error code
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] CDB: Write(10): 2a 00 00 00 00 48 00 00 02 00
Mar 26 12:27:39 harrison kernel: end_request: I/O error, dev sde, sector 72
Mar 26 12:27:39 harrison kernel: md: super_written gets error=-5, uptodate=0
Mar 26 12:27:39 harrison kernel: raid1: Disk failure on sde1, disabling device.
Mar 26 12:27:39 harrison kernel: raid1: Operation continuing on 5 devices.
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] Unhandled error code
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 26 12:27:39 harrison kernel: sd 4:0:0:0: [sde] CDB: Read(10): 28 00 00 5d 0f a8 00 00 08 00
Mar 26 12:27:39 harrison kernel: end_request: I/O error, dev sde, sector 6098856
Mar 26 12:27:39 harrison kernel: raid1: sde1: rescheduling sector 6096744
Mar 26 12:27:39 harrison kernel: RAID5 conf printout:
Mar 26 12:27:39 harrison kernel: --- rd:6 wd:5
Mar 26 12:27:39 harrison kernel: disk 0, o:1, dev:sda5
Mar 26 12:27:39 harrison kernel: disk 1, o:1, dev:sdb5
Mar 26 12:27:39 harrison kernel: disk 2, o:1, dev:sdc5
Mar 26 12:27:39 harrison kernel: disk 3, o:1, dev:sdd5
Mar 26 12:27:39 harrison kernel: disk 4, o:0, dev:sde5
Mar 26 12:27:39 harrison kernel: disk 5, o:1, dev:sdf5
Mar 26 12:27:39 harrison kernel: RAID5 conf printout:
Mar 26 12:27:39 harrison kernel: --- rd:6 wd:5
Mar 26 12:27:39 harrison kernel: disk 0, o:1, dev:sda5
Mar 26 12:27:39 harrison kernel: disk 1, o:1, dev:sdb5
Mar 26 12:27:39 harrison kernel: disk 2, o:1, dev:sdc5
Mar 26 12:27:39 harrison kernel: disk 3, o:1, dev:sdd5
Mar 26 12:27:39 harrison kernel: disk 5, o:1, dev:sdf5
Mar 26 12:27:39 harrison kernel: RAID1 conf printout:
Mar 26 12:27:39 harrison kernel: --- wd:5 rd:6
Mar 26 12:27:39 harrison kernel: disk 0, wo:0, o:1, dev:sda1
Mar 26 12:27:39 harrison kernel: disk 1, wo:0, o:1, dev:sdb1
Mar 26 12:27:39 harrison kernel: disk 2, wo:0, o:1, dev:sdc1
Mar 26 12:27:39 harrison kernel: disk 3, wo:0, o:1, dev:sdd1
Mar 26 12:27:39 harrison kernel: disk 4, wo:1, o:0, dev:sde1
Mar 26 12:27:39 harrison kernel: disk 5, wo:0, o:1, dev:sdf1
Mar 26 12:27:39 harrison kernel: RAID1 conf printout:
Mar 26 12:27:39 harrison kernel: --- wd:5 rd:6
Mar 26 12:27:39 harrison kernel: disk 0, wo:0, o:1, dev:sda1
Mar 26 12:27:39 harrison kernel: disk 1, wo:0, o:1, dev:sdb1
Mar 26 12:27:39 harrison kernel: disk 2, wo:0, o:1, dev:sdc1
Mar 26 12:27:39 harrison kernel: disk 3, wo:0, o:1, dev:sdd1
Mar 26 12:27:39 harrison kernel: disk 5, wo:0, o:1, dev:sdf1
Mar 26 12:27:39 harrison kernel: raid1: sdc1: redirecting sector 4002016 to another mirror
Mar 26 12:27:39 harrison kernel: raid1: sdb1: redirecting sector 6096744 to another mirror
Mar 26 12:28:11 harrison RAIDiator: Disk failure detected.\n\nIf the failed disk is used in a RAID level 1, 5, or X-RAID volume, please note that volume is now unprotected, and an additional disk failure may render that volume dead. If this disk is a part of a RAID 6 volume, your volume is still protected if this is your first failure. A 2nd disk failure will make your volume unprotected. It is recommended that you replace the failed disk as soon as possible to maintain optimal protection of your volume.\n\n[Sat Mar 26 12:28:06 PDT 2011]
Mar 26 12:37:18 harrison shutdown[21227]: shutting down for system reboot
Mar 26 12:37:19 harrison init: Switching to runlevel: 6
Mar 26 12:37:20 harrison kernel: ata1.00: configured for UDMA/133
Mar 26 12:37:20 harrison kernel: ata1: EH complete
Mar 26 12:37:20 harrison kernel: sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Mar 26 12:37:20 harrison kernel: ata2.00: configured for UDMA/133
Mar 26 12:37:20 harrison kernel: ata2: EH complete
Mar 26 12:37:20 harrison kernel: sd 1:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Mar 26 12:37:21 harrison kernel: ata3.00: configured for UDMA/133
Mar 26 12:37:21 harrison kernel: ata3: EH complete
Mar 26 12:37:21 harrison kernel: sd 2:0:0:0: [sdc] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Mar 26 12:37:21 harrison kernel: ata4.00: configured for UDMA/133
Mar 26 12:37:21 harrison kernel: ata4: EH complete
Mar 26 12:37:21 harrison kernel: sd 3:0:0:0: [sdd] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Mar 26 12:37:21 harrison kernel: ata6.00: configured for UDMA/133
Mar 26 12:37:21 harrison kernel: ata6: EH complete
Mar 26 12:37:21 harrison kernel: sd 5:0:0:0: [sdf] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Mar 26 12:37:24 harrison exiting on signal 15
Mar 26 12:40:15 harrison syslogd 1.4.1#18: restart.
Mar 26 12:40:15 harrison kernel: klogd 1.4.1#18, log source = /proc/kmsg started.
Mar 26 12:40:15 harrison kernel: Linux version 2.6.33.7.RNx86_64.2.2
System.log for the second dead drive (Disk 2):
Mar 27 12:00:06 harrison ntpdate[26530]: no server suitable for synchronization found
Mar 27 12:11:34 harrison cnid_dbd[26669]: Set syslog logging to level: LOG_NOTE
Mar 27 12:14:50 harrison kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Mar 27 12:15:52 harrison kernel: ata2.00: failed command: FLUSH CACHE EXT
Mar 27 12:15:52 harrison kernel: ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Mar 27 12:15:52 harrison kernel: res 40/00:20:98:40:39/00:00:01:00:00/40 Emask 0x4 (timeout)
Mar 27 12:15:52 harrison kernel: ata2.00: status: { DRDY }
Mar 27 12:15:52 harrison kernel: ata2: hard resetting link
Mar 27 12:15:52 harrison kernel: ata2: link is slow to respond, please be patient (ready=0)
Mar 27 12:15:52 harrison kernel: ata2: COMRESET failed (errno=-16)
Mar 27 12:15:52 harrison kernel: ata2: hard resetting link
Mar 27 12:15:52 harrison kernel: ata2: link is slow to respond, please be patient (ready=0)
Mar 27 12:15:52 harrison kernel: ata2: COMRESET failed (errno=-16)
Mar 27 12:15:52 harrison kernel: ata2: hard resetting link
Mar 27 12:15:52 harrison kernel: ata2: link is slow to respond, please be patient (ready=0)
Mar 27 12:15:52 harrison kernel: ata2: COMRESET failed (errno=-16)
Mar 27 12:15:52 harrison kernel: ata2: limiting SATA link speed to 1.5 Gbps
Mar 27 12:15:52 harrison kernel: ata2: hard resetting link
Mar 27 12:15:52 harrison kernel: ata2: COMRESET failed (errno=-16)
Mar 27 12:15:52 harrison kernel: ata2: reset failed, giving up
Mar 27 12:15:52 harrison kernel: ata2.00: disabled
Mar 27 12:15:52 harrison kernel: ata2: EH complete
Mar 27 12:15:52 harrison kernel: end_request: I/O error, dev sdb, sector 0
Mar 27 12:15:52 harrison kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 27 12:15:52 harrison kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 27 12:15:52 harrison kernel: sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 00 90 00 50 00 00 02 00
Mar 27 12:15:52 harrison kernel: end_request: I/O error, dev sdb, sector 9437264
Mar 27 12:15:52 harrison kernel: end_request: I/O error, dev sdb, sector 9437264
Mar 27 12:15:52 harrison kernel: **************** super written barrier kludge on md2: error==IO 0xfffffffb
Mar 27 12:15:52 harrison kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 27 12:15:52 harrison kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 27 12:15:52 harrison kernel: sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 00 00 00 48 00 00 02 00
Mar 27 12:15:52 harrison kernel: end_request: I/O error, dev sdb, sector 72
Mar 27 12:15:52 harrison kernel: end_request: I/O error, dev sdb, sector 72
Mar 27 12:15:52 harrison kernel: **************** super written barrier kludge on md0: error==IO 0xfffffffb
Mar 27 12:15:52 harrison kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 27 12:15:52 harrison kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 27 12:15:52 harrison kernel: sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 2b 89 b8 00 00 08 00
Mar 27 12:15:52 harrison kernel: end_request: I/O error, dev sdb, sector 2853304
Mar 27 12:15:52 harrison kernel: raid1: sdb1: rescheduling sector 2851192
Mar 27 12:15:52 harrison kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 27 12:15:52 harrison kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 27 12:15:52 harrison kernel: sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 00 90 00 50 00 00 02 00
Mar 27 12:15:52 harrison kernel: end_request: I/O error, dev sdb, sector 9437264
Mar 27 12:15:52 harrison kernel: md: super_written gets error=-5, uptodate=0
Mar 27 12:15:52 harrison kernel: raid5: Disk failure on sdb5, disabling device.
Mar 27 12:15:52 harrison kernel: raid5: Operation continuing on 5 devices.
Mar 27 12:15:52 harrison kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 27 12:15:52 harrison kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 27 12:15:52 harrison kernel: sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 00 00 00 48 00 00 02 00
Mar 27 12:15:52 harrison kernel: end_request: I/O error, dev sdb, sector 72
Mar 27 12:15:52 harrison kernel: md: super_written gets error=-5, uptodate=0
Mar 27 12:15:52 harrison kernel: raid1: Disk failure on sdb1, disabling device.
Mar 27 12:15:52 harrison kernel: raid1: Operation continuing on 5 devices.
Mar 27 12:15:52 harrison kernel: RAID5 conf printout:
Mar 27 12:15:52 harrison kernel: --- rd:6 wd:5
Mar 27 12:15:52 harrison kernel: disk 0, o:1, dev:sda5
Mar 27 12:15:52 harrison kernel: disk 1, o:0, dev:sdb5
Mar 27 12:15:52 harrison kernel: disk 2, o:1, dev:sdc5
Mar 27 12:15:52 harrison kernel: disk 3, o:1, dev:sdd5
Mar 27 12:15:52 harrison kernel: disk 4, o:1, dev:sde5
Mar 27 12:15:52 harrison kernel: disk 5, o:1, dev:sdf5
Mar 27 12:15:52 harrison kernel: RAID5 conf printout:
Mar 27 12:15:52 harrison kernel: --- rd:6 wd:5
Mar 27 12:15:52 harrison kernel: disk 0, o:1, dev:sda5
Mar 27 12:15:52 harrison kernel: disk 2, o:1, dev:sdc5
Mar 27 12:15:52 harrison kernel: disk 3, o:1, dev:sdd5
Mar 27 12:15:52 harrison kernel: disk 4, o:1, dev:sde5
Mar 27 12:15:52 harrison kernel: disk 5, o:1, dev:sdf5
Mar 27 12:15:52 harrison kernel: RAID1 conf printout:
Mar 27 12:15:52 harrison kernel: --- wd:5 rd:6
Mar 27 12:15:52 harrison kernel: disk 0, wo:0, o:1, dev:sda1
Mar 27 12:15:52 harrison kernel: disk 1, wo:1, o:0, dev:sdb1
Mar 27 12:15:52 harrison kernel: disk 2, wo:0, o:1, dev:sdc1
Mar 27 12:15:52 harrison kernel: disk 3, wo:0, o:1, dev:sdd1
Mar 27 12:15:52 harrison kernel: disk 4, wo:0, o:1, dev:sde1
Mar 27 12:15:52 harrison kernel: disk 5, wo:0, o:1, dev:sdf1
Mar 27 12:15:52 harrison kernel: RAID1 conf printout:
Mar 27 12:15:52 harrison kernel: --- wd:5 rd:6
Mar 27 12:15:52 harrison kernel: disk 0, wo:0, o:1, dev:sda1
Mar 27 12:15:52 harrison kernel: disk 2, wo:0, o:1, dev:sdc1
Mar 27 12:15:52 harrison kernel: disk 3, wo:0, o:1, dev:sdd1
Mar 27 12:15:52 harrison kernel: disk 4, wo:0, o:1, dev:sde1
Mar 27 12:15:52 harrison kernel: disk 5, wo:0, o:1, dev:sdf1
Mar 27 12:15:52 harrison kernel: raid1: sda1: redirecting sector 2851192 to another mirror
Mar 27 12:16:00 harrison RAIDiator: Disk failure detected.\n\nIf the failed disk is used in a RAID level 1, 5, or X-RAID volume, please note that volume is now unprotected, and an additional disk failure may render that volume dead. If this disk is a part of a RAID 6 volume, your volume is still protected if this is your first failure. A 2nd disk failure will make your volume unprotected. It is recommended that you replace the failed disk as soon as possible to maintain optimal protection of your volume.\n\n[Sun Mar 27 12:15:53 PDT 2011]
Mar 27 12:16:45 harrison kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 27 12:16:45 harrison kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 27 12:16:45 harrison kernel: sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 00 80 00 48 00 00 02 00
Mar 27 12:16:45 harrison kernel: end_request: I/O error, dev sdb, sector 8388680
Mar 27 12:16:45 harrison kernel: end_request: I/O error, dev sdb, sector 8388680
Mar 27 12:16:45 harrison kernel: **************** super written barrier kludge on md1: error==IO 0xfffffffb
Mar 27 12:16:45 harrison kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 27 12:16:45 harrison kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 27 12:16:45 harrison kernel: sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 00 80 00 48 00 00 02 00
Mar 27 12:16:45 harrison kernel: end_request: I/O error, dev sdb, sector 8388680
Mar 27 12:16:45 harrison kernel: md: super_written gets error=-5, uptodate=0
Mar 27 12:16:45 harrison kernel: raid5: Disk failure on sdb2, disabling device.
Mar 27 12:16:45 harrison kernel: raid5: Operation continuing on 5 devices.
Mar 27 12:16:45 harrison kernel: RAID5 conf printout:
Mar 27 12:16:45 harrison kernel: --- rd:6 wd:5
Mar 27 12:16:45 harrison kernel: disk 0, o:1, dev:sda2
Mar 27 12:16:45 harrison kernel: disk 1, o:0, dev:sdb2
Mar 27 12:16:45 harrison kernel: disk 2, o:1, dev:sdc2
Mar 27 12:16:45 harrison kernel: disk 3, o:1, dev:sdd2
Mar 27 12:16:45 harrison kernel: disk 4, o:1, dev:sde2
Mar 27 12:16:45 harrison kernel: disk 5, o:1, dev:sdf2
Mar 27 12:16:45 harrison kernel: RAID5 conf printout:
Mar 27 12:16:45 harrison kernel: --- rd:6 wd:5
Mar 27 12:16:45 harrison kernel: disk 0, o:1, dev:sda2
Mar 27 12:16:45 harrison kernel: disk 2, o:1, dev:sdc2
Mar 27 12:16:45 harrison kernel: disk 3, o:1, dev:sdd2
Mar 27 12:16:45 harrison kernel: disk 4, o:1, dev:sde2
Mar 27 12:16:45 harrison kernel: disk 5, o:1, dev:sdf2
Mar 27 12:25:51 harrison kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Mar 27 12:26:04 harrison kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Mar 27 12:26:09 harrison kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 27 12:26:09 harrison kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 27 12:26:09 harrison kernel: sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 20 00
Mar 27 12:26:09 harrison kernel: end_request: I/O error, dev sdb, sector 0
Mar 27 12:26:09 harrison kernel: Buffer I/O error on device sdb, logical block 0
Mar 27 12:26:09 harrison kernel: Buffer I/O error on device sdb, logical block 1
Mar 27 12:26:09 harrison kernel: Buffer I/O error on device sdb, logical block 2
Mar 27 12:26:09 harrison kernel: Buffer I/O error on device sdb, logical block 3
Mar 27 12:26:09 harrison kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 27 12:26:09 harrison kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 27 12:26:09 harrison kernel: sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Mar 27 12:26:09 harrison kernel: end_request: I/O error, dev sdb, sector 0
Mar 27 12:26:09 harrison kernel: Buffer I/O error on device sdb, logical block 0
Mar 27 12:26:09 harrison kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 27 12:26:09 harrison kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 27 12:26:09 harrison kernel: sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 e8 e0 88 a8 00 00 08 00
Mar 27 12:26:09 harrison kernel: end_request: I/O error, dev sdb, sector 3907029160
Mar 27 12:26:09 harrison kernel: Buffer I/O error on device sdb, logical block 488378645
Mar 27 12:26:09 harrison kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 27 12:26:09 harrison kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 27 12:26:09 harrison kernel: sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 e8 e0 88 a8 00 00 08 00
Mar 27 12:26:09 harrison kernel: end_request: I/O error, dev sdb, sector 3907029160
Mar 27 12:26:09 harrison kernel: Buffer I/O error on device sdb, logical block 488378645
Mar 27 12:26:09 harrison kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 27 12:26:09 harrison kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 27 12:26:09 harrison kernel: sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Mar 27 12:26:09 harrison kernel: end_request: I/O error, dev sdb, sector 0
Mar 27 12:26:09 harrison kernel: Buffer I/O error on device sdb, logical block 0
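The kernel log signature is identical for both drives: a FLUSH CACHE EXT timeout, repeated COMRESET failures on the link, then the ATA port being disabled. A rough sketch of how you could scan a downloaded system.log for that pattern yourself (the sample lines are abbreviated from the logs above):

```python
import re

# Sketch: find ATA ports that timed out on FLUSH CACHE EXT and were then
# disabled by the kernel -- the signature seen on both "dead" drives above.
def disabled_ports(log_text):
    timed_out, disabled = set(), set()
    for line in log_text.splitlines():
        m = re.search(r"(ata\d+)\.00: failed command: FLUSH CACHE EXT", line)
        if m:
            timed_out.add(m.group(1))
        m = re.search(r"(ata\d+)\.00: disabled", line)
        if m:
            disabled.add(m.group(1))
    # Only report ports that showed both halves of the signature.
    return sorted(timed_out & disabled)

sample = """\
Mar 27 12:15:52 harrison kernel: ata2.00: failed command: FLUSH CACHE EXT
Mar 27 12:15:52 harrison kernel: ata2: COMRESET failed (errno=-16)
Mar 27 12:15:52 harrison kernel: ata2.00: disabled
"""
print(disabled_ports(sample))  # -> ['ata2']
```

Since the failing command is a cache flush rather than a media read or write, no bad sectors ever get logged, which would explain why SMART looks clean afterwards.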
Thanks for any input someone might be able to offer. This certainly seems like a pattern...
Best,
- CitizenPlain
Message 11 of 145
2011-03-27 01:30 PM
Re: Disk Failure Detected...
Hi,
I've raised a support call with Netgear for my problem (I made sure to mention this thread and that other people seemed to be having similar issues).
The support guy has recommended that first of all I run the SeaTools full test on all the disks -- I haven't gotten around to this yet because I need to dig out a PC (generally I only use MacBooks - I wonder if that is part of the pattern of failure?).
Once that is done (assuming all disks pass), he recommended doing an OS reinstall on the NAS (apparently an OS reinstall does not wipe your data or config (except for network settings), whereas a factory reset would erase _everything_).
If that doesn't help, he reckoned that it might indicate a hardware problem on the NAS.
In the meantime, I would recommend that you raise a separate support call, but also point the support guys at this thread. Can you let us know of any results / progress?
Message 12 of 145
2011-03-28 01:01 PM
Re: Disk Failure Detected...
Looking to finally upgrade my ReadyNAS 600 (YES, that OLD v1.0 Infrant box with the squirrel-cage fan!) to an Ultra-6 Plus, and considering these 2TB Seagate ST2000DL003 drives to fill it out.
Anyone in core ReadyNAS/NetGear support weighing in on these drives yet?
Message 13 of 145
2011-04-01 02:40 AM
Re: Disk Failure Detected...
Just a quick update on this... I've run the Seagate SeaTools long test on all 6 disks and all 6 passed.
I've done an OS reinstall on the NAS following the instructions the Netgear support guy sent me... not sure if that's fixed the problem or not.
I'll report back here if I have any more problems.
Anyone else had any more failures like this?
Message 14 of 145
2011-04-01 06:00 AM
Re: Disk Failure Detected...
So far so good with no new issues other than a couple of SMART error alerts (now up to like four?).
I've got a second ReadyNAS at work doing rsync backups every night and a spare Seagate drive on the shelf in anticipation of failure...
Kevin
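For reference, the kind of nightly NAS-to-NAS rsync job Kevin describes can be a one-line cron entry; the sketch below just assembles the rsync command without running it, and the source path and "backup-nas" destination are placeholders, not real hosts from this thread:

```python
# Sketch: build an rsync command for a nightly mirror of one NAS share to
# another. The paths/hostnames here are hypothetical examples.
def rsync_command(src, dest, dry_run=False):
    cmd = ["rsync", "-a", "--delete"]  # archive mode; mirror deletions too
    if dry_run:
        cmd.append("--dry-run")        # rehearse without changing anything
    cmd += [src, dest]
    return cmd

# Trailing slash on src copies the share's contents, not the directory itself.
print(" ".join(rsync_command("/media/share/", "backup-nas::share/", dry_run=True)))
```

Running with `dry_run=True` first is a cheap way to confirm the mirror will do what you expect before trusting it with a degraded array.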
Message 15 of 145
2011-04-02 08:01 AM
Re: Disk Failure Detected...
Looks like I may have spoken too soon, as I've just had another disk failure event... 😞
Kevin
Message 16 of 145
2011-04-06 08:42 PM
Re: Disk Failure Detected...
Now I've had my third, this time on disk 3 (previously it was 4, then 6). All of these are the same Seagate drives. Disks 1 & 2 are Hitachi and have yet to report a failure. Did the shutdown, pull, and reseat; the drive reports fine with no SMART errors and is resyncing.
Message 17 of 145
2011-04-06
08:44 PM
Re: Disk Failure Detected...
The disk could still be failing/dead. I would suggest backing up your data, then powering down and running the "Test Disks" boot option: http://home.bott.ca/webserver/?p=252
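One way to sanity-check the SMART data yourself, beyond what FrontView shows: if you can get at the attribute table (e.g. `smartctl -A` on a Linux box with the drive attached), the raw values of a handful of attributes are the ones worth watching. A small sketch, assuming the standard smartctl table layout -- the sample text below is illustrative only, not from any drive in this thread:

```python
# Parse the attribute table printed by `smartctl -A` and flag the
# counters most often tied to real failures. SAMPLE is illustrative
# text, not output from any drive in this thread.
SUSPECT = {"Reallocated_Sector_Ct", "Current_Pending_Sector",
           "Offline_Uncorrectable", "UDMA_CRC_Error_Count"}

SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

def suspect_attrs(text):
    """Return {attribute_name: raw_value} for suspect attributes with a nonzero raw value."""
    flagged = {}
    for line in text.splitlines():
        fields = line.split()
        # smartctl -A rows have 10 columns; name is column 2, raw value is column 10
        if len(fields) >= 10 and fields[1] in SUSPECT:
            raw = int(fields[9])
            if raw:
                flagged[fields[1]] = raw
    return flagged

print(suspect_attrs(SAMPLE))  # prints {'Current_Pending_Sector': 8}
```

A nonzero Current_Pending_Sector or Reallocated_Sector_Ct is a real warning sign, whereas a drive that the NAS flags "dead" while all of these stay at zero fits the false-positive theory being discussed here.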
Message 18 of 145
2011-04-06
08:58 PM
Re: Disk Failure Detected...
mdgm wrote: The disk could still be failing/dead. I would suggest backing up your data, then powering down and running the "Test Disks" boot option: http://home.bott.ca/webserver/?p=252
Thanks mdgm, I'll give that a shot tonight. I've run it through the various standard boot tests available through the Web UI but haven't tried this yet...
Kevin
Message 19 of 145
2011-04-06
09:13 PM
Re: Disk Failure Detected...
I just had another one yesterday during a half-terabyte file copy to the RAID. Disk 1 this time (previously 5 & 2, as described earlier by me in this thread) with the same Seagate ST2000DL003-9VT166 drives.
Haven't had a chance to run SeaTools or report to Netgear support yet. Was able to pull the drive, and it resynced without issue as before. It only seems to happen when I'm copying data to the drive -- that is to say, it'll sit there running without issue or do periodic Time Machine backups, but at some point during a large file transfer, I get a disk failure.
I'll try running the Test Disk option that mdgm describes now as well.
Message 20 of 145
2011-04-07
09:27 AM
Re: Disk Failure Detected...
First bootup of six of these new 2TB drives in an Ultra 6 Plus, and already one, disk #4, is declared "dead", only 1% into the first X-RAID2 resync.
Hard disk #1: Seagate, Seagate ST2000DL003-9VT166, 5YD17K5Z, CC32
Hard disk #2: Seagate, Seagate ST2000DL003-9VT166, 5YD1ZELB, CC32
Hard disk #3: Seagate, Seagate ST2000DL003-9VT166, 5YD1YR60, CC32
Hard disk #4: Seagate, Seagate ST2000DL003-9VT166, 5YD1JQ7Y, CC32
Hard disk #5: Seagate, Seagate ST2000DL003-9VT166, 5YD1QM9A, CC32
Hard disk #6: Seagate, Seagate ST2000DL003-9VT166, 5YD1Z766, CC32
Oy.
Message 21 of 145
2011-04-07
09:31 AM
Re: Disk Failure Detected...
It happens. A courier drops the disks or there's some manufacturing defect or something else. There's a whole range of possible causes. Disks can and do fail at any time.
Message 22 of 145
2011-04-07
09:41 AM
Re: Disk Failure Detected...
I ran the full boot-menu disk check last night; it was done by the morning. No error messages, abnormal SMART info, or anything obvious in the logs/health. Is there a particular file in the complete log ZIP where the results of the boot-menu disk check end up? I guess the next steps would still be SeaTools and contacting Netgear support?
Message 23 of 145
2011-04-07
09:45 AM
Re: Disk Failure Detected...
I'd buy the perspective that those with issues in this thread are just seeing the odd bad disk here or there, but the full story seems to show otherwise: the disks aren't actually bad, but are errantly being flagged as bad for some reason. Sounds like a bug to me, at the OS or FW level. My next step is likely to be like CitizenPlain's latest: run a full (perhaps overkill considering the downtime it requires!) disk check from the boot menu. I bet no issues will be found. Then what? Wait until the next likely false-positive drive failure?
I run an R&D lab datacenter with hundreds of SATA, SCSI, and FC hard-drives. Drives do die. This issue sniffs of a code bug somewhere to me.
Message 24 of 145
2011-04-07
09:50 AM
Re: Disk Failure Detected...
It's especially troublesome that all reports have been with Seagate ST2000DL003-9VT166 model drives, which leads me to think it's a firmware issue with these drives?
I'll run the full disk check tonight on mine but expect the same results as CitizenPlain.
Kevin
Message 25 of 145