Forum Discussion
bluewomble
Mar 13, 2011 - Aspirant
Disk Failure Detected...
I've recently purchased a ReadyNAS Ultra 6 along with six 2 TB Seagate ST2000DL003 disks (which are on the HCL).
I've set up the NAS in a dual-redundancy X-RAID2 configuration and have started copying all the data over the network from my old ReadyNAS NV to the new Ultra 6...
About halfway through copying (on 6th March), I got a disk failure detected (on channel 4). I powered down the NAS, took the disk out and reinserted it, assuming there might be some kind of connection problem... When I powered back up, it detected the disk, tested it and started to resync (which takes about 24 hours)... I left it alone while it did that, and then it seemed to be OK, so I started copying the rest of my data across. There is nothing in the SMART+ log for disk 4 which would indicate that there was ever a problem with that disk.
A few minutes ago I got another disk failure (this time on channel 2). Exactly the same story... I powered down and then back up again, the disk came back to life, and the NAS started testing and resyncing it... again, there is nothing in the SMART+ log for disk 2 which indicates (to me at least) that there was ever a problem.
After both occasions I've downloaded the system logs from the NAS, but I'm not sure what to do with them. Is there something in the log which would show exactly what failed?
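For what it's worth, a rough Python sketch like the one below would pull the suspicious lines out of the log. The filename and the patterns are just my guesses at what matters here, nothing official:

import re

# Lines that flag trouble in the kernel log: ATA exceptions, failed link
# resets, I/O errors, and md kicking a member out of an array.
SUSPICIOUS = re.compile(
    r"exception Emask|failed command|COMRESET failed|reset failed|"
    r"I/O error|Disk failure on"
)

with open("system.log") as log:  # path to the downloaded log file
    for line in log:
        if SUSPICIOUS.search(line):
            print(line.rstrip())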
Any idea what's going on here? Have I got a couple of dud disks which need to be sent back, or is there something else going on? If they are duds, I'd need to be able to prove to the retailer that they were... the only indication I have of a problem is that the ReadyNAS Ultra 6 _said_ they had failed... but both seem to be working fine now.
Thanks,
Ash.
P.S. Here's the SMART+ report from disk 2:
SMART Information for Disk 2
Model: ST2000DL003-9VT166
Serial: 5YD2196G
Firmware: CC32
SMART Attribute                Value
Spin Up Time                   0
Start Stop Count               12
Reallocated Sector Count       0
Power On Hours                 151
Spin Retry Count               0
Power Cycle Count              12
Reported Uncorrect             0
High Fly Writes                0
Airflow Temperature Cel        42
G-Sense Error Rate             0
Power-Off Retract Count        6
Load Cycle Count               12
Temperature Celsius            42
Current Pending Sector         0
Offline Uncorrectable          0
UDMA CRC Error Count           0
Head Flying Hours              221474283585687
ATA Error Count                0
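As far as I can tell, every counter that would point at genuine media trouble is zero. Here's the kind of quick check I have in mind, with the values copied from the report above; the "bad if non-zero" rule is just my own rough assumption, not a vendor threshold:

# SMART counters from the report above that should stay at zero on a
# healthy disk. The non-zero-means-suspect rule is my assumption only.
smart = {
    "Reallocated Sector Count": 0,
    "Spin Retry Count": 0,
    "Reported Uncorrect": 0,
    "Current Pending Sector": 0,
    "Offline Uncorrectable": 0,
    "UDMA CRC Error Count": 0,
    "ATA Error Count": 0,
}

suspect = {name: value for name, value in smart.items() if value > 0}
print("suspect attributes:", suspect if suspect else "none")

Nothing there hints at a failing disk, which is exactly what puzzles me.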
This looks like the appropriate section of system.log for the failure which occurred today:
Mar 13 20:00:09 ultranas ntpdate[11162]: step time server 194.238.48.3 offset 0.310812 sec
Mar 13 20:16:27 ultranas kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Mar 13 20:16:27 ultranas kernel: ata2.00: failed command: FLUSH CACHE EXT
Mar 13 20:16:27 ultranas kernel: ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Mar 13 20:16:27 ultranas kernel: res 40/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
Mar 13 20:16:27 ultranas kernel: ata2.00: status: { DRDY }
Mar 13 20:16:27 ultranas kernel: ata2: hard resetting link
Mar 13 20:16:33 ultranas kernel: ata2: link is slow to respond, please be patient (ready=0)
Mar 13 20:16:37 ultranas kernel: ata2: COMRESET failed (errno=-16)
Mar 13 20:16:37 ultranas kernel: ata2: hard resetting link
Mar 13 20:16:43 ultranas kernel: ata2: link is slow to respond, please be patient (ready=0)
Mar 13 20:16:47 ultranas kernel: ata2: COMRESET failed (errno=-16)
Mar 13 20:16:47 ultranas kernel: ata2: hard resetting link
Mar 13 20:16:53 ultranas kernel: ata2: link is slow to respond, please be patient (ready=0)
Mar 13 20:17:23 ultranas kernel: ata2: COMRESET failed (errno=-16)
Mar 13 20:17:23 ultranas kernel: ata2: limiting SATA link speed to 1.5 Gbps
Mar 13 20:17:23 ultranas kernel: ata2: hard resetting link
Mar 13 20:17:28 ultranas kernel: ata2: COMRESET failed (errno=-16)
Mar 13 20:17:28 ultranas kernel: ata2: reset failed, giving up
Mar 13 20:17:28 ultranas kernel: ata2.00: disabled
Mar 13 20:17:28 ultranas kernel: ata2.00: device reported invalid CHS sector 0
Mar 13 20:17:28 ultranas kernel: ata2: EH complete
Mar 13 20:17:28 ultranas kernel: end_request: I/O error, dev sdb, sector 0
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 00 90 00 50 00 00 02 00
Mar 13 20:17:28 ultranas kernel: end_request: I/O error, dev sdb, sector 9437264
Mar 13 20:17:28 ultranas kernel: end_request: I/O error, dev sdb, sector 9437264
Mar 13 20:17:28 ultranas kernel: **************** super written barrier kludge on md2: error==IO 0xfffffffb
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 00 00 00 48 00 00 02 00
Mar 13 20:17:28 ultranas kernel: end_request: I/O error, dev sdb, sector 72
Mar 13 20:17:28 ultranas kernel: end_request: I/O error, dev sdb, sector 72
Mar 13 20:17:28 ultranas kernel: **************** super written barrier kludge on md0: error==IO 0xfffffffb
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 51 8f 30 00 00 28 00
Mar 13 20:17:28 ultranas kernel: end_request: I/O error, dev sdb, sector 5345072
Mar 13 20:17:28 ultranas kernel: raid1: sdb1: rescheduling sector 5342960
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 00 90 00 50 00 00 02 00
Mar 13 20:17:28 ultranas kernel: end_request: I/O error, dev sdb, sector 9437264
Mar 13 20:17:28 ultranas kernel: md: super_written gets error=-5, uptodate=0
Mar 13 20:17:28 ultranas kernel: raid5: Disk failure on sdb5, disabling device.
Mar 13 20:17:28 ultranas kernel: raid5: Operation continuing on 5 devices.
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 13 20:17:28 ultranas kernel: sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 00 00 00 48 00 00 02 00
Mar 13 20:17:28 ultranas kernel: end_request: I/O error, dev sdb, sector 72
Mar 13 20:17:28 ultranas kernel: md: super_written gets error=-5, uptodate=0
Mar 13 20:17:28 ultranas kernel: raid1: Disk failure on sdb1, disabling device.
Mar 13 20:17:28 ultranas kernel: raid1: Operation continuing on 5 devices.
Mar 13 20:17:28 ultranas kernel: RAID5 conf printout:
Mar 13 20:17:28 ultranas kernel: --- rd:6 wd:5
Mar 13 20:17:28 ultranas kernel: disk 0, o:1, dev:sda5
Mar 13 20:17:28 ultranas kernel: disk 1, o:0, dev:sdb5
Mar 13 20:17:28 ultranas kernel: disk 2, o:1, dev:sdc5
Mar 13 20:17:28 ultranas kernel: disk 3, o:1, dev:sdd5
Mar 13 20:17:28 ultranas kernel: disk 4, o:1, dev:sde5
Mar 13 20:17:28 ultranas kernel: disk 5, o:1, dev:sdf5
Mar 13 20:17:28 ultranas kernel: RAID5 conf printout:
Mar 13 20:17:28 ultranas kernel: --- rd:6 wd:5
Mar 13 20:17:28 ultranas kernel: disk 0, o:1, dev:sda5
Mar 13 20:17:28 ultranas kernel: disk 2, o:1, dev:sdc5
Mar 13 20:17:28 ultranas kernel: disk 3, o:1, dev:sdd5
Mar 13 20:17:28 ultranas kernel: disk 4, o:1, dev:sde5
Mar 13 20:17:28 ultranas kernel: disk 5, o:1, dev:sdf5
Mar 13 20:17:28 ultranas kernel: RAID1 conf printout:
Mar 13 20:17:28 ultranas kernel: --- wd:5 rd:6
Mar 13 20:17:28 ultranas kernel: disk 0, wo:0, o:1, dev:sda1
Mar 13 20:17:28 ultranas kernel: disk 1, wo:1, o:0, dev:sdb1
Mar 13 20:17:28 ultranas kernel: disk 2, wo:0, o:1, dev:sdc1
Mar 13 20:17:28 ultranas kernel: disk 3, wo:0, o:1, dev:sdd1
Mar 13 20:17:28 ultranas kernel: disk 4, wo:0, o:1, dev:sde1
Mar 13 20:17:28 ultranas kernel: disk 5, wo:0, o:1, dev:sdf1
Mar 13 20:17:28 ultranas kernel: RAID1 conf printout:
Mar 13 20:17:28 ultranas kernel: --- wd:5 rd:6
Mar 13 20:17:28 ultranas kernel: disk 0, wo:0, o:1, dev:sda1
Mar 13 20:17:28 ultranas kernel: disk 2, wo:0, o:1, dev:sdc1
Mar 13 20:17:28 ultranas kernel: disk 3, wo:0, o:1, dev:sdd1
Mar 13 20:17:28 ultranas kernel: disk 4, wo:0, o:1, dev:sde1
Mar 13 20:17:28 ultranas kernel: disk 5, wo:0, o:1, dev:sdf1
Mar 13 20:17:28 ultranas kernel: raid1: sdf1: redirecting sector 5342960 to another mirror
Mar 13 20:17:32 ultranas RAIDiator: Disk failure detected.\n\nIf the failed disk is used in a RAID level 1, 5, or X-RAID volume, please note that volume is now unprotected, and an additional disk failure may render that volume dead. If this disk is a part of a RAID 6 volume, your volume is still protected if this is your first failure. A 2nd disk failure will make your volume unprotected. It is recommended that you replace the failed disk as soon as possible to maintain optimal protection of your volume.\n\n[Sun Mar 13 20:17:29 WET 2011]
Mar 13 20:20:24 ultranas kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
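Reading that trace through: the drive stopped answering a FLUSH CACHE EXT command, every COMRESET on ata2 failed, so the kernel disabled the device, and md then dropped sdb1/sdb5 from the arrays -- which is what fired the "Disk failure detected" alert. The whole cascade boils down to two tell-tale messages, so a throwaway sketch like this could summarise an event (the regexes are my guesses at the message formats):

import re

# Two tell-tale messages: the kernel disabling the ATA port, and md
# kicking the member out of an array. Regexes are guesses at the formats.
disabled = re.compile(r"kernel: (ata[\d.]+): disabled")
kicked = re.compile(r"kernel: raid\d+: Disk failure on (\w+)")

with open("system.log") as log:
    for line in log:
        if m := disabled.search(line):
            print("port disabled:", m.group(1))
        elif m := kicked.search(line):
            print("array member kicked:", m.group(1))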
And here is what looks like the relevant part of the log from the failure on 6th March -- the same signature, this time on ata4/sdd:
Mar 6 16:00:07 nas-EA-A6-42 ntpdate[12452]: step time server 62.84.188.34 offset -0.103568 sec
Mar 6 18:48:21 nas-EA-A6-42 kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Mar 6 18:48:22 nas-EA-A6-42 kernel: ata4.00: failed command: FLUSH CACHE EXT
Mar 6 18:48:22 nas-EA-A6-42 kernel: ata4.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Mar 6 18:48:22 nas-EA-A6-42 kernel: res 40/00:00:b8:f7:0e/00:00:00:00:00/40 Emask 0x4 (timeout)
Mar 6 18:48:22 nas-EA-A6-42 kernel: ata4.00: status: { DRDY }
Mar 6 18:48:22 nas-EA-A6-42 kernel: ata4: hard resetting link
Mar 6 18:48:27 nas-EA-A6-42 kernel: ata4: link is slow to respond, please be patient (ready=0)
Mar 6 18:48:32 nas-EA-A6-42 kernel: ata4: COMRESET failed (errno=-16)
Mar 6 18:48:32 nas-EA-A6-42 kernel: ata4: hard resetting link
Mar 6 18:48:37 nas-EA-A6-42 kernel: ata4: link is slow to respond, please be patient (ready=0)
Mar 6 18:48:42 nas-EA-A6-42 kernel: ata4: COMRESET failed (errno=-16)
Mar 6 18:48:42 nas-EA-A6-42 kernel: ata4: hard resetting link
Mar 6 18:48:47 nas-EA-A6-42 kernel: ata4: link is slow to respond, please be patient (ready=0)
Mar 6 18:49:17 nas-EA-A6-42 kernel: ata4: COMRESET failed (errno=-16)
Mar 6 18:49:17 nas-EA-A6-42 kernel: ata4: limiting SATA link speed to 1.5 Gbps
Mar 6 18:49:17 nas-EA-A6-42 kernel: ata4: hard resetting link
Mar 6 18:49:22 nas-EA-A6-42 kernel: ata4: COMRESET failed (errno=-16)
Mar 6 18:49:22 nas-EA-A6-42 kernel: ata4: reset failed, giving up
Mar 6 18:49:22 nas-EA-A6-42 kernel: ata4.00: disabled
Mar 6 18:49:22 nas-EA-A6-42 kernel: ata4: EH complete
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 00 00 00 48 00 00 02 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 72
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 72
Mar 6 18:49:22 nas-EA-A6-42 kernel: **************** super written barrier kludge on md0: error==IO 0xfffffffb
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 00 93 9e 80 00 00 08 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 9674368
Mar 6 18:49:22 nas-EA-A6-42 kernel: raid5: Disk failure on sdd5, disabling device.
Mar 6 18:49:22 nas-EA-A6-42 kernel: raid5: Operation continuing on 5 devices.
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 34 c5 68 48 00 00 80 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 885352520
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 34 c6 f0 c8 00 00 50 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 885453000
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 00 91 28 c8 00 00 38 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 9513160
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 00 91 29 10 00 00 10 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 9513232
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 00 91 29 28 00 00 10 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 9513256
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 00 91 29 40 00 00 08 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 9513280
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 00 93 88 48 00 00 08 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 9668680
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 00 93 a1 90 00 00 10 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 9675152
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 34 c5 38 48 00 00 08 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 885340232
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 34 c5 64 48 00 00 80 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 885351496
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 34 c6 f1 18 00 00 30 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 885453080
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 00 80 00 48 00 00 02 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 8388680
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 8388680
Mar 6 18:49:22 nas-EA-A6-42 kernel: **************** super written barrier kludge on md1: error==IO 0xfffffffb
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 00 31 8d 58 00 00 28 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 3247448
Mar 6 18:49:22 nas-EA-A6-42 kernel: raid1: sdd1: rescheduling sector 3245336
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB:
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Unhandled error code
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Mar 6 18:49:22 nas-EA-A6-42 kernel: sd 3:0:0:0: [sdd] CDB: Write(10)Write(10): 2a 00 00 00 00 48 00 00 02 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 72
Mar 6 18:49:22 nas-EA-A6-42 kernel: :md: super_written gets error=-5, uptodate=0
Mar 6 18:49:22 nas-EA-A6-42 kernel: 2a
Mar 6 18:49:22 nas-EA-A6-42 kernel: raid1: Disk failure on sdd1, disabling device.
Mar 6 18:49:22 nas-EA-A6-42 kernel: raid1: Operation continuing on 5 devices.
Mar 6 18:49:22 nas-EA-A6-42 kernel: 00 00 80 00 48 00 00 02 00
Mar 6 18:49:22 nas-EA-A6-42 kernel: end_request: I/O error, dev sdd, sector 8388680
Mar 6 18:49:22 nas-EA-A6-42 kernel: md: super_written gets error=-5, uptodate=0
Mar 6 18:49:22 nas-EA-A6-42 kernel: raid5: Disk failure on sdd2, disabling device.
Mar 6 18:49:22 nas-EA-A6-42 kernel: raid5: Operation continuing on 5 devices.
Mar 6 18:49:23 nas-EA-A6-42 kernel: RAID1 conf printout:
Mar 6 18:49:23 nas-EA-A6-42 kernel: --- wd:5 rd:6
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 0, wo:0, o:1, dev:sda1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 1, wo:0, o:1, dev:sdb1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 2, wo:0, o:1, dev:sdc1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 3, wo:1, o:0, dev:sdd1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 4, wo:0, o:1, dev:sde1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 5, wo:0, o:1, dev:sdf1
Mar 6 18:49:23 nas-EA-A6-42 kernel: RAID1 conf printout:
Mar 6 18:49:23 nas-EA-A6-42 kernel: --- wd:5 rd:6
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 0, wo:0, o:1, dev:sda1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 1, wo:0, o:1, dev:sdb1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 2, wo:0, o:1, dev:sdc1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 4, wo:0, o:1, dev:sde1
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 5, wo:0, o:1, dev:sdf1
Mar 6 18:49:23 nas-EA-A6-42 kernel: RAID5 conf printout:
Mar 6 18:49:23 nas-EA-A6-42 kernel: --- rd:6 wd:5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 0, o:1, dev:sda5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 1, o:1, dev:sdb5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 2, o:1, dev:sdc5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 3, o:0, dev:sdd5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 4, o:1, dev:sde5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 5, o:1, dev:sdf5
Mar 6 18:49:23 nas-EA-A6-42 kernel: RAID5 conf printout:
Mar 6 18:49:23 nas-EA-A6-42 kernel: --- rd:6 wd:5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 0, o:1, dev:sda5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 1, o:1, dev:sdb5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 2, o:1, dev:sdc5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 4, o:1, dev:sde5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 5, o:1, dev:sdf5
Mar 6 18:49:23 nas-EA-A6-42 kernel: RAID5 conf printout:
Mar 6 18:49:23 nas-EA-A6-42 kernel: --- rd:6 wd:5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 0, o:1, dev:sda2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 1, o:1, dev:sdb2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 2, o:1, dev:sdc2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 3, o:0, dev:sdd2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 4, o:1, dev:sde2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 5, o:1, dev:sdf2
Mar 6 18:49:23 nas-EA-A6-42 kernel: raid1: sdb1: redirecting sector 3245336 to another mirror
Mar 6 18:49:23 nas-EA-A6-42 kernel: RAID5 conf printout:
Mar 6 18:49:23 nas-EA-A6-42 kernel: --- rd:6 wd:5
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 0, o:1, dev:sda2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 1, o:1, dev:sdb2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 2, o:1, dev:sdc2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 4, o:1, dev:sde2
Mar 6 18:49:23 nas-EA-A6-42 kernel: disk 5, o:1, dev:sdf2
Mar 6 18:49:53 nas-EA-A6-42 RAIDiator: Disk failure detected.\n\nIf the failed disk is used in a RAID level 1, 5, or X-RAID volume, please note that volume is now unprotected, and an additional disk failure may render that volume dead. If this disk is a part of a RAID 6 volume, your volume is still protected if this is your first failure. A 2nd disk failure will make your volume unprotected. It is recommended that you replace the failed disk as soon as possible to maintain optimal protection of your volume.\n\n[Sun Mar 6 18:49:51 WET 2011]
144 Replies
- opt2bout (Aspirant): I'm on day 2 with the new 4.2.20-T15, so it's been running 24 hours now. But in the past it would always take a couple of days before it would fail. The problem appears to be with the new firmware and these specific drives from Seagate. If you downgrade to 4.2.17 you won't have the problems (I ran two full months with 4.2.17 and two of these drives, with no issues at all). Of course, if you have any computers running Mac OS X Lion, you can't do that. They are working on trying to figure out why the newer firmware is failing these drives.
- Pian (Aspirant): I was running .20-beta when my disk "failed" while the unit was being controlled by Level 3 support.
- thestumper (Aspirant): I'm almost convinced at this point that it is a Time Machine compatibility problem; there seems to be a lot of anecdotal evidence piling up that suggests the problem lies there. It doesn't make me feel any better - Time Machine AND these drives were supported when I bought the NAS. Right now I have my Ultra 4+ sitting here in "Debug Mode" waiting for a Level 3 engineer to log in and poke around. My guess is that they will find exactly what they found the last time they logged in: nothing.
I'm tempted to use another backup program to see if the problem persists. Carbon Copy Cloner would probably work for me, and it would be interesting to see how long it would go without failure running that instead of Time Machine. The problem with that is, well... I let Netgear off the hook and they end up closing my case without actually SOLVING the problem. Plus, I do like Time Machine for what it is and leverage it on multiple machines here.
Does everyone here have a case open for this? I would be curious to see how many there are, plus it might be helpful to reference other cases in our own cases to keep support focussed. I would especially like to see what Pian finds out, as his unit failed while the tech was logged in!
- Pian (Aspirant): My case is still open, and I chase my L3 contact once a week or so to see if there has been any progress (none so far).
One interesting aside - I bought four of these disks for my NVX. But I had my first "failure" after I had only put three in, so I decided against using the fourth and was going to buy another (pre Thai floods). While it was lying around I thought I would use it in my Time Capsule - into which I had already installed a 1 TB non-Apple disk with no issues a year or two ago. I put it in and have been using Time Machine to back up a number of machines to it for a number of weeks. BUT... every week to ten days something goes wrong and the Time Machine clients cannot find the Time Capsule (NAConnectToServerSync error 64), even though I can still access it using the Time Machine utility. The problem is solved by rebooting the Time Capsule.
Anecdotal evidence to support your thesis of an incompatibility between these drives and Time Machine.
- bokvast (Aspirant): Got an answer from an L3 that they are working on a fix, and it is expected to arrive in Q1 2012.
- thestumper (Aspirant): That is reasonably good news. Q1 2012 is technically only a few weeks away. Of course, it could be the END of Q1, which would put us into March, but it's a good sign. Do you mind posting your case number so that I can reference it in mine?
My case #: 17077472
I worry that we could have various L3 techs running in different directions on this. Probably just paranoia, but if I get confirmation that there IS an acknowledged problem and it WILL be fixed, I will leverage another solution for Time Machine that will at least keep my NAS running. I don't want to let anyone off the hook until we know what the issue is and that it is solvable. My NAS is still sitting in Debug mode at the moment...
-Eric
- bokvast (Aspirant): Yeah, sure:
17259922
- opt2bout (Aspirant):
thestumper wrote: Does everyone here have a case open for this? I would be curious to see how many there are, plus it might be helpful to reference other cases in our own cases to keep support focussed. I would especially like to see what Plan finds out, as his unit failed when the tech was logged in!
My case has been open for 4 months now, since September. Still open. 4.2.20-T15 lasted 3 days... it seems it's 3 days and... bam! Dead drive.
- Pian (Aspirant): I am 16756789, case open since 23 Sep - but my L3 is pushing me to close the case even though I still have a problem.
He says that Engineering say that the problem is with Seagate, not Netgear. And Netgear may take the drive off the HCL if Seagate don't solve ...
- opt2bout (Aspirant):
Pian wrote: He says that Engineering say that the problem is with Seagate, not Netgear. And Netgear may take the drive off the HCL if Seagate don't solve ...
Puts me in a hard place if this is true. Another comment earlier stated that there is a fix planned for Q1 2012. I have four of these drives. Only two installed--afraid to pursue replacing older/smaller drives with these.
However, a tech did refer me to a discussion on Seagate's user forum, but there is no official statement from Seagate. And to put things into perspective: if Netgear firmware 4.2.17 doesn't fail these drives (this all started with 4.2.19 and above), then how exactly is it the drive's fault? I put 4.2.17 back on and the drives lasted two months... only when I updated again to the 4.2.19 release did a drive fail within a few days (4.2.18 actually worked, but was retracted because of a security bug with AFP?).
I can't ship these drives back to anyone and ask for a replacement or refund, so I'm committed. I really hope Netgear's final position isn't to point fingers and walk away from this--but if it is, they should at least update their HCL so others won't follow this disaster as we have.