Bad Drive on Readynas 516

ddoming73
Aspirant

Bad Drive on Readynas 516

Hi,

My Readynas 516 has been running very slowly for some time, but the web interface did not show any alert.

I logged in over SSH and saw the following messages filling up the system logs.


Logs begin at Sun, 08 Mar 2015 20:05:33 +0100, end at Sun, 08 Mar 2015 21:01:49 +0100.
Mar 08 20:05:33 domnet kernel: ata6.00: status: { DRDY ERR }
Mar 08 20:05:33 domnet kernel: ata6.00: error: { UNC }
Mar 08 20:05:33 domnet kernel: ata6.00: configured for UDMA/133
Mar 08 20:05:33 domnet kernel: ata6: EH complete
Mar 08 20:05:34 domnet kernel: ata6.00: exception Emask 0x0 SAct 0x1000000 SErr 0x0 action 0x0
Mar 08 20:05:34 domnet kernel: ata6.00: irq_stat 0x40000008
Mar 08 20:05:34 domnet kernel: ata6.00: failed command: READ FPDMA QUEUED
Mar 08 20:05:34 domnet kernel: ata6.00: cmd 60/08:c0:48:00:00/00:00:00:00:00/40 tag 24 ncq 4096 in
Mar 08 20:05:34 domnet kernel: res 41/40:00:49:00:00/00:00:00:00:00/40 Emask 0x409 (media error) <F>
Mar 08 20:05:34 domnet kernel: ata6.00: status: { DRDY ERR }
Mar 08 20:05:34 domnet kernel: ata6.00: error: { UNC }
Mar 08 20:05:34 domnet kernel: ata6.00: configured for UDMA/133
Mar 08 20:05:34 domnet kernel: sd 5:0:0:0: [sdf] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Mar 08 20:05:34 domnet kernel: sd 5:0:0:0: [sdf] Sense Key : Medium Error [current] [descriptor]
Mar 08 20:05:34 domnet kernel: Descriptor sense data with sense descriptors (in hex):
Mar 08 20:05:34 domnet kernel: 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Mar 08 20:05:34 domnet kernel: 00 00 00 49
Mar 08 20:05:34 domnet kernel: sd 5:0:0:0: [sdf] Add. Sense: Unrecovered read error - auto reallocate failed
Mar 08 20:05:34 domnet kernel: sd 5:0:0:0: [sdf] CDB: Read(10): 28 00 00 00 00 48 00 00 08 00
Mar 08 20:05:34 domnet kernel: end_request: I/O error, dev sdf, sector 73
Mar 08 20:05:34 domnet kernel: Buffer I/O error on device sdf1, logical block 1
Mar 08 20:05:34 domnet kernel: ata6: EH complete
Mar 08 20:05:34 domnet kernel: ata6.00: exception Emask 0x0 SAct 0x2000000 SErr 0x0 action 0x0
Mar 08 20:05:34 domnet kernel: ata6.00: irq_stat 0x40000008
Mar 08 20:05:34 domnet kernel: ata6.00: failed command: READ FPDMA QUEUED
Mar 08 20:05:34 domnet kernel: ata6.00: cmd 60/08:c8:48:00:00/00:00:00:00:00/40 tag 25 ncq 4096 in
Mar 08 20:05:34 domnet kernel: res 41/40:00:49:00:00/00:00:00:00:00/40 Emask 0x409 (media error) <F>
Mar 08 20:05:34 domnet kernel: ata6.00: status: { DRDY ERR }
Mar 08 20:05:34 domnet kernel: ata6.00: error: { UNC }
Mar 08 20:05:34 domnet kernel: ata6.00: configured for UDMA/133
Mar 08 20:05:34 domnet kernel: ata6: EH complete
Mar 08 20:05:34 domnet kernel: ata6.00: exception Emask 0x0 SAct 0x4000000 SErr 0x0 action 0x0
Mar 08 20:05:34 domnet kernel: ata6.00: irq_stat 0x40000008
Mar 08 20:05:34 domnet kernel: ata6.00: failed command: READ FPDMA QUEUED
Mar 08 20:05:34 domnet kernel: ata6.00: cmd 60/08:d0:48:00:00/00:00:00:00:00/40 tag 26 ncq 4096 in
Mar 08 20:05:34 domnet kernel: res 41/40:00:49:00:00/00:00:00:00:00/40 Emask 0x409 (media error) <F>
Mar 08 20:05:34 domnet kernel: ata6.00: status: { DRDY ERR }
Mar 08 20:05:34 domnet kernel: ata6.00: error: { UNC }
Mar 08 20:05:34 domnet kernel: ata6.00: configured for UDMA/133
Mar 08 20:05:34 domnet kernel: ata6: EH complete
Mar 08 20:05:34 domnet kernel: ata6.00: exception Emask 0x0 SAct 0x8000000 SErr 0x0 action 0x0
Mar 08 20:05:34 domnet kernel: ata6.00: irq_stat 0x40000008
Mar 08 20:05:34 domnet kernel: ata6.00: failed command: READ FPDMA QUEUED
Mar 08 20:05:34 domnet kernel: ata6.00: cmd 60/08:d8:48:00:00/00:00:00:00:00/40 tag 27 ncq 4096 in
Mar 08 20:05:34 domnet kernel: res 41/40:00:49:00:00/00:00:00:00:00/40 Emask 0x409 (media error) <F>
Mar 08 20:05:34 domnet kernel: ata6.00: status: { DRDY ERR }
Mar 08 20:05:34 domnet kernel: ata6.00: error: { UNC }
Mar 08 20:05:34 domnet kernel: ata6.00: configured for UDMA/133
Mar 08 20:05:34 domnet kernel: ata6: EH complete
Mar 08 20:05:34 domnet kernel: ata6.00: exception Emask 0x0 SAct 0x10000000 SErr 0x0 action 0x0
Mar 08 20:05:34 domnet kernel: ata6.00: irq_stat 0x40000008
Mar 08 20:05:34 domnet kernel: ata6.00: failed command: READ FPDMA QUEUED
Mar 08 20:05:34 domnet kernel: ata6.00: cmd 60/08:e0:48:00:00/00:00:00:00:00/40 tag 28 ncq 4096 in
Mar 08 20:05:34 domnet kernel: res 41/40:00:49:00:00/00:00:00:00:00/40 Emask 0x409 (media error) <F>
Mar 08 20:05:34 domnet kernel: ata6.00: status: { DRDY ERR }
Mar 08 20:05:34 domnet kernel: ata6.00: error: { UNC }
Mar 08 20:05:34 domnet kernel: ata6.00: configured for UDMA/133
Mar 08 20:05:34 domnet kernel: ata6: EH complete
Mar 08 20:05:34 domnet kernel: ata6.00: exception Emask 0x0 SAct 0x20000000 SErr 0x0 action 0x0
Mar 08 20:05:34 domnet kernel: ata6.00: irq_stat 0x40000008
Mar 08 20:05:34 domnet kernel: ata6.00: failed command: READ FPDMA QUEUED
Mar 08 20:05:34 domnet kernel: ata6.00: cmd 60/08:e8:48:00:00/00:00:00:00:00/40 tag 29 ncq 4096 in
Mar 08 20:05:34 domnet kernel: res 41/40:00:49:00:00/00:00:00:00:00/40 Emask 0x409 (media error) <F>
Mar 08 20:05:34 domnet kernel: ata6.00: status: { DRDY ERR }
Mar 08 20:05:34 domnet kernel: ata6.00: error: { UNC }
Mar 08 20:05:34 domnet kernel: ata6.00: configured for UDMA/133
Mar 08 20:05:34 domnet kernel: ata6: EH complete
Mar 08 20:05:34 domnet kernel: ata6.00: exception Emask 0x0 SAct 0x40000000 SErr 0x0 action 0x0
Mar 08 20:05:34 domnet kernel: ata6.00: irq_stat 0x40000008
Mar 08 20:05:34 domnet kernel: ata6.00: failed command: READ FPDMA QUEUED


It seems to me that I have a bad drive. I then paid attention to the front panel on the unit, and it does seem like one of the drives (the bottom one) is being accessed rather more than the rest (its LED lights up a lot). Might this drive be bad?
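
For anyone reading a log like this later, here is a minimal Python sketch of how the repeated errors can be tied to one disk and one sector. It assumes the libata "res" line lists its bytes in the order status, error, nsect, lbal, lbam, lbah, hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah, device; the function names `decode_res_lba` and `summarize` are made up for this illustration and are not part of any ReadyNAS tool.

```python
import re
from collections import Counter

# Matches lines such as: "end_request: I/O error, dev sdf, sector 73"
IO_ERROR = re.compile(r"I/O error, dev (\w+), sector (\d+)")
# Matches the libata result taskfile, e.g. "res 41/40:00:49:00:00/00:00:00:00:00/40"
RES_TASKFILE = re.compile(r"res ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")


def decode_res_lba(taskfile):
    """Rebuild the 48-bit LBA from a libata 'res' taskfile string.

    Assumed field order: status, error, nsect, lbal, lbam, lbah,
    hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah, device.
    """
    b = [int(x, 16) for x in re.split(r"[/:]", taskfile)]
    low = b[3] | (b[4] << 8) | (b[5] << 16)     # lbal, lbam, lbah
    high = b[8] | (b[9] << 8) | (b[10] << 16)   # hob_lbal, hob_lbam, hob_lbah
    return (high << 24) | low


def summarize(log_text):
    """Count block-layer I/O errors per (device, sector) and failing LBAs."""
    errors = Counter()
    lbas = Counter()
    for line in log_text.splitlines():
        m = IO_ERROR.search(line)
        if m:
            errors[(m.group(1), int(m.group(2)))] += 1
        m = RES_TASKFILE.search(line)
        if m:
            lbas[decode_res_lba(m.group(1))] += 1
    return errors, lbas
```

Applied to the result line quoted above, `decode_res_lba("41/40:00:49:00:00/00:00:00:00:00/40")` gives 73, which agrees with the "dev sdf, sector 73" message in the same log. Every failed read in the excerpt reports that same address, so the disk appears to be retrying a single unreadable sector over and over rather than failing across the whole surface.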
Message 1 of 10
StephenB
Guru

Re: Bad Drive on Readynas 516

You could try booting up w/o it, and see if the log continues.
Message 2 of 10
vandermerwe
Master

Re: Bad Drive on Readynas 516

Is the smart data for that drive ( and the others) normal?

Back it up. Use UPS.
Infrant Readynas NV+ | Readynas Ultra 6 Plus | Readynas 316
Message 3 of 10
ddoming73
Aspirant

Re: Bad Drive on Readynas 516

Stephen: Rebooting makes no difference.

vandermerwe: The SMART data does not seem to be accessible. When I hover the cursor over the drive LED in the drive list, a pop-up appears, but it is blank, with a spinning circle (the typical "please wait...").
Message 4 of 10
ddoming73
Aspirant

Re: Bad Drive on Readynas 516

Sorry Stephen,

I initially did not understand your message. I have not yet tried to remove the drive and reboot. I was hoping for more input from experts here before messing with the RAID array.

But it certainly is something to try.
Message 5 of 10
StephenB
Guru

Re: Bad Drive on Readynas 516

ddoming73 wrote:
vandermerwe: The SMART data does not seem to be accessible. When I hover the cursor over the drive LED in the drive list, a pop-up appears, but it is blank, with a spinning circle (the typical "please wait...").
Does it do this on every drive, or just the bottom one?

You can download the logs, and look in smart_history.log and disk_info.log.
Message 6 of 10
mdgm-ntgr
NETGEAR Employee Retired

Re: Bad Drive on Readynas 516

Your smart_history log does show one of the disks has a non-zero current pending sector count, though the count is only 1 and that happened back in early January.
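
As a rough illustration of what "non-zero current pending sector count" means in practice, the sketch below pulls the raw values of the usual warning attributes out of a SMART attribute table. It assumes the text is in the column layout printed by `smartctl -A` (ID, name, flag, value, worst, thresh, type, updated, when_failed, raw value); the layout of smart_history.log may differ, and the function names are hypothetical.

```python
# Attributes commonly watched for media problems (names as printed by smartctl).
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from `smartctl -A` style table text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row starts with "ID#" and is skipped by the isdigit() check.
        if len(fields) >= 10 and fields[0].isdigit():
            raw = fields[9]
            if raw.isdigit():
                attrs[fields[1]] = int(raw)
    return attrs


def suspicious(attrs):
    """Return the watched attributes whose raw value is above zero."""
    return {name: attrs[name] for name in WATCHED if attrs.get(name, 0) > 0}
```

Feeding `parse_smart_attributes` the table for each disk and then calling `suspicious` on the result would flag a disk like the one described here, since even a pending count of 1 is reported. A small count does not prove the disk is failing, but combined with repeated media errors in the kernel log it is a reasonable hint about which disk to look at first.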
Message 7 of 10
ddoming73
Aspirant

Re: Bad Drive on Readynas 516

After some help from mdgm, we reached the conclusion that one of the drives is faulty (/dev/sdf).

I'll replace it and see if things improve.

Thanks everybody for your input.
Message 8 of 10
ddoming73
Aspirant

Re: Bad Drive on Readynas 516

Hi,

Got a new drive, and after a rebuild my NAS is back to normal.

Just some comments:

The failure mode I experienced was very confusing. I had a degraded NAS that gave no explicit warning as to what the problem was.

Netgear should look into why a drive with an apparently simple error can bring the NAS to its knees in performance, and also bring down the error diagnostics system. Maybe some improvements need to be made.
Message 9 of 10
StephenB
Guru

Re: Bad Drive on Readynas 516

ddoming73 wrote:
...The failure mode I experienced was very confusing. I had a degraded NAS that gave no explicit warning as to what the problem was.

Netgear should look into why a drive with an apparently simple error can bring the NAS to its knees in performance, and also bring down the error diagnostics system. Maybe some improvements need to be made.
A lot of things can go wrong with drives - I had an unusual error a few months ago, where the drive suddenly failed in a way that locked up the SATA bus. Hot-inserting a new disk had no effect, because the SATA failure resulted in the driver shutting down the interface altogether.

I know this was the cause because Netgear analyzed the logs. So this is an area that they definitely keep an eye on.

Fixing all of these unusual cases is likely not possible though, unless you are willing to rewrite Linux drivers and kernel code. Changing that level of code has its own risks (and bugs there would have a lot of bad consequences).
Message 10 of 10