Forum Discussion

sdchew
Aspirant
Apr 16, 2013

SMART errors - What do you do?

I got a message from my ReadyNAS informing me that one of the disks is experiencing a SMART event which may point to an impending drive failure. I looked at the SMART table and it's basically the 'Current Pending Sector', 'Offline Uncorrectable' and 'Multi Zone Error Rate' counts that are rising.

I run a 5-disk dual-redundancy array. I would like to seek some consensus on what can be done to postpone replacing the drive, if possible.

Should I:

a) don't mess around and replace the drive
b) pull the drive, low level format it and pop it back.
c) increase the disk scrubbing and volume consistency check frequency to safeguard against data corruption and ignore it until it fails

Any advice would be appreciated. Thank you

10 Replies

  • SMART table, just in case someone is interested:
    SMART Attribute             Value
    Raw Read Error Rate         0
    Spin Up Time                6233
    Start Stop Count            210
    Reallocated Sector Count    0
    Seek Error Rate             0
    Power On Hours              11376
    Spin Retry Count            0
    Calibration Retry Count     0
    Power Cycle Count           208
    Power-Off Retract Count     15
    Load Cycle Count            507035
    Temperature Celsius         40
    Reallocated Event Count     0
    Current Pending Sector      17
    Offline Uncorrectable       17
    UDMA CRC Error Count        0
    Multi Zone Error Rate       27
    ATA Error Count             0

    By the way, if I choose option B and the array rebuild fails, the data should still be intact thanks to dual redundancy, right?
  • Well, the SMART data could be wrong, but the "Load Cycle Count" seems very, very high. Can't say more without knowing what drive it is, but I'd check the manufacturer's specs. My guess is, if the number is correct, that the disk is parking itself after a short time-out and then un-parking. But check: SMART data can vary between manufacturers.

    RMA the disk - I assume it's still in warranty - but replace ASAP.
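To put the "very, very high" load cycle count in perspective, here is a rough sketch that divides it by the power-on hours. Both figures are copied from the SMART table posted above; the variable and function names are made up for illustration, and the result is only an average over the drive's powered-on life.

```python
# Figures copied from the SMART table posted earlier in the thread.
load_cycle_count = 507035
power_on_hours = 11376


def cycles_per_hour(load_cycles: int, hours: int) -> float:
    """Average head load/unload cycles per power-on hour."""
    return load_cycles / hours


rate = cycles_per_hour(load_cycle_count, power_on_hours)
seconds_between_parks = 3600 / rate

print(f"{rate:.1f} load cycles per power-on hour")
print(f"roughly one park/un-park every {seconds_between_parks:.0f} seconds")
```

That works out to a bit under 45 cycles per hour, or about one every 80 seconds on average, which fits the guess that the heads were parking after a short idle time-out.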
  • StephenB
    Guru - Experienced User
    The key errors are the "current pending sector" and "offline uncorrectable". "Current Pending sectors" is incremented when the disk can't read a sector. On a write request, the sector would have been reallocated.

    17 sectors is probably not high enough for the vendor tools to say the drive is bad. I would plan to replace it anyway - especially if the count suddenly jumped from near-0 up to 17.

    If it is under warranty, you would get a refurbished drive. Generally I won't put refurbished drives in my RAID array; I put them somewhere else.

    You should probably check the warranty status, as you may only have a 12 month warranty (which would be expired) anyway. If it is under warranty, then do a full diagnostic (read and write tests). The bad sector count will likely go up. Note that the write test would be destructive, so you should probably replace the drive first.

    On the load cycle count, I have one WD30EZRX with >785000 load cycles, and it is still working fine. My own experience with Green drives [so far] is that the load cycle specs are extremely conservative. However, I no longer put them in RAID arrays; I have replaced them in my primary NAS with WDC Red drives.

    sdchew wrote:
    ...a) don't mess around and replace the drive
    b) pull the drive, low level format it and pop it back.
    c) increase the disk scrubbing and volume consistency check frequency to safeguard against data corruption and ignore it until it fails

    (a) is what I would do if it were my drive. Since I have full backups, I might watch it a bit longer. But usually the failure counts rapidly accelerate once they begin climbing into double digits.

    A variant of (b) is to run full read/write diagnostics with vendor tools, and then reexamine the SMART stats. Though I wouldn't put it back into the array in any case.

    (c) I wouldn't increase the frequencies. Scrubbing in particular will increase the stress on all the drives.
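As a rough illustration of the reading given in this post, the sketch below parses the SMART table exactly as it was pasted in the thread and reports any non-zero value among the sector-health attributes. The watch list (pending, offline uncorrectable, reallocated) and the "anything above zero" rule are assumptions made for this example, not a vendor threshold; the function names are hypothetical.

```python
# SMART table as pasted in the thread: attribute name followed by its value.
SMART_TEXT = """\
Raw Read Error Rate 0
Spin Up Time 6233
Start Stop Count 210
Reallocated Sector Count 0
Seek Error Rate 0
Power On Hours 11376
Spin Retry Count 0
Calibration Retry Count 0
Power Cycle Count 208
Power-Off Retract Count 15
Load Cycle Count 507035
Temperature Celsius 40
Reallocated Event Count 0
Current Pending Sector 17
Offline Uncorrectable 17
UDMA CRC Error Count 0
Multi Zone Error Rate 27
ATA Error Count 0
"""

# Assumption: these are the attributes worth watching for bad sectors.
WATCH = {
    "Current Pending Sector",
    "Offline Uncorrectable",
    "Reallocated Sector Count",
}


def parse_smart(text: str) -> dict[str, int]:
    """Split each line at the last space into (attribute name, integer value)."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, value = line.rpartition(" ")
        table[name] = int(value)
    return table


def flagged(table: dict[str, int], watch: set[str] = WATCH) -> dict[str, int]:
    """Return watched attributes whose value is above zero."""
    return {name: value for name, value in table.items()
            if name in watch and value > 0}


for name, value in flagged(parse_smart(SMART_TEXT)).items():
    print(f"{name}: {value}")
```

On this table it reports only the two attributes called out above, both at 17. Logging the output over time would show whether the counts are accelerating, which is the trend this post warns about.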
  • Yeah, they are WD Green drives, and I've been putting in Red drives since they appeared on the market. I did modify the Green drives to ensure they don't park so often.

    Warranty is probably expired on the drive so I'll chuck it then...
  • StephenB wrote:
    No harm in checking on-line at https://westerndigital.secure.force.com ... ck?lang=en

    Your power-on hours suggest you purchased very close to the time when WDC reduced the warranty to 1 year. You might have gotten the tail-end of the 3 year warranty.


    Your guess is spot on... I did a quick check using the link and it indeed has a 3 year warranty on it. Ends 2014.

    I'll probably RMA the drive and think of what to do with it later...
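The purchase-date guess from power-on hours can be reproduced with simple date arithmetic. This is only a sketch: it assumes the SMART reading was taken on the date of the original post (Apr 16, 2013) and treats the hours as if the drive had run continuously, so the result is the latest date the drive could first have been powered on. The variable names are made up for illustration.

```python
from datetime import date, timedelta

power_on_hours = 11376            # from the SMART table in the thread
reading_date = date(2013, 4, 16)  # assumption: date of the original post

# Convert powered-on time to days and count back from the reading date.
days_powered = power_on_hours / 24
latest_first_power_on = reading_date - timedelta(days=days_powered)

print(f"{days_powered:.0f} days of power-on time")
print(f"first powered on no later than {latest_first_power_on}")
```

That gives 474 days, putting first power-on at the end of December 2011 or earlier, which is consistent with a 3 year warranty that ends in 2014.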
  • StephenB
    Guru - Experienced User
    sdchew wrote:
    I'll probably RMA the drive and think of what to do with it later...
    That's my usual approach. Usually it ends up in a desktop or something less critical than my NAS. Sometimes it becomes an emergency spare.
  • I had a similar issue with the load/unload on a Green drive. My take is that they are either not tested beyond the spec (unlikely) or that beyond the spec there are reliability issues. I know the WD drives are rated at 300k cycles, hence I would think that at 170% of that it is a flag I would pay attention to. But of course it could go for another 10 years or fail in 10 seconds! :-)

    Just to note that replacement drives are "re-certified", not refurbished. One (not I) could argue that since they have undergone even more testing than a new drive they should be at least as reliable.... :-) Most likely they are just "no fault" returns, replaced mobo or firmware upgraded. The only "unknown" is how much action they have seen, since I guess they wipe the SMART stats. That's the only reason I won't put my data on them!
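For reference, the "170%" figure comes from comparing the posted load cycle count with the 300k rating quoted in this post. The sketch below just redoes that division; the rating is taken from the post as-is and has not been checked against a datasheet.

```python
rated_load_cycles = 300_000     # rating quoted in the post (assumption, unverified)
observed_load_cycles = 507_035  # Load Cycle Count from the SMART table

# Express the observed count as a percentage of the rated count.
percent_of_rating = observed_load_cycles / rated_load_cycles * 100

print(f"{percent_of_rating:.0f}% of the rated load cycles")
```

The result is about 169%, so "170%" is a fair rounding.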
  • StephenB
    Guru - Experienced User
    Recertified is the right word of course. It wouldn't be cost effective to try and repair them.

    They certainly do wipe the SMART stats. I got one a couple weeks ago from Seagate with 1 reallocated sector, so I guess they leave some of them alone. But not power on hours, spin ups, load cycle, etc.
  • Well I bought a new WD Red to replace it. Rebuilding the array went smoothly.

    I ran WD's Data Lifeguard tool on the bad drive and it won't even complete. It says there are multiple unrecoverable sectors. I wonder how it was even functional in the NAS in the first place.
