dmahon1
Aspirant
Jul 02, 2015

ATA errors on one disk, failed other disk

On 6/1/15 I got the following email:

Detected increasing ATA errors on disk 4[ST32000542AS, 5XW1M471] 40 times in the past 30 days. This often indicates an impending failure. Please be prepared to replace this disk to maintain data redundancy.


On 10/1/15 I got:

Detected increasing ATA errors on disk 4[ST32000542AS, 5XW1M471] 171 times in the past 30 days. This often indicates an impending failure. Please be prepared to replace this disk to maintain data redundancy.


On 1/2/15 I got a differently formatted message:

ATA error count has increased in the last day.

Disk 4:
Previous count: 5
Current count: 170

Growing SMART errors indicate a disk that may fail soon. If the errors continue to increase, you should be prepared to replace the disk.


On 27/6/15 I got another:


Detected increasing ATA errors on disk 4[ST32000542AS, 5XW1M471] 182 times in the past 30 days. This often indicates an impending failure. Please be prepared to replace this disk to maintain data redundancy.


These alerts are all for the same disk.
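For anyone wanting to line these alerts up side by side, here is a minimal Python sketch that pulls the disk number, model, serial and count out of the "Detected increasing ATA errors" wording quoted above. The regular expression and the `parse_alert` helper are my own and only assume the message format shown in this post; other firmware versions may word it differently.

```python
import re

# Pattern for the "Detected increasing ATA errors" alert text quoted in this post.
# Assumption: the wording is exactly as in the emails above.
ALERT_RE = re.compile(
    r"Detected increasing ATA errors on disk (?P<disk>\d+)"
    r"\[(?P<model>[^,\]]+),\s*(?P<serial>[^\]]+)\]\s+"
    r"(?P<count>\d+) times in the past (?P<days>\d+) days"
)

def parse_alert(text):
    """Return the fields of one alert as a dict, or None if the text does not match."""
    m = ALERT_RE.search(text)
    if m is None:
        return None
    return {
        "disk": int(m["disk"]),
        "model": m["model"].strip(),
        "serial": m["serial"].strip(),
        "count": int(m["count"]),
        "window_days": int(m["days"]),
    }

# The three alerts of this format from the post, in the order received.
alerts = [
    "Detected increasing ATA errors on disk 4[ST32000542AS, 5XW1M471] 40 times in the past 30 days.",
    "Detected increasing ATA errors on disk 4[ST32000542AS, 5XW1M471] 171 times in the past 30 days.",
    "Detected increasing ATA errors on disk 4[ST32000542AS, 5XW1M471] 182 times in the past 30 days.",
]

parsed = [parse_alert(a) for a in alerts]
serials = {p["serial"] for p in parsed}   # one entry means every alert is for the same drive
counts = [p["count"] for p in parsed]     # reported counts, in order

print("Serials seen:", serials)
print("Counts reported:", counts)
```

This only tabulates what the emails say; it does not explain how the NAS arrives at each figure.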

The SMART data for the disk suggests:


Model: ST32000542AS
Serial: 5XW1M471
Firmware: CC34


SMART Attribute

Spin Up Time 0
Start Stop Count 91
Reallocated Sector Count 0
Power On Hours 34760
Spin Retry Count 0
Power Cycle Count 91
Runtime Bad Block 0
End-to-End Error 0
Reported Uncorrect 198
Command Timeout 0
High Fly Writes 37
Airflow Temperature Cel 41
Temperature Celsius 41
Current Pending Sector 0
Offline Uncorrectable 0
UDMA CRC Error Count 0
Head Flying Hours 118626996750541
Total LBAs Written 4246113807
Total LBAs Read 3075062916

ATA Error Count 181
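The Head Flying Hours figure looks absurd as a plain number. One possible reading, and this is an assumption rather than anything the NAS confirms, is that the drive packs two fields into the 48-bit raw value and only the low 32 bits are the hour count. A quick Python sketch of that split (the `split_raw48` helper is hypothetical, written just for this illustration):

```python
# Raw values copied from the SMART table above.
RAW_HEAD_FLYING_HOURS = 118626996750541
POWER_ON_HOURS = 34760

def split_raw48(raw):
    """Split a 48-bit raw SMART value into (upper 16 bits, lower 32 bits).

    Assumption: the lower 32 bits hold the hour counter and the upper
    bits hold something else. The drive vendor does not document this here.
    """
    low32 = raw & 0xFFFFFFFF
    high16 = (raw >> 32) & 0xFFFF
    return high16, low32

high, low = split_raw48(RAW_HEAD_FLYING_HOURS)
print("Upper 16 bits:", high)
print("Lower 32 bits (hours, if the assumption holds):", low)
print("Power On Hours for comparison:", POWER_ON_HOURS)
```

Under that assumption the low 32 bits come out at 35021, which is in the same range as the 34760 Power On Hours, so the huge figure is probably a display quirk and not a fault in itself.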


1) I don't understand the numbers (the ATA errors don't add up)
2) Why the differing message formats?
3) Does this level and type of error matter?

I also saw in the log from 21/6/15 (but received no email):


Reallocated sector count has increased in the last day.

Disk 2:
Previous count: 0
Current count: 233

ATA error count has increased in the last day.

Disk 2:
Previous count: 0
Current count: 4

Growing SMART errors indicate a disk that may fail soon. If the errors continue to increase, you should be prepared to replace the disk.


And today received:

Disk failure detected.


For Disk 2.

A replacement for Disk 2 (a 2TB WD Red) will arrive in the morning.

I have 2 USB disks connected, which back up on alternate days from a daily snapshot. A successful backup was carried out on Wednesday and one is currently in progress for today. They take an age, even for incremental backups, so I'm glad that there is only about 1.5TB of data. I've turned off the backup job for the disk that completed successfully on Wednesday and will only re-enable it once the array has been rebuilt.

What is the chance of Disk 4 failing during the rebuild? Should it be replaced once the array has been rebuilt?

12 Replies

  • vandermerwe wrote:
    The backup issue could have many causes. I would start by looking at the backup configuration to make sure that's right, testing the USB disk, and looking at what's being backed up. You say there is no new data on some days and the job still takes so long. Look at the backup job logs. The difference in time between your full and incremental jobs is not large, suggesting that the incremental jobs are backing up a large amount of data. What backup protocol are you using?


    An update, in case anyone searches in future and finds this:

    I have just installed a new hard drive in my PC. I still run a backup job from the NAS to the USB drive, but I now also run another one to the new PC drive over the network.

    The backup to the PC finishes in under 10 minutes. It would appear that a piss poor USB implementation on the NAS is the trouble (this is replicated on two NAS units and on four USB drives [EXT3 formatted] from different manufacturers).

    The RAID array in question successfully completed the rebuild twice, with two replacement 2TB disks installed separately, one after the other. Each rebuild took about 12h.
  • mdgm-ntgr
    NETGEAR Employee Retired
    USB is resource intensive, which is a significant factor in the time USB backups take to run.
