dmahon1 (Aspirant)
Jul 02, 2015
ATA errors on one disk, failed other disk
On 6/1/15 I got the following email:
Detected increasing ATA errors on disk 4[ST32000542AS, 5XW1M471] 40 times in the past 30 days. This often indicates an impending failure. Please be prepared to replace this disk to maintain data redundancy.
On 10/1/15 I got:
Detected increasing ATA errors on disk 4[ST32000542AS, 5XW1M471] 171 times in the past 30 days. This often indicates an impending failure. Please be prepared to replace this disk to maintain data redundancy.
On 1/2/15 I got a differently formatted message:
ATA error count has increased in the last day.
Disk 4:
Previous count: 5
Current count: 170
Growing SMART errors indicate a disk that may fail soon. If the errors continue to increase, you should be prepared to replace the disk.
On 27/6/15 I got another:
Detected increasing ATA errors on disk 4[ST32000542AS, 5XW1M471] 182 times in the past 30 days. This often indicates an impending failure. Please be prepared to replace this disk to maintain data redundancy.
This is all the same disk.
The SMART data for the disk suggests:
Model: ST32000542AS
Serial: 5XW1M471
Firmware: CC34
SMART Attribute
Spin Up Time 0
Start Stop Count 91
Reallocated Sector Count 0
Power On Hours 34760
Spin Retry Count 0
Power Cycle Count 91
Runtime Bad Block 0
End-to-End Error 0
Reported Uncorrect 198
Command Timeout 0
High Fly Writes 37
Airflow Temperature Cel 41
Temperature Celsius 41
Current Pending Sector 0
Offline Uncorrectable 0
UDMA CRC Error Count 0
Head Flying Hours 118626996750541
Total LBAs Written 4246113807
Total LBAs Read 3075062916
ATA Error Count 181
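As a rough aid for reading a listing like the one above, here is a minimal Python sketch that parses "attribute name, then integer" lines and reports which commonly watched counters are non-zero. The choice of watched attributes is an assumption made for illustration, not a ReadyNAS or drive-vendor rule, and the helper names (`parse_smart`, `nonzero_watched`) are hypothetical.

```python
# Attributes to watch: an illustrative assumption, not an official list.
WATCHED = {
    "Reallocated Sector Count",
    "Current Pending Sector",
    "Offline Uncorrectable",
    "Reported Uncorrect",
    "ATA Error Count",
}


def parse_smart(text):
    """Parse lines of the form 'Attribute Name <integer>' into a dict."""
    attrs = {}
    for line in text.splitlines():
        # Split on the last space: everything before it is the name.
        name, _, value = line.strip().rpartition(" ")
        if name and value.isdigit():
            attrs[name] = int(value)
    return attrs


def nonzero_watched(attrs):
    """Return only the watched attributes whose raw value is above zero."""
    return {k: v for k, v in attrs.items() if k in WATCHED and v > 0}


# A subset of the values posted above for disk 4.
SAMPLE = """\
Reallocated Sector Count 0
Reported Uncorrect 198
Command Timeout 0
Current Pending Sector 0
Offline Uncorrectable 0
UDMA CRC Error Count 0
ATA Error Count 181
"""

if __name__ == "__main__":
    print(nonzero_watched(parse_smart(SAMPLE)))
```

On the posted values this picks out Reported Uncorrect (198) and ATA Error Count (181); the reallocated and pending sector counters are still 0.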
1) I don't understand the numbers (the ATA errors don't add up)
2) Why the differing message formats?
3) Does this level and type of error matter?
I also saw in the log from 21/6/15 (but received no email):
Reallocated sector count has increased in the last day.
Disk 2:
Previous count: 0
Current count: 233
ATA error count has increased in the last day.
Disk 2:
Previous count: 0
Current count: 4
Growing SMART errors indicate a disk that may fail soon. If the errors continue to increase, you should be prepared to replace the disk.
And today received:
Disk failure detected.
For Disk 2.
A replacement for Disk 2 (2TB WD Red) will arrive in the morning to replace it.
I have 2 USB disks connected which back up on alternate days from a daily snapshot. A successful backup was carried out on Wednesday and one is currently in progress for today (they take an age, even for incremental backups; I'm glad that there is only about 1.5TB of data). I've turned the backup job off for the disk that completed successfully on Wednesday and will only enable it when the array has been rebuilt.
What is the chance of Disk 4 failing during the rebuild? Should it be replaced once the array has been rebuilt?
12 Replies
- dmahon1 (Aspirant)
PS:
The backup jobs start at 00:05
They finish at around 11:30 or 15:00 (depending on which device). Both are EXT3 formatted. There is a full backup (with remove) every 4 weeks; these times are just for the regular incremental backup (which on many days is 0 bytes). It seems to go to 17:00 or 21:00 when a full backup is carried out.
Is this normal? I'm wondering what will happen when I move to bigger disks and use more storage - will backup be a problem? Will it take more than a day to back up the array?
ReadyNAS NVX with latest firmware.
- vandermerwe (Master)
Regarding disk 4, you should have replaced this in January.
Regarding disk 2, this also needs to be replaced, as you are planning.
Make sure your backup is right up to date and verified before you put in the first replacement disk; the resync may well finish off the other bad disk.
Regarding your backups, what protocol are you using, and what are you backing up to, another NAS or a USB drive?
It is definitely taking too long for incremental, unless you have a lot of data added each day.
- dmahon1 (Aspirant)
I appear to be in the wrong forum - can a mod move this to general questions please?
What about these ATA errors? They appear not to be SMART errors at all. How many are significant? Why does the total number not tally up correctly?
Why didn't I get an email about the other errors on disk 2 which were SMART errors and did indeed turn out to be significant?
- dmahon1 (Aspirant)
vandermerwe wrote: Regarding disk 4, you should have replaced this in January.
Regarding disk 2, this also needs to be replaced, as you are planning.
Make sure your backup is right up to date and verified before you put in the first replacement disk; the resync may well finish off the other bad disk.
Regarding your backups, what protocol are you using, and what are you backing up to, another NAS or a USB drive?
It is definitely taking too long for incremental, unless you have a lot of data added each day.
USB drive (by port). Formatted EXT3.
The NAS in question belongs to a computer illiterate friend, who lives 300 miles away but happens to be staying here (going home tomorrow). I help him with it.
My NVX isn't much faster (6h for 700 GB) to its USB drive formatted EXT3.
- StephenB (Guru - Experienced User)
dmahon wrote: I appear to be in the wrong forum - can a mod move this to general questions please?
I moved it (though I thought boot, installation, upgrade, expansion was the right place).
- vandermerwe (Master)
I don't know why the ATA errors don't seem to match the alerts, but ATA errors almost always indicate a failing disk. It's possible that if there are ATA errors on more than one drive in the same slot, then there could be a problem with the drive bay. That's not the case here though.
Even 1 ATA error is significant; AFAIK, drives will be replaced if they have 1 ATA error. The rate of increase is important; some may consider leaving a drive with ATA errors in place if the errors are not increasing or are increasing very slowly. It depends on your attitude to risk, how important the data is, and how robust and convenient your backups are.
Also not sure about the email alerts. You could look at your logs to see if there were alerts sent which were either not received or there was a problem sending.
The backup issue could have many causes. I would start by looking at the backup configuration to make sure that's right, testing the USB disk, and looking at what's being backed up. You say no data on some days and the job still takes so long. Look at the backup job logs. The difference in time between your full and incremental jobs is not large, suggesting that the incremental jobs are backing up a large amount of data. What backup protocol are you using?
- dmahon1 (Aspirant)
New disk in. Nothing. Rebooted - array rebuilding.
Another email too:
ATA error count has increased in the last day.
Disk 4:
Previous count: 170
Current count: 181
Growing SMART errors indicate a disk that may fail soon. If the errors continue to increase, you should be prepared to replace the disk.
I don't believe the numbers at all. Up and down, they don't add up. Should I believe there are actually any ATA errors? It was 181 several days ago, so how can it have gone from 170 to 181 in the last day?
Obviously, I will change the disk. But should I change the NAS (to a different brand)?
- vandermerwe (Master)
You should run a thorough disk test using vendor tools.
Is the disk under warranty?
If it is not under warranty and it tests OK, then really it's up to you what you do about it. I would have replaced it, as I said earlier.
- dmahon1 (Aspirant)
The disk is being replaced once the array has rebuilt.
But should I have faith in the NAS with these error messages that don't make sense? I'm also upset there was no email about the sector reallocations, which are something I would have taken notice of as these have led to failures before.
- vandermerwe (Master)
Have you checked the logs to see if there were any messages sent that failed? You will need to SSH in to see these logs.
I think the email log is at var/log/frontview/msmtp.log
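Once the log has been copied off the NAS, a short Python sketch along these lines could pick out failed sends. It assumes the usual msmtp log layout, where each line carries an `exitcode=` field and successful deliveries end in `exitcode=EX_OK`; whether the ReadyNAS log follows that layout is not confirmed in this thread. The sample lines and the `failed_sends` helper are hypothetical.

```python
import re

# msmtp normally records the result of each send as exitcode=EX_...
EXITCODE = re.compile(r"exitcode=(\w+)")


def failed_sends(lines):
    """Return log lines that have an exitcode field other than EX_OK."""
    failures = []
    for line in lines:
        match = EXITCODE.search(line)
        if match and match.group(1) != "EX_OK":
            failures.append(line.rstrip("\n"))
    return failures


# Made-up sample lines for illustration; not taken from the NAS in question.
SAMPLE_LOG = [
    "Jun 21 00:10:02 host=smtp.example.com tls=on auth=on user=nas "
    "from=nas@example.com recipients=me@example.com mailsize=512 "
    "smtpstatus=250 smtpmsg='250 OK' exitcode=EX_OK",
    "Jun 21 00:10:05 host=smtp.example.com tls=on auth=on user=nas "
    "from=nas@example.com recipients=me@example.com "
    "errormsg='cannot connect to smtp.example.com' exitcode=EX_TEMPFAIL",
]

if __name__ == "__main__":
    for line in failed_sends(SAMPLE_LOG):
        print(line)
```

To run it against a real log, pass the opened file instead of `SAMPLE_LOG`, for example `failed_sends(open(path))`, where `path` points at your copy of the log. Lines dated around 21/6/15 would be the ones to check for the missing reallocated-sector alert.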