wfdexter (Aspirant)
Sep 14, 2011
ATA error count has increased in the last day.
I have a ReadyNAS Duo.
Drives from "Status":
- Disk 1: WDC WD10EARS-00MVWB0, 931 GB, 39 C / 102 F, Write-cache ON, OK
- Disk 2: Hitachi HDS721010CLA332, 931 GB, 37 C / 98 F, Write-cache ON, OK
I had a "Disk fail event occurred on SATA channel 1." on May 3; I finally noticed it and replaced the drive on May 31. The RAID rebuilt, and it was all good. I had been running two of the Hitachi drives, but I put in the WD drive at that point.
Starting the next day, intermittently, but at an increasing frequency, I get these in the log:
ATA error count has increased in the last day. Disk 1: Previous count: 84 Current count: 90 Growing SMART errors indicate a disk that may fail soon. If the errors continue to increase, you should be prepared to replace the disk.
I hate to replace a drive that isn't bad, but I do have a second Hitachi drive I could put in.
Suggestions?
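The alert quoted above has a regular shape, so the day-over-day increase can be pulled out of it with a few lines of script. This is only a minimal sketch: it assumes the message wording matches what is quoted in this post (other firmware versions may word it differently), and the function name `parse_ata_alert` is made up for illustration.

```python
import re

# Pattern based on the alert wording quoted in this thread (assumption:
# other RAIDiator versions may phrase the alert differently).
ALERT_RE = re.compile(
    r"Disk (?P<disk>\d+): Previous count: (?P<prev>\d+) Current count: (?P<cur>\d+)"
)

def parse_ata_alert(message):
    """Return (disk, previous, current, increase), or None if nothing matches."""
    m = ALERT_RE.search(message)
    if m is None:
        return None
    disk, prev, cur = int(m["disk"]), int(m["prev"]), int(m["cur"])
    return disk, prev, cur, cur - prev

msg = ("ATA error count has increased in the last day. Disk 1: Previous count: 84 "
       "Current count: 90 Growing SMART errors indicate a disk that may fail soon.")
print(parse_ata_alert(msg))  # (1, 84, 90, 6)
```

Collecting these tuples from successive daily alerts would show whether the count keeps climbing or levels off after a firmware change.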
11 Replies
- mdgm-ntgr (NETGEAR Employee Retired): Please update to RAIDiator 4.1.7 or later. You should stop getting regular ATA errors then (do a few reboots to confirm).
- wfdexter (Aspirant): I upgraded to whatever the latest is just tonight, will see what happens. Thanks!
- aleksey256 (Aspirant): Hi there.
I am getting the same ATA errors.
What do they mean exactly? Is it a problem with bad sectors?
Do I need to replace a drive or the NAS itself? Everything is new.
I have a ReadyNAS 4 PRO with 2 Seagate drives (RAIDiator 4.2.17).
Disk write-cache is OFF (because I do not have a UPS).
- mdgm-ntgr (NETGEAR Employee Retired): aleksey, what brand and model disk? What does your error count look like (some examples of log entries would be good)? If it's not caused by a compatibility issue, then ATA errors indicate a failing disk.
- aleksey256 (Aspirant):
mdgm wrote: aleksey what brand and model disk? What does your error count look like (some examples of log entries would be good). If it's not caused by a compatibility issue then ATA errors indicate a failing disk.
Thank you for the prompt response.
I have two Seagate SV35 drives (2 TB). The SMART check reports some ATA errors on the first drive in the RAID 1.
This capacity (2 TB) is not on the compatibility list, but smaller SV35 drives are. However, I suspect the drive has bad sectors (or sectors about to die).
All the errors relate to Current_Pending_Sector and Offline_Uncorrectable.
The disks are new (2 months old), and I am wondering: is this a case for returning or exchanging (RMA) the bad drive with the seller?
Is it enough to exchange the drive?
I have run the low-level offline disk check from the ReadyNAS twice, and each time the LCD screen showed a message like "1 bad drive has been found" (I do not remember the exact wording). But according to FrontView the RAID is working fine (redundant).
Just in case, here is the part of the extended ReadyNAS log where I found all the ATA errors:
***** smartctl output for sda *****
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: ST2000VX002-1AH166
Serial Number: 5YD3T789
Firmware Version: CV01
User Capacity: 2,000,398,934,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Fri Sep 16 18:05:57 2011 NZST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 612) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 255) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x10b3) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 115 099 006 Pre-fail Always - 94207920
3 Spin_Up_Time 0x0003 093 092 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 58
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 067 060 030 Pre-fail Always - 5606993
9 Power_On_Hours 0x0032 099 099 000 Old_age Always - 1071
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 58
184 Unknown_Attribute 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 139
188 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 086 086 000 Old_age Always - 14
190 Airflow_Temperature_Cel 0x0022 072 066 045 Old_age Always - 28 (Lifetime Min/Max 25/28)
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 5
193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 58
194 Temperature_Celsius 0x0022 028 040 000 Old_age Always - 28 (0 13 0 0)
195 Hardware_ECC_Recovered 0x001a 019 019 000 Old_age Always - 94207920
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 8
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 8
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
SMART Error Log Version: 1
ATA Error Count: 139 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 139 occurred at disk power-on lifetime: 1030 hours (42 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 38 a8 0b 79 40 00 17:09:15.269 READ FPDMA QUEUED
60 00 20 40 d4 79 40 00 17:09:15.226 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 17:09:15.226 READ FPDMA QUEUED
61 00 10 d8 7a 63 40 00 17:09:15.225 WRITE FPDMA QUEUED
27 00 00 00 00 00 e0 00 17:09:15.225 READ NATIVE MAX ADDRESS EXT
Error 138 occurred at disk power-on lifetime: 1030 hours (42 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
61 00 08 58 6f 63 40 00 17:09:14.178 WRITE FPDMA QUEUED
61 00 10 d8 7a 63 40 00 17:09:14.146 WRITE FPDMA QUEUED
61 00 08 60 76 63 40 00 17:09:14.146 WRITE FPDMA QUEUED
60 00 08 ff ff ff 4f 00 17:09:14.145 READ FPDMA QUEUED
61 00 10 50 6f 63 40 00 17:09:14.145 WRITE FPDMA QUEUED
Error 137 occurred at disk power-on lifetime: 1030 hours (42 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
61 00 10 50 6f 63 40 00 17:09:13.108 WRITE FPDMA QUEUED
61 00 08 60 76 63 40 00 17:09:13.108 WRITE FPDMA QUEUED
61 00 10 d8 7a 63 40 00 17:09:13.108 WRITE FPDMA QUEUED
60 00 a0 78 3f 4e 40 00 17:09:13.090 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 17:09:13.064 READ FPDMA QUEUED
Error 136 occurred at disk power-on lifetime: 1030 hours (42 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 68 10 3f 4e 40 00 17:09:12.034 READ FPDMA QUEUED
61 00 08 48 6f 63 40 00 17:09:12.034 WRITE FPDMA QUEUED
60 00 08 ff ff ff 4f 00 17:09:12.033 READ FPDMA QUEUED
27 00 00 00 00 00 e0 00 17:09:12.033 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 17:09:12.033 IDENTIFY DEVICE
Error 135 occurred at disk power-on lifetime: 1030 hours (42 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
61 00 08 48 6f 63 40 00 17:09:10.993 WRITE FPDMA QUEUED
60 00 08 ff ff ff 4f 00 17:09:10.893 READ FPDMA QUEUED
61 00 02 48 00 00 40 00 17:09:10.892 WRITE FPDMA QUEUED
27 00 00 00 00 00 e0 00 17:09:10.892 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 17:09:10.891 IDENTIFY DEVICE
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 50% 1067 2163751792
# 2 Extended offline Completed: read failure 50% 1056 2163751792
# 3 Short offline Completed without error 00% 22 -
# 4 Short offline Completed without error 00% 11 -
# 5 Short offline Completed without error 00% 0 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
***** smartctl output for sdb *****
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: ST2000VX002-1AH166
Serial Number: 5YD3THE8
Firmware Version: CV01
User Capacity: 2,000,398,934,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Fri Sep 16 18:05:58 2011 NZST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 623) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 255) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x10b3) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 120 099 006 Pre-fail Always - 239500720
3 Spin_Up_Time 0x0003 093 092 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 58
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 067 060 030 Pre-fail Always - 6078145
9 Power_On_Hours 0x0032 099 099 000 Old_age Always - 1071
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 58
184 Unknown_Attribute 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 067 067 000 Old_age Always - 33
190 Airflow_Temperature_Cel 0x0022 072 066 045 Old_age Always - 28 (Lifetime Min/Max 25/28)
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 5
193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 58
194 Temperature_Celsius 0x0022 028 040 000 Old_age Always - 28 (0 13 0 0)
195 Hardware_ECC_Recovered 0x001a 024 024 000 Old_age Always - 239500720
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 1070 -
# 2 Extended offline Completed without error 00% 1059 -
# 3 Short offline Completed without error 00% 22 -
# 4 Short offline Completed without error 00% 11 -
# 5 Short offline Completed without error 00% 0 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
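Comparing the two attribute tables by eye is tedious, so here is a minimal sketch that scans smartctl attribute rows and reports the sector-health counters with a nonzero raw value. It assumes the row layout shown in the output above (ID, name, flag, value, worst, thresh, type, updated, when_failed, raw). The set `WATCHED` and the function `suspicious_attributes` are made-up names, and treating any nonzero raw value as suspicious is a rule of thumb, not a vendor specification.

```python
# Counters watched here: the two named in the post above, plus two related
# ones (assumption: a nonzero raw value on any of them deserves a closer look).
WATCHED = {
    "Reallocated_Sector_Ct",
    "Reported_Uncorrect",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}

def suspicious_attributes(smart_text):
    """Map watched attribute name -> raw value, for nonzero raw values only."""
    found = {}
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the raw value is the 10th column.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            raw = int(fields[9])
            if raw > 0:
                found[fields[1]] = raw
    return found

# Rows copied from the sda and sdb tables above.
sda = """\
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 139
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 8
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 8
"""
sdb = """\
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
"""
print(suspicious_attributes(sda))
print(suspicious_attributes(sdb))
```

On these rows only sda reports anything, which lines up with its self-test log showing "Completed: read failure" while sdb's extended tests completed without error. Note that both drives still report an overall-health result of PASSED.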
- aleksey256 (Aspirant): To mdgm:
Do you have any thoughts on my question?
Is this a case for replacing the drive?
I also suspect the write cache, which was switched off on both drives because I do not have a UPS so far. Could that be the issue?
I turned the cache back on several days ago, as it was by default, just to be 100% sure it does not negatively affect the SMART and ATA errors.
- mdgm-ntgr (NETGEAR Employee Retired): I would suspect that your disk is bad.
- aleksey256 (Aspirant): Thank you for the help, mdgm. I will RMA the drive.
Cheers
- Hi there. Newbie to this forum and indeed to owning a ReadyNAS Duo.
I have recently set up my new, empty Duo with a Western Digital disk (Western Digital WD20EARS 2TB Hard Drive SATAII 64MB Cache - OEM Caviar Green). My intention is to drop the second WD disk in over the next few days, once I've copied a load more files onto the first disk.
I logged into RAIDar this evening and was presented with the same warning message as above: ATA error count increasing from 6 to 12 (see below). Oddly, the initial status screen has a green light against the disk, which I think suggests all is OK? (WDC WD20EARS-00MVWB0 1862 GB , 31 C / 87 F , Write-cache ON)
Any guidance appreciated (simple please!)
Many thanks
A WORRIED NEWBIE
"ATA error count has increased in the last day. Disk 1: Previous count: 6 Current count: 12 Growing SMART errors indicate a disk that may fail soon. If the errors continue to increase, you should be prepared to replace the disk."
Hostname: ReadyNASduo
Model: ReadyNAS Duo [X-RAID]
Serial: <deleted>
Firmware: RAIDiator 4.1.6 [1.00a043]
Memory: 256 MB [2.5-3-3-7]
IP address: <deleted>
Volume C: Online, X-RAID, Single disk, 2% of 1853 GB used
- mdgm-ntgr (NETGEAR Employee Retired): taylord01, your ATA errors are due to a compatibility issue with WD drives that don't support TLER, which was addressed in 4.1.7 (http://www.readynas.com/RAIDiator_4_1_8_Notes)
I would suggest you do the following:
1. Back up all data on the NAS
2. Verify backup is good
3. Upgrade to latest RAIDiator (currently 4.1.8: http://www.readynas.com/RAIDiator_4_1_8_Notes)
4. Verify update completed successfully
5. Do a factory default (wipes all data, settings, everything) e.g. via System > Update > Factory Default in Frontview.
This will give you a clean setup on the latest firmware and also give you 4k sector alignment (if you don't do the factory reset, write performance will likely be poor).
Welcome to the forum!
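The advice above ties write performance to 4k sector alignment. As a minimal sketch of what "aligned" means, assume a drive with 4096-byte physical sectors that presents 512-byte logical sectors to the host (an assumption about the WD20EARS, not something stated in this thread). A partition is aligned when its starting byte offset is a multiple of the physical sector size. The start sectors below are illustrative values only, not read from this NAS, and `is_4k_aligned` is a made-up helper name.

```python
PHYSICAL_SECTOR = 4096  # bytes; assumed physical sector size of the drive
LOGICAL_SECTOR = 512    # bytes; assumed logical sector size seen by the host

def is_4k_aligned(start_lba):
    """True if a partition starting at this logical sector begins on a
    physical-sector boundary."""
    return (start_lba * LOGICAL_SECTOR) % PHYSICAL_SECTOR == 0

# Illustrative start sectors (not taken from the ReadyNAS in this thread).
for start in (63, 64, 2048):
    print(start, is_4k_aligned(start))
```

When the start is misaligned, a filesystem block can straddle two physical sectors, so the drive has to read, modify, and rewrite both to service one write, which is the usual explanation for the poor write performance mentioned above.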