chuckster_l (Aspirant)
Nov 07, 2016
Disk Test Failure
I got a Disk Test Failure mail from my NAS (RN104, firmware 6.4.2):
Disk test failed on disk in channel 2, model WDC_WD20EARX-00PASB0, serial WD-WMAZAYYYYXXX
I'd like to know how severe the problem is and whether I'll need to replace the HDD (I just bought a new HDD, and that is not the faulty one...)
I have the full log, but I'm not sure where to look.
Device: sdc
Controller: 0
Channel: 1
Model: WDC WD20EARX-00PASB0
Serial: WD-WMAZAYYYYXXX
Firmware: 51.0AB51
Class: SATA
Sectors: 3907029168
Pool: data
PoolType: RAID 5
PoolState: 1
PoolHostId: e3623ec
Health data
ATA Error Count: 0
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 35
Start/Stop Count: 6541
Power-On Hours: 16053
Power Cycle Count: 2011
Load Cycle Count: 176902
20 Replies
- kohdee (NETGEAR Expert)
Unfortunately, the data behind that report doesn't say why a SMART test failed.
If you enable SSH and enter
smartctl -a /dev/sda
into the terminal (replacing sda with the ID of the disk in question: sdb, sdc, sdd, etc.), you'll see an additional section that gives the short result of each test. Failed disk tests are sometimes caused by rebooting or powering off the device during the test, or by a read failure at a specific sector of the disk. It's possible that when the ReadyNAS runs the disk test, it reads a sector that has been reallocated and fails at that spot. We'll know that's the case if the read failure occurs at or near the same sector each time.
If you can copy and paste the output of smartctl -a /dev/sda here, we can help decipher the reason for you.
*edit* I just went back and looked at the SMART report from the software you used. I noticed that you have 140 reallocated sectors, which may not be a big deal unless that number starts climbing quickly. Since you're using RAID 5, you can afford to have one drive die, as long as only one drive is failing. The ReadyNAS already polls the disks and checks them against thresholds to decide when to alert you. If you're uncomfortable, just purchase a spare drive to keep as a cold spare. Once you do get an e-mail alert that your ReadyNAS has a failed drive, you can replace it the same day and start the resync process.
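The "same sector each time" check described above can be sketched in Python. This is only an illustration, not a ReadyNAS or smartmontools feature: the sample rows are copied from the self-test log posted further down in this thread, and the function names and the `max_gap_sectors` threshold are hypothetical choices made for the example.

```python
import re

# Sample rows copied from the "SMART Self-test log" posted later in this thread.
SAMPLE_LOG = """\
# 1 Extended offline Completed: read failure 90% 16043 109047560
# 2 Extended offline Completed: read failure 90% 15875 108938968
# 3 Extended offline Completed without error 00% 15605 -
"""

# Matches a self-test row that ended in a read failure and captures the
# test number, the power-on hours, and the LBA of the first error.
FAILURE_ROW = re.compile(
    r"^#\s*(\d+)\s+.*?read failure\s+\d+%\s+(\d+)\s+(\d+)\s*$"
)


def failed_tests(log_text):
    """Return (test_number, lifetime_hours, first_error_lba) per failed test."""
    rows = []
    for line in log_text.splitlines():
        match = FAILURE_ROW.match(line.strip())
        if match:
            rows.append(tuple(int(group) for group in match.groups()))
    return rows


def failures_cluster(rows, max_gap_sectors):
    """True if all failure LBAs span at most max_gap_sectors (assumed threshold)."""
    lbas = [lba for _, _, lba in rows]
    return bool(lbas) and max(lbas) - min(lbas) <= max_gap_sectors
```

With the two failed tests from this thread, the first-error LBAs are 108,592 sectors apart, so whether they count as "near" each other depends entirely on the threshold you pick.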
- chuckster_l (Aspirant)
Sorry for the gap, guys. I hope we didn't lose momentum on this one, but I had to go away for a few days. Anyway, below is the output of smartctl -a.
Regarding the reallocated sectors, that stood out to me as a warning as well, but since the count is within limits and RAID 5 gives me redundancy, I'm okay with it for the moment while I prepare to replace the disk in the near future. My intent here is to understand how my NAS works and how to handle the information it provides.
root@scooby:~# smartctl -a /dev/sdc
smartctl 6.4 2015-06-04 r4109 [armv7l-linux-4.1.16.armada.1] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Green
Device Model: WDC WD20EARX-00PASB0
Serial Number: WD-WMAXXXXX4736
LU WWN Device Id: 5 0014ee 0582955a4
Firmware Version: 51.0AB51
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Nov 18 09:38:35 2016 AWST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (36000) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 347) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x3035) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 172 168 021 Pre-fail Always - 6375
4 Start_Stop_Count 0x0032 094 094 000 Old_age Always - 6632
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 100 253 000 Old_age Always - 0
9 Power_On_Hours 0x0032 078 078 000 Old_age Always - 16232
10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 100 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 098 098 000 Old_age Always - 2016
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 94
193 Load_Cycle_Count 0x0032 141 141 000 Old_age Always - 178251
194 Temperature_Celsius 0x0022 112 101 000 Old_age Always - 38
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 20
SMART Error Log Version: 1
ATA Error Count: 2
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 2 occurred at disk power-on lifetime: 16055 hours (668 days + 23 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 80 08 ef 7f e6 Error: UNC 128 sectors at LBA = 0x067fef08 = 109047560
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 80 c0 ee 7f e6 08 9d+00:50:21.918 READ DMA
c8 00 f0 c0 f9 7f e6 08 9d+00:50:21.904 READ DMA
c8 00 80 c0 f6 7f e6 08 9d+00:50:21.903 READ DMA
ef 10 02 00 00 00 a0 08 9d+00:50:21.902 SET FEATURES [Enable SATA feature]
Error 1 occurred at disk power-on lifetime: 16055 hours (668 days + 23 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 80 08 ef 7f e6 Error: UNC 128 sectors at LBA = 0x067fef08 = 109047560
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 80 c0 ee 7f e6 08 9d+00:50:19.163 READ DMA
c8 00 50 70 f6 7f e6 08 9d+00:50:19.126 READ DMA
c8 00 00 c0 ed 7f e6 08 9d+00:50:19.100 READ DMA
c8 00 00 40 e2 7f e6 08 9d+00:50:18.872 READ DMA
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 90% 16043 109047560
# 2 Extended offline Completed: read failure 90% 15875 108938968
# 3 Extended offline Completed without error 00% 15605 -
# 4 Extended offline Completed without error 00% 15438 -
# 5 Extended offline Completed without error 00% 15270 -
# 6 Extended offline Completed without error 00% 15102 -
# 7 Extended offline Completed without error 00% 14934 -
# 8 Extended offline Completed without error 00% 14766 -
# 9 Extended offline Completed without error 00% 14598 -
#10 Extended offline Completed without error 00% 14431 -
#11 Extended offline Completed without error 00% 14263 -
#12 Extended offline Completed without error 00% 14095 -
#13 Extended offline Completed without error 00% 13927 -
#14 Extended offline Completed without error 00% 13760 -
#15 Extended offline Completed without error 00% 13592 -
#16 Extended offline Completed without error 00% 13424 -
#17 Extended offline Completed without error 00% 13256 -
#18 Extended offline Completed without error 00% 12921 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Thanks for the help in advance :)
Charles
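As a worked check of the error log above, the LBA that smartctl prints can be rebuilt from the register columns. This is a minimal sketch assuming 28-bit LBA addressing, which is what the READ DMA (c8) commands in the log use; the function name is hypothetical and the register values are taken from the "Error 2" row.

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine ATA task-file registers into a 28-bit LBA.

    SN holds bits 0-7, CL bits 8-15, CH bits 16-23, and the low
    nibble of DH holds bits 24-27.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register row from the log: ER ST SC SN CL CH DH = 40 51 80 08 ef 7f e6
lba = lba28_from_registers(sn=0x08, cl=0xEF, ch=0x7F, dh=0xE6)
print(hex(lba), lba)  # prints: 0x67fef08 109047560

# "Sector Sizes: 512 bytes logical" in the output above.
LOGICAL_SECTOR_BYTES = 512
byte_offset = lba * LOGICAL_SECTOR_BYTES

# Power-on lifetime of the error, split into days and hours.
days, hours = divmod(16055, 24)
```

The result matches the `LBA = 0x067fef08 = 109047560` printed in the error log, which is also the first-error LBA of self-test #1, and `divmod(16055, 24)` reproduces the "668 days + 23 hours" figure.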
- StephenB (Guru - Experienced User)
Does the log give any information about the disk in channel 2? You posted the data only for channel 1.
chuckster_l wrote:
I got a Disk Test Failure mail from my NAS (RN104, firmware 6.4.2):
Disk test failed on disk in channel 2, model WDC_WD20EARX-00PASB0, serial WD-WMAZAYYYYXXX
...
Device: sdc
Controller: 0
Channel: 1
- chuckster_l (Aspirant)
I found that odd. The serial given in the email does not match the disk in channel 2, so I thought the email might count channels from 1 instead of 0. Here are the results for all four disks:
==============================================
Disks
==============================================
Disk sdd:
HostID: 0e3623ec
Flags: 0x0
Size: 5860533168 (2794 GB)
Free: 1953508054
Controller 0
Channel: 0
Model: WDC WD30EZRZ-00Z5HB0
Serial: WD-WCCXXXXX5U5Y
Firmware: 80.00A80
RPM: 5400
SMART data
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 31
Start/Stop Count: 188
Power-On Hours: 309
Power Cycle Count: 2
Load Cycle Count: 6154
Latest Self Test: Passed
Disk sdc:
HostID: 0e3623ec
Flags: 0x0
Size: 3907029168 (1863 GB)
Free: 4054
Controller 0
Channel: 1
Model: WDC WD20EARX-00PASB0
Serial: WD-WMAXXXXX4736
Firmware: 51.0AB51
SMART data
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 35
Start/Stop Count: 6541
Power-On Hours: 16053
Power Cycle Count: 2011
Load Cycle Count: 176902
Latest Self Test: Failed
Disk sdb:
HostID: 0e3623ec
Flags: 0x0
Size: 3907029168 (1863 GB)
Free: 4054
Controller 0
Channel: 2
Model: WDC WD20EFRX-68EUZN0
Serial: WD-WCCXXXXX6082
Firmware: 82.00A82
RPM: 5400
SMART data
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 35
Start/Stop Count: 4458
Power-On Hours: 9686
Power Cycle Count: 44
Load Cycle Count: 4454
Latest Self Test: Passed
Disk sda:
HostID: 0e3623ec
Flags: 0x0
Size: 3907029168 (1863 GB)
Free: 4054
Controller 0
Channel: 3
Model: WDC WD20EFRX-68EUZN0
Serial: WD-WCCXXXXX3VAE
Firmware: 82.00A82
RPM: 5400
SMART data
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 33
Start/Stop Count: 5237
Power-On Hours: 9686
Power Cycle Count: 44
Load Cycle Count: 5233
Latest Self Test: Passed
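To pick the failing disk out of a listing like the one above, a short Python sketch can group the lines per disk and filter on the self-test result. The sample text is a trimmed copy of two entries from the listing; the function names are hypothetical, and the "+1" channel mapping only encodes the poster's guess that the email counts channels from 1, which is not confirmed anywhere in this thread.

```python
import re

# Trimmed copy of two entries from the disk listing above.
SAMPLE = """\
Disk sdc:
Controller 0
Channel: 1
Serial: WD-WMAXXXXX4736
Latest Self Test: Failed
Disk sdb:
Controller 0
Channel: 2
Serial: WD-WCCXXXXX6082
Latest Self Test: Passed
"""


def parse_disks(text):
    """Group 'Key: value' lines under the preceding 'Disk sdX:' header."""
    disks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"^Disk (\w+):$", line)
        if header:
            current = {"Device": header.group(1)}
            disks.append(current)
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return disks


def failed_disks(disks):
    """Return only the disks whose latest self test is reported as failed."""
    return [d for d in disks if d.get("Latest Self Test") == "Failed"]


for disk in failed_disks(parse_disks(SAMPLE)):
    # Assumption: the email numbers channels from 1, the log from 0.
    email_channel = int(disk["Channel"]) + 1
    print(disk["Device"], disk["Serial"], "email channel:", email_channel)
```

Under that assumption, the only failed disk (sdc, log channel 1) lines up with the "channel 2" named in the alert email.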
- mdgm-ntgr (NETGEAR Employee Retired)
The SMART stats look fine, so those don't explain why it failed the test.
- chuckster_l (Aspirant)
So where should I look to find out what the problem is?
- mdgm-ntgr (NETGEAR Employee Retired)
What did the email you received say?
Well, you could power down, remove the disk, hook it up to your PC, and test it using WD Data LifeGuard Diagnostics.