Forum Discussion
JDFuzz - Aspirant
Oct 26, 2020
Volume Degraded after disk replacement
Hi, I hope somebody can help me!
I got a message one day saying one of my two disks had failed (both 4TB WD Reds).
So I replaced the bad one with a 4TB Seagate IronWolf, as the newer WD Reds had bad reviews.
After the first resync, the volume was still degraded, and I had this message in my logs:
"Volume: The resync operation finished on volume data. However, the volume is still degraded."
Every time I shut this thing down and boot it up again, it begins to resync, which takes all day. At this point I'm not sure what to do. I'd try a factory reset after backing up my data, but I do not have the storage capacity to do that without the NAS.
I saw a similar post asking for logs, so below is disk_info and mdstat.
Thank you!
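(For reference, the disk_info and mdstat sections below came out of the downloaded logs. Roughly the same information can be pulled over SSH with the standard Linux tools; this is only a generic sketch, not the exact commands the ReadyNAS log export runs, and the device names are just the ones on my unit.)

# Current state of all md RAID arrays
cat /proc/mdstat

# Detailed status of the data array named in mdstat
mdadm --detail /dev/md127

# SMART health data for each disk (needs smartmontools)
smartctl -a /dev/sda
smartctl -a /dev/sdb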
disk_info
Device: sda  Controller: 0  Channel: 0
Model: WDC WD40EFRX-68WT0N0  Serial: WD-WCC4E3RDS3R9  Firmware: 82.00A82W
Class: SATA  RPM: 5400  Sectors: 7814037168
Pool: data  PoolType: RAID 1  PoolState: 3  PoolHostId: 1177b65c
Health data:
ATA Error Count: 0
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
Current Pending Sector Count: 2163
Uncorrectable Sector Count: 0
Temperature: 34
Start/Stop Count: 4057
Power-On Hours: 23338
Power Cycle Count: 754
Load Cycle Count: 4080

Device: sdb  Controller: 0  Channel: 1
Model: ST4000VN008-2DR166  Serial: ZGY7LFFH  Firmware: SC60
Class: SATA  RPM: 5980  Sectors: 7814037168
Pool: data  PoolType: RAID 1  PoolState: 3  PoolHostId: 1177b65c
Health data:
ATA Error Count: 0
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
End-to-End Errors: 0
Command Timeouts: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 30
Start/Stop Count: 8
Power-On Hours: 45
Power Cycle Count: 5
Load Cycle Count: 51
mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid1 sdb2[1] sda2[0]
523264 blocks super 1.2 [2/2] [UU]
md127 : active raid1 sdb3[2](S) sda3[0]
3902166784 blocks super 1.2 [2/1] [U_]
md0 : active raid1 sdb1[2] sda1[0]
4190208 blocks super 1.2 [2/2] [UU]
unused devices: <none>
/dev/md/0:
Version : 1.2
Creation Time : Tue Jan 3 18:50:48 2017
Raid Level : raid1
Array Size : 4190208 (4.00 GiB 4.29 GB)
Used Dev Size : 4190208 (4.00 GiB 4.29 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Update Time : Mon Oct 26 20:57:19 2020
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Consistency Policy : unknown
Name : 1177b65c:0 (local to host 1177b65c)
UUID : 9fc60477:074fbd3b:2f491de7:c513f18b
Events : 5324
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
2 8 17 1 active sync /dev/sdb1
/dev/md/data-0:
Version : 1.2
Creation Time : Tue Jan 3 18:50:48 2017
Raid Level : raid1
Array Size : 3902166784 (3721.40 GiB 3995.82 GB)
Used Dev Size : 3902166784 (3721.40 GiB 3995.82 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Update Time : Mon Oct 26 20:48:18 2020
State : clean, degraded
Active Devices : 1
Working Devices : 2
Failed Devices : 0
Spare Devices : 1
Consistency Policy : unknown
Name : 1177b65c:data-0 (local to host 1177b65c)
UUID : da7276bb:de92440c:3630b8ff:39b87372
Events : 5565
Number Major Minor RaidDevice State
0 8 3 0 active sync /dev/sda3
- 0 0 1 removed
2 8 19 - spare /dev/sdb3

4 Replies
- StephenB - Guru - Experienced User
This confirms that the data volume is degraded:
md127 : active raid1 sdb3[2](S) sda3[0]
3902166784 blocks super 1.2 [2/1] [U_]

It should say [UU].
But it's hard to say why, since the new disk looks healthy. Try downloading the full log zip file, and look for errors in system.log and kernel.log.
You might also look at the bottom of volume.log, and see if there is anything useful listed there (in the === maintenance history === section).
You shouldn't post the full log zip here. But you can ask the mods ( JohnCM_S and Marc_V ) to review them for you. What you need to do is first upload the log zip to cloud storage (google drive, onedrive, dropbox, etc). Then send a private message (PM) to the mods, using the envelope icon in the upper right of the forum page. Include a download link to the log zip, and also a link to this forum thread.
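If you're comfortable with SSH, you can also pre-filter the extracted logs yourself before sending anything. A rough sketch (file names are the ones in a standard ReadyNAS log bundle; adjust if yours differ):

# The maintenance history is at the end of volume.log
tail -n 20 volume.log

# Scan the system and kernel logs for disk or btrfs errors
grep -iE 'btrfs|error|ata' system.log kernel.log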
- Sandshark - Sensei
One possibility is that the new drive is unhealthy, though not totally dead. It does happen. I recommend testing it in a PC (connected via a USB dock or internal SATA) with SeaTools.
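If a Windows PC with SeaTools isn't handy, roughly the same check can be run from any Linux machine with smartmontools (a generic sketch; replace /dev/sdX with the actual device):

# Start the drive's built-in long self-test (can take several hours)
smartctl -t long /dev/sdX

# When it finishes, review the self-test log and health attributes
smartctl -a /dev/sdX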
- JDFuzz - Aspirant
Hey, Thanks for taking the time to reply!
It turns out that my essential files were only around 300GB, which I could offload before performing a factory reset.
After the fresh install and resync, it runs perfectly now!
So can we safely rule out hardware at this point?
The only thing I had noticed prior to all this was that my Videos share was showing that I was using hundreds of thousands of GB, much more than I actually have on the NAS. I didn't pay it any attention as it didn't affect anything.
I would love to find out what happened here for others or future me. Luckily I downloaded the logs prior to the reset.
Below is the bottom of my volume.log
=== maintenance history ===
device   operation   start_time            end_time              result     details
-------- ----------- --------------------- --------------------- ---------- --------
data     resilver    2018-03-22 19:29:43   2018-03-23 05:38:19   completed
data     resilver    2020-10-19 19:23:52   2020-10-19 19:24:03   degraded
data     resilver    2020-10-25 00:00:51
data     resilver    2020-10-25 11:19:14   2020-10-26 06:28:40   degraded
data     disk test   2020-10-26 09:41:47
data     resilver    2020-10-26 10:42:41   2020-10-26 20:48:23   degraded
- StephenB - Guru - Experienced User
I suggest running the disk test from the volume settings wheel (after the volume is fully synced). There might still be an issue with one of the two disks.
The incorrect size of the video share suggests that there was some btrfs file system corruption on the first disk.
As far as the logs go, I suggest examining system.log and kernel.log for btrfs or disk error messages. Look in the time window when the last resilvering was being done - 10:42:41 to 20:48:23 on 26 Oct.
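Something like this works as a rough filter, assuming the usual syslog-style timestamps ("Oct 26 HH:MM:SS") in kernel.log; adjust the patterns to whatever your log actually uses:

# Pull the Oct 26 entries, then narrow to disk/filesystem errors
grep '^Oct 26' kernel.log | grep -iE 'btrfs|error|ata'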