Chappy316
Nov 11, 2021 (Aspirant)
Chances of Recovery
Hoping to find some sliver of light at the end of what appears to be a very dark tunnel. It appears that I have lost two of the four drives (in a very short time period) ...
Chappy316
Nov 12, 2021 (Aspirant)
To avoid cluttering up someone else's thread, it appears that Matt-I is having similar issues to me.
Someone else tagged rn_enthusiast in his thread. I will be taking the suggestions he made to see what we can come up with. Hopefully things work out for the best.
- rn_enthusiast, Nov 12, 2021 (Virtuoso)
Hi Chappy316
Your raid is broken because you have a dual disk failure in a raid 5.
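To illustrate why a second failure is fatal, here is a minimal sketch of single-parity protection using XOR, which is the general idea behind RAID 5. The block contents and the helper name `xor_blocks` are made up for illustration; this is not the NAS's actual on-disk layout.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks plus one parity block (a 4-disk stripe).
d1, d2, d3 = b"disk", b"data", b"here"
parity = xor_blocks([d1, d2, d3])

# One missing block can be rebuilt from the survivors and the parity.
rebuilt_d3 = xor_blocks([d1, d2, parity])

# With two blocks missing, the survivors only give the XOR of the two
# lost blocks, not either block on its own.
leftover = xor_blocks([d1, parity])
```

With one disk gone, every block can still be rebuilt. With two gone, the remaining information is not enough, which is why the volume is reported as dead.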
Disk 3 started to show signs of failure back in December 2020. I don't think you were notified as it seems the alert system failed to send you messages (either not configured or misconfigured).
[20/12/01 22:47:30 EST] warning:system:LOGMSG_SENT_ALERT_MESG_FAILED Alert message failed to send.
Evident in the logs is that disk 3 was steadily getting worse throughout the year and eventually the disk was kicked from the raid.
[21/06/13 17:25:27 EDT] warning:volume:LOGMSG_HEALTH_VOLUME Volume data health changed from Redundant to Degraded.
[21/06/13 17:25:31 EDT] err:disk:LOGMSG_ZFS_DISK_STATUS_CHANGED Disk in channel 3 (Internal) changed state from ONLINE to FAILED.
In July, it appears the disk was pulled from the bay and re-added which initiated a raid re-sync.
[21/07/08 19:23:58 EDT] warning:disk:LOGMSG_DELETE_DISK Disk Model:WDC WD4000FYYZ-01UL1B2 Serial:WD-WMC130D08XJ7 was removed from Channel 3 of the head unit.
[21/07/08 19:26:45 EDT] notice:disk:LOGMSG_ADD_DISK Disk Model:WDC WD4000FYYZ-01UL1B2 Serial:WD-WMC130D08XJ7 was added to Channel 3 of the head unit.
[21/07/08 19:26:58 EDT] notice:volume:LOGMSG_RESILVERSTARTED_VOLUME Resyncing started for Volume data.
However, the disk was in too poor a state and the resync never completed, so from this point on the raid stayed degraded, with no redundancy.
Then in October, disk 4 was kicked out of the raid and the volume was declared "dead" (dual disk failure).
[21/10/04 18:03:11 EDT] notice:volume:LOGMSG_HEALTH_VOLUME Volume data health changed from Degraded to Dead.
[21/10/04 18:04:09 EDT] err:disk:LOGMSG_ZFS_DISK_STATUS_CHANGED Disk in channel 4 (Internal) changed state from ONLINE to FAILED.
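If you want to pull this kind of timeline out of the log yourself, a rough Python sketch could look like the following. It assumes the line layout seen in the excerpts above (`[yy/mm/dd hh:mm:ss TZ] level:category:EVENT message`); the names `parse_line` and `health_changes` are my own, not part of any Netgear tool.

```python
import re
from datetime import datetime

# Layout assumed from the excerpts above: [yy/mm/dd hh:mm:ss TZ] level:category:EVENT message
LOG_RE = re.compile(
    r"\[(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (?P<tz>[A-Z]+)\] "
    r"(?P<level>\w+):(?P<category>\w+):(?P<event>\w+) (?P<message>.*)"
)

def parse_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LOG_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # The timestamp is kept as local time; the zone label is stored but not applied.
    entry["when"] = datetime.strptime(entry["ts"], "%y/%m/%d %H:%M:%S")
    return entry

def health_changes(lines):
    """Return (when, old_state, new_state) for every volume health transition."""
    changes = []
    for line in lines:
        entry = parse_line(line)
        if entry and entry["event"] == "LOGMSG_HEALTH_VOLUME":
            states = re.search(r"changed from (\w+) to (\w+)", entry["message"])
            if states:
                changes.append((entry["when"], states.group(1), states.group(2)))
    return changes

SAMPLE = [
    "[21/06/13 17:25:27 EDT] warning:volume:LOGMSG_HEALTH_VOLUME Volume data health changed from Redundant to Degraded.",
    "[21/06/13 17:25:31 EDT] err:disk:LOGMSG_ZFS_DISK_STATUS_CHANGED Disk in channel 3 (Internal) changed state from ONLINE to FAILED.",
    "[21/10/04 18:03:11 EDT] notice:volume:LOGMSG_HEALTH_VOLUME Volume data health changed from Degraded to Dead.",
]

for when, old, new in health_changes(SAMPLE):
    print(when.isoformat(sep=" "), old, "->", new)
```

Run against the full log, this would list every health transition in order, which makes it easier to spot how long the volume sat in a degraded state.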
Disk 4 is not 100% healthy but not too bad either. 19 bad sectors on the disk and I don't see any real complaints about the disk in the kernel logs. However, it is still not a healthy disk and clearly it encountered a failure on the 4th of Oct.
Below is the current state of your disk 3 and disk 4.
---> Disk 3
Device: sdb
Controller: 0
Channel: 2
Model: WDC WD4000FYYZ-01UL1B2
Serial: WD-WMC130D08XJ7
Firmware: 01.01K03
Class: SATA
RPM: 7200
Sectors: 7814037168
Pool: data-0
PoolType: RAID 5
PoolState: 5
PoolHostId: 2fe5a296
Health data
  ATA Error Count: 2554
  Reallocated Sectors: 1300
  Reallocation Events: 125
  Spin Retry Count: 0
  Current Pending Sector Count: 864
  Uncorrectable Sector Count: 148
  Temperature: 45
  Start/Stop Count: 33
  Power-On Hours: 36077
  Power Cycle Count: 33
  Load Cycle Count: 3

---> Disk 4
Device: sdc
Controller: 0
Channel: 3
Model: WDC WD4000FYYZ-01UL1B2
Serial: WD-WMC130D3MD5Z
Firmware: 01.01K03
Class: SATA
RPM: 7200
Sectors: 7814037168
Pool: data-0
PoolType: RAID 5
PoolState: 5
PoolHostId: 2fe5a296
Health data
  ATA Error Count: 0
  Reallocated Sectors: 0
  Reallocation Events: 0
  Spin Retry Count: 0
  Current Pending Sector Count: 19
  Uncorrectable Sector Count: 19
  Temperature: 44
  Start/Stop Count: 32
  Power-On Hours: 37414
  Power Cycle Count: 32
  Load Cycle Count: 3
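As a quick way to compare the two disks, the sketch below flags the error counters that are above zero. The values are copied from the health data above; the choice of which counters to watch, and the helper name `nonzero_counters`, are my own assumptions rather than an official health rule.

```python
# Counters that ideally stay at zero on a healthy disk (my own selection).
WATCHED = (
    "ATA Error Count",
    "Reallocated Sectors",
    "Current Pending Sector Count",
    "Uncorrectable Sector Count",
)

def nonzero_counters(health):
    """Return the watched counters whose value is greater than zero."""
    return {name: health[name] for name in WATCHED if health.get(name, 0) > 0}

# Values taken from the health data listed above.
disk3 = {
    "ATA Error Count": 2554,
    "Reallocated Sectors": 1300,
    "Current Pending Sector Count": 864,
    "Uncorrectable Sector Count": 148,
}
disk4 = {
    "ATA Error Count": 0,
    "Reallocated Sectors": 0,
    "Current Pending Sector Count": 19,
    "Uncorrectable Sector Count": 19,
}

for label, health in (("Disk 3", disk3), ("Disk 4", disk4)):
    print(label, nonzero_counters(health))
```

Disk 3 trips every watched counter, while disk 4 only shows the 19 pending and uncorrectable sectors.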
Disk 1 and 2 are healthy as is. Disk 3 is likely a write-off at this stage (but keep it until data is recovered). The best option here is to clone disk 4 and use a healthy cloned disk to re-assemble the raid. That should absolutely be possible given that disk 4 is not completely dead, which I don't see it being.
My advice would be to opt for some paid data recovery support with Netgear and let them help clone disk 4 and re-assemble the raid. You could possibly even manually re-assemble the raid with the current disk 4, but I would not risk it as it is not a fully healthy disk. I think a paid support contract is worth it here as the chances of a successful recovery are very high, in my opinion.
I would also advise that you get the alerts set up and working. This way you will be notified by email if disks are failing and you won't end up in the same situation again. Backups are of course also important, and I am sure many people here on the forum can give advice on that and on which strategies they use.
Cheers
- Chappy316, Nov 13, 2021 (Aspirant)
Hey rn_enthusiast,
For starters, thank you very much for the help and insight toward hopefully resolving this problem.
Just some follow-up, then a couple of questions.
Disk 3 was removed from the array in July, with some guidance from other users on this forum, to hopefully jump start it back to life. The suggestion was made to start searching for replacement/upgrade options so the array could be made fully redundant again. I did not realize that it never fully reinitialized. While I was deciding what I wanted for a replacement, we unfortunately got to where we are now.
In the future, I would guess your suggestion would be to not pull a drive that is potentially, or likely from what it looked like, failing to avoid a chance of breaking the array?
Also, I am getting an external backup solution (some sort of external USB) for the highly sensitive files in the array. Do you have one here that you would recommend brand or size wise? I was looking at a WD Essentials as I have always used their internal drives in personal builds in the past and never had any major issues.
So, a couple of questions on trying to recover what is left.
What is the process of going through NetGear for paid support to clone Disk 4 (and/or Disk 3 for that matter), and do you have a rough idea of the cost of this process? (I know you were a former employee so it's just a question, nothing I would hold you to. Just looking for a rough idea.)
Is the cloning process something I could do at home myself? If yes, would that be a faster and cheaper first attempt to fixing the array? Also, is there any more damage that can be caused if I made an initial attempt at home and then had to revert to NetGear?
When attempting to clone Disk 4 (and/or Disk 3) either at home or with NetGear, can the drive size be upgraded at that time? The initial reason I came here was to look into expanding the size of the array. Currently they are 4TB drives; could I clone one or both of them to larger drives and restart the array with more space? The ultimate plan is to upgrade the whole array, but I was cautioned away from it in July. The status of Disk 3 scared away a couple of users, who feared that another drive in the array might be close to end of life. Looks like they were right.
If upgrading the HDD size is not an option, would I need to buy a 4TB drive that "matches" what is currently in the array, or would anything of similar size be acceptable to clone to regardless of brand or model?
As far as alerts go, apparently it doesn't want to play well with gmail or I am missing something simple. I tried getting it set up with my account so I can receive them and Google won't allow the simple one button sign in, even after I turned on the ability to use less secure apps. Attempting to manually enter the email credentials doesn't help either as it throws an SMTP error when sending a test message.
Again, thank you for any and all insight into this.
Chris
- rn_enthusiast, Nov 13, 2021 (Virtuoso)
Hi Chappy316
As for external USBs, I have had WD Elements USB drives attached to my NAS for a while with no problems. StephenB and Sandshark might know more about USB compatibility in general, but I have had good success with WD Elements 2TB and 4TB, personally.
The cost of the data recovery contract, I think, was somewhere in the region of $100-$150, but it has been a good few years since I worked there. More extensive data recovery work could require extra cost from what I remember. Maybe StephenB knows the price better? In any case, it will be far cheaper than any regular data recovery service.
As for doing it yourself, you theoretically can. Netgear aren't doing anything magical here, but it requires a bit of knowledge. The other advantage of using Netgear is that they can use the NAS itself for the cloning process, which makes it easier for you. One would need to examine the raid superblocks to make sure the correct disk is cloned and used to re-assemble the raid. Based on the logs, it looks like disk 4 is the one to clone and use for the raid assembly, but examining the raid superblock would still be prudent. The next step is to monitor the cloning process and assess whether the clone was fully successful, and then manually assemble the raid array using disk 1, disk 2 and the cloned disk 4.
The issue is that it requires some knowledge and/or experience to do this. There are pitfalls: cloning in the wrong direction or re-assembling the raid incorrectly can lead to total data loss. In any case, I would imagine that you would need Netgear to at least help assemble the raid, so having them also start the cloning process (which takes many hours to finish) would probably make sense. I don't imagine that would add a lot to the cost of the work, and it makes it safer for you. The replacement drive can be the same size or larger; I am 99% sure of this. Either should do fine, and I don't imagine the clone process will have any issues with a larger target drive, but checking with Netgear is the best thing here.
As for the alerts, I don't use gmail myself (due to privacy stance and tinfoil hats and so on :) ). But the NAS email service is like a forwarding agent that essentially logs in to your gmail account and sends the mail to yourself. This is something I think gmail might block by default. I am sure it is possible to get working, and the two guys that I tagged earlier in the post probably know more about this than I do. I will let them chime in on that.
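One way to narrow down the SMTP error is to test the same credentials from a PC, outside the NAS. Below is a minimal Python sketch of that idea. It assumes Gmail's standard SMTP endpoint (smtp.gmail.com, port 587 with STARTTLS) and a Google app password, which as far as I know requires 2-Step Verification on the account. The function names and the example addresses are placeholders of my own, and this is not how the NAS itself sends mail.

```python
import smtplib
from email.message import EmailMessage

def build_test_message(sender, recipient):
    """Build a small plain-text test mail."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "NAS alert test"
    msg.set_content("Test message to check SMTP credentials.")
    return msg

def send_test(sender, recipient, password, host="smtp.gmail.com", port=587):
    """Send the test mail over STARTTLS; raises smtplib.SMTPException on failure."""
    msg = build_test_message(sender, recipient)
    with smtplib.SMTP(host, port, timeout=30) as server:
        server.starttls()                # upgrade the connection to TLS
        server.login(sender, password)   # assumed to be an app password
        server.send_message(msg)

# Example call (placeholders, not real credentials):
# send_test("you@gmail.com", "you@gmail.com", "your-app-password")
```

If this works from a PC but the NAS test mail still fails, the problem is more likely in the NAS settings (server, port, TLS option) than on the Google side.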
Cheers