Re: ReadyNas 526X "Data: dead"?
ReadyNAS 526x
Firmware: 6.9.4 Hotfix 1 (will update to the latest version once all this is sorted, or once I'm happily convinced I'm not going to have to do a factory default).
6 drives, all WD Reds, in X-RAID.
Alright.. so I ran into a goofy issue here that's VERY similar to the situation reported here:
https://community.netgear.com/t5/Using-your-ReadyNAS-in-Business/ReadyNAS-in-Lifesupport/td-p/175258...
Had a drive die on me a few days ago, and installed a new drive this morning. It ran for approximately 9 hours, resyncing.. no issues at all.
However, once the resync completed, the device immediately reported that drive #1 had failed, and it is now showing 'data: DEAD' on the front panel. Additionally, the unit is showing red LEDs for the brand new drive and for the drive in slot 1.
Admin panel is up, and is showing drive 2 as good and drive one as red. (Again, that's not what the chassis is reporting via the LEDs, however.)
Resync info reads as such:
May 25, 2019 08:11:09 PM | Volume: Volume data is resynced.
May 25, 2019 08:09:36 PM | Disk: Disk in channel 2 (Internal) changed state from RESYNC to ONLINE.
May 25, 2019 08:09:15 PM | Disk: Disk in channel 1 (Internal) changed state from ONLINE to FAILED.
May 25, 2019 08:08:54 PM | Volume: Volume data health changed from Degraded to Dead.
May 25, 2019 12:28:17 PM | Volume: Resyncing started for Volume data.
May 25, 2019 12:27:46 PM | Disk: Disk Model:WDC WD40EFRX-68N32N0 Serial:WD-XXXXXXX was added to Channel 2 of the head unit.
May 25, 2019 12:24:33 PM | System: ReadyNASOS background service started.
May 25, 2019 12:24:32 PM | Volume: Volume data is Degraded.
I can still access all files and everything is intact. I even went so far as to pull data at random and run MD5 checks against it.
Drive channel 0 shows a pending sector count of 20, which makes me think this thing is convinced there are bad sectors on drive one... which is possible, but shouldn't it just show the volume as degraded again?
The timestamps on the resync are what's throwing me for a loop. Literally, within 60 seconds of a completed resync we get a mysterious 'dead' indicator.
I'm willing to concede that perhaps drive 1 is dealing with some bad sectors, but if the volume is indeed 'dead', then the data should be inaccessible.
Currently in the process of grabbing a backup of the data on the unit that we can't easily reproduce. Will provide logs if needed. It just seems odd that everything is intact, and that everything went 'wrong' right at the end of a resync.
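For anyone wanting to check the ordering of the events in the resync log, here is a minimal Python sketch. It assumes the timestamps pasted in the post are accurate and all in the same time zone, and that the script runs under an English locale (the month name and AM/PM parsing depend on it). The `timeline` helper and the shortened messages are just for illustration. Read this way, the log puts the "Dead" state roughly 7 hours 40 minutes after the resync started, and a bit over two minutes before the resync is reported as complete.

```python
from datetime import datetime

# Events copied from the resync log in the post (timestamp, shortened message).
EVENTS = [
    ("May 25, 2019 08:11:09 PM", "Volume data is resynced"),
    ("May 25, 2019 08:09:36 PM", "Disk in channel 2 changed state from RESYNC to ONLINE"),
    ("May 25, 2019 08:09:15 PM", "Disk in channel 1 changed state from ONLINE to FAILED"),
    ("May 25, 2019 08:08:54 PM", "Volume data health changed from Degraded to Dead"),
    ("May 25, 2019 12:28:17 PM", "Resyncing started for Volume data"),
]

# 12-hour clock with AM/PM, as shown in the admin panel log.
FMT = "%B %d, %Y %I:%M:%S %p"


def timeline(events):
    """Return (seconds since the first event, message) in chronological order."""
    parsed = sorted((datetime.strptime(ts, FMT), msg) for ts, msg in events)
    start = parsed[0][0]
    return [(int((when - start).total_seconds()), msg) for when, msg in parsed]


if __name__ == "__main__":
    for offset, msg in timeline(EVENTS):
        hours, rest = divmod(offset, 3600)
        minutes, seconds = divmod(rest, 60)
        print(f"+{hours:02d}:{minutes:02d}:{seconds:02d}  {msg}")
```

The offsets are relative to the "Resyncing started" entry, so the gap between any two events is just the difference of their offsets.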
Re: ReadyNas 526X "Data: dead"?
Update to this... while backing up data off of this device.. THIS has now happened:
May 25, 2019 10:00:47 PM | Volume: Volume data health changed from Dead to Redundant.
May 25, 2019 08:11:09 PM | Volume: Volume data is resynced.
May 25, 2019 08:09:36 PM | Disk: Disk in channel 2 (Internal) changed state from RESYNC to ONLINE.
May 25, 2019 08:09:15 PM | Disk: Disk in channel 1 (Internal) changed state from ONLINE to FAILED.
May 25, 2019 08:08:54 PM | Volume: Volume data health changed from Degraded to Dead.
(see attached image) Not exactly sure what's going on now... oO It appears to have created another volume, with a smaller size... ???
To coin a phrase... there are 3 letters.. a letter W... a letter T... and a letter F.....
Re: ReadyNas 526X "Data: dead"?
Hi @Shadowlore
It seems there are issues with the RAID configuration or with the disks that failed, but I would recommend contacting NETGEAR Support so they can properly assist you with fixing the RAID. Please note, though, that if data recovery is needed, there would be a separate charge. Also, if you are out of warranty, you may have to purchase a support contract; a pay-per-incident contract ($75) would do.
You can contact them through my.netgear.com by creating an online case.
HTH
Regards
Re: ReadyNas 526X "Data: dead"?
Yeah, already looking into that route, as well as a few others.
If the drives had come back as bad, via WD, then I'd at least understand it... but the part I don't really get is that the data was mostly intact (thumbnail files were reporting as bad...), so I was able to copy MOST of the data off.
Powered up another ReadyNAS unit to start a 10Gb backup, but wanted to move the device to a different UPS first, just in case (got some storms rolling in). After the reboot, it seems to have forgotten its volumes, even when I've tried to mount them read-only.
Re: ReadyNas 526X "Data: dead"?
This definitely is weird. Have you grabbed the log zip file?
Did you always have 6x4TB in the unit? I'm thinking that you might have started with some smaller drives (which would give you multiple RAID groups in the array). Then something happened that put one of multiple RAID groups out of sync.
FWIW, one thing I recently discovered with my WD Reds - sometimes unrecoverable reads don't end up incrementing the pending sector counts. I found several UNCs when I ran smartctl -x on one of the drives with ssh. When I tested that drive with Lifeguard, it failed. Someone else here tried that test (at my suggestion) and also uncovered a failed drive that way.
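As a rough illustration of the check described above, the sketch below scans saved `smartctl -x` output for lines carrying the UNC (uncorrectable) flag. It assumes the output was first captured to text (for example over ssh) and that UNC appears as a standalone token on the relevant lines; the exact layout of the error log varies by drive and smartmontools version. The `find_unc_lines` helper and the placeholder sample are hypothetical, and the sample is not real smartctl output.

```python
import re

# Assumption: uncorrectable read errors show up with "UNC" as a standalone token.
UNC_PATTERN = re.compile(r"\bUNC\b")


def find_unc_lines(smartctl_text):
    """Return the lines of saved `smartctl -x` output that mention UNC."""
    return [
        line.strip()
        for line in smartctl_text.splitlines()
        if UNC_PATTERN.search(line)
    ]


# Placeholder text for illustration only; this is NOT real smartctl output.
SAMPLE = """\
placeholder attribute line, pending sector count 0
placeholder error entry: UNC at some LBA
placeholder error entry: UNC at another LBA
placeholder line with the word FUNCTION, which must not match
"""

if __name__ == "__main__":
    hits = find_unc_lines(SAMPLE)
    print(f"{len(hits)} line(s) mention UNC")
    for line in hits:
        print("  " + line)
```

A non-zero count here is only a hint to test the drive properly (for example with the vendor diagnostic), not a verdict on its own.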
Also, in general I've found that the destructive write-zeros test in Lifeguard sometimes finds issues that the non-destructive test misses (and vice versa). Though you shouldn't run that test until you sort out your data loss.
Re: ReadyNas 526X "Data: dead"?
Yeah.. got the log prior to everything going sideways, and after. (if ya wanna look at them, let me know.. happy to share)
You are 100% correct. I originally had 4 drives in the unit (ported over from an Ultra4) and then over the years I've expanded and replaced drives as they die.
Right now, I'm seeing 3 'volumes'.
Volume 1 (data) shows as being 0TB.
Volume 2 (data-0) shows as being 8TB
Volume 3 (data-1) shows as being 8TB
The original volume (just 'data') was 19TB (actually running 6 drives, 4TB each, obviously)
Right now, my wife and daughter are about to kill me, since the backup hadn't run in a while (partially my fault, partially theirs)... and my daughter's entire senior year of school was on that volume.
I was able to recover a large number of files prior to the reboot, but the photos were so large that the plan was just to reboot into read-only mode and let all of the photos sync to our cloud storage... but needless to say, the reboot caused this weird split, now.
Ran an in-depth scan on the drives last night, and found that the 'failed' second drive is now showing a pending SMART error (but it wasn't showing it last night.. which is odd...), so I'm on my way to Microcenter to buy yet another drive.
If anyone has any recommendations, I'm more than up for suggestions, at this point. I know at least a chunk of the data is likely gone.. but at this point, I'm just trying to recover what I can. *facedesk*
mdstat.log prior to reboot:
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid10 sdf2[5] sde2[4] sdd2[3] sdc2[2] sdb2[1] sda2[0]
1569792 blocks super 1.2 512K chunks 2 near-copies [6/6] [UUUUUU]
md126 : active raid5 sdf4[6](S) sdd4[0] sda4[5](F) sdb4[3] sdc4[2] sde4[1]
9766874560 blocks super 1.2 level 5, 64k chunk, algorithm 2 [6/4] [UUUU__]
md127 : active raid5 sdf3[10] sda3[11] sde3[7] sdd3[8] sdc3[9] sdb3[6]
9743313920 blocks super 1.2 level 5, 64k chunk, algorithm 2 [6/6] [UUUUUU]
bitmap: 0/15 pages [0KB], 65536KB chunk
md0 : active raid1 sdf1[10] sda1[11] sde1[7] sdd1[8] sdc1[9] sdb1[6]
4190208 blocks super 1.2 [7/6] [UUUUUU_]
unused devices: <none>
/dev/md/0:
Version : 1.2
Creation Time : Tue Oct 8 22:29:30 2013
Raid Level : raid1
Array Size : 4190208 (4.00 GiB 4.29 GB)
Used Dev Size : 4190208 (4.00 GiB 4.29 GB)
Raid Devices : 7
Total Devices : 6
Persistence : Superblock is persistent
Update Time : Sat May 25 20:49:03 2019
State : clean, degraded
Active Devices : 6
Working Devices : 6
Failed Devices : 0
Spare Devices : 0
Name : ***REMOVED:0 (local to host ***REMOVED)
UUID : ***REMOVED
Events : 4368189
Number Major Minor RaidDevice State
11 8 1 0 active sync /dev/sda1
10 8 81 1 active sync /dev/sdf1
6 8 17 2 active sync /dev/sdb1
9 8 33 3 active sync /dev/sdc1
8 8 49 4 active sync /dev/sdd1
7 8 65 5 active sync /dev/sde1
- 0 0 6 removed
/dev/md/data-0:
Version : 1.2
Creation Time : Tue Oct 8 22:29:31 2013
Raid Level : raid5
Array Size : 9743313920 (9291.95 GiB 9977.15 GB)
Used Dev Size : 1948662784 (1858.39 GiB 1995.43 GB)
Raid Devices : 6
Total Devices : 6
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Sat May 25 16:40:56 2019
State : clean
Active Devices : 6
Working Devices : 6
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
Name : ***REMOVED:data-0 (local to host ***REMOVED)
UUID : ***REMOVED
Events : 49070
Number Major Minor RaidDevice State
11 8 3 0 active sync /dev/sda3
10 8 83 1 active sync /dev/sdf3
6 8 19 2 active sync /dev/sdb3
9 8 35 3 active sync /dev/sdc3
8 8 51 4 active sync /dev/sdd3
7 8 67 5 active sync /dev/sde3
/dev/md/data-1:
Version : 1.2
Creation Time : Mon Jul 27 20:52:39 2015
Raid Level : raid5
Array Size : 9766874560 (9314.42 GiB 10001.28 GB)
Used Dev Size : 1953374912 (1862.88 GiB 2000.26 GB)
Raid Devices : 6
Total Devices : 6
Persistence : Superblock is persistent
Update Time : Sat May 25 20:48:30 2019
State : clean, FAILED
Active Devices : 4
Working Devices : 5
Failed Devices : 1
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 64K
Name : *** REMOVED:data-1 (local to host ***REMOVED)
UUID : *** REMOVED
Events : 39160
Number Major Minor RaidDevice State
0 8 52 0 active sync /dev/sdd4
1 8 68 1 active sync /dev/sde4
2 8 36 2 active sync /dev/sdc4
3 8 20 3 active sync /dev/sdb4
- 0 0 4 removed
- 0 0 5 removed
5 8 4 - faulty /dev/sda4
6 8 84 - spare /dev/sdf4
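As a sanity check on the sizes in the mdadm output above, here is a small Python sketch. It uses the usual RAID 5 rule that one member's worth of space goes to parity, so usable capacity is (members - 1) x per-member size, and it assumes mdadm is reporting sizes in 1 KiB blocks. The `GROUPS` table and helper names are only for illustration; the numbers are copied from the output.

```python
# Figures copied from the mdadm output above (sizes in 1 KiB blocks).
GROUPS = {
    "data-0": {"raid_devices": 6, "used_dev_size": 1948662784, "array_size": 9743313920},
    "data-1": {"raid_devices": 6, "used_dev_size": 1953374912, "array_size": 9766874560},
}


def raid5_capacity(raid_devices, used_dev_size):
    """Usable RAID 5 capacity: one member's worth of space is used for parity."""
    return (raid_devices - 1) * used_dev_size


def kib_to_tb(kib):
    """Convert 1 KiB blocks to decimal terabytes (10**12 bytes)."""
    return kib * 1024 / 1e12


if __name__ == "__main__":
    total = 0
    for name, g in GROUPS.items():
        capacity = raid5_capacity(g["raid_devices"], g["used_dev_size"])
        matches = capacity == g["array_size"]
        total += capacity
        print(f"{name}: {kib_to_tb(capacity):.2f} TB, matches Array Size: {matches}")
    print(f"both groups together: {kib_to_tb(total):.2f} TB")
```

Both groups work out to exactly the reported Array Size, which fits the picture of one data volume spread across two 6-member RAID 5 groups of roughly 10 TB each.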
mdstat.log post reboot:
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid10 sde2[4] sdd2[3] sdc2[2] sdb2[1] sda2[0]
1308160 blocks super 1.2 512K chunks 2 near-copies [5/5] [UUUUU]
md0 : active raid1 sda1[10] sdb1[6] sde1[7] sdd1[8] sdc1[9]
4190208 blocks super 1.2 [7/5] [U_UUUU_]
unused devices: <none>
/dev/md/0:
Version : 1.2
Creation Time : Tue Oct 8 22:29:30 2013
Raid Level : raid1
Array Size : 4190208 (4.00 GiB 4.29 GB)
Used Dev Size : 4190208 (4.00 GiB 4.29 GB)
Raid Devices : 7
Total Devices : 5
Persistence : Superblock is persistent
Update Time : Sun May 26 02:32:42 2019
State : clean, degraded
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Name : ***REMOVED:0 (local to host ***REMOVED)
UUID : ***REMOVED
Events : 4370599
Number Major Minor RaidDevice State
10 8 1 0 active sync /dev/sda1
- 0 0 1 removed
6 8 17 2 active sync /dev/sdb1
9 8 33 3 active sync /dev/sdc1
8 8 49 4 active sync /dev/sdd1
7 8 65 5 active sync /dev/sde1
- 0 0 6 removed
Re: ReadyNas 526X "Data: dead"?
I suggest cloning the disks with SMART errors (using a utility that does sector by sector cloning). Probably Netgear support is your best pathway to get the volume to mount (and if needed do data recovery).
@Shadowlore wrote:
Yeah.. got the log prior to everything going sideways, and after. (if ya wanna look at them, let me know.. happy to share)
Probably @JohnCM_S or @Hopchen (former Netgear) are the right folks to take a look. Send the link in a PM.
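To make the sector-by-sector cloning idea concrete, here is a deliberately simplified Python sketch of what such a tool does: copy fixed-size blocks in order, and when a block cannot be read, record its offset and write zeros in its place so everything after it stays aligned. This is an illustration only, not a recovery tool. It is written against ordinary image files (it gets the size from the file system, which does not work for raw block devices), and `clone_image` is a hypothetical helper. Dedicated cloning software and hardware replicators handle retries and error maps far more carefully.

```python
import os


def clone_image(src_path, dst_path, block_size=4096):
    """Copy src to dst block by block, zero-filling blocks that fail to read.

    Returns the list of byte offsets that could not be read in full.
    Assumes src_path is a regular file (for example a disk image).
    """
    bad_offsets = []
    size = os.path.getsize(src_path)
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        offset = 0
        while offset < size:
            want = min(block_size, size - offset)
            try:
                src.seek(offset)
                data = src.read(want) or b""
            except OSError:
                data = b""  # treat an I/O error as an unreadable block
            if len(data) < want:
                # Keep the copy aligned: pad the unreadable part with zeros.
                bad_offsets.append(offset)
                data += b"\x00" * (want - len(data))
            dst.write(data)
            offset += want
    return bad_offsets


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "source.img")
        dst = os.path.join(tmp, "clone.img")
        with open(src, "wb") as f:
            f.write(os.urandom(10000))
        print("unreadable offsets:", clone_image(src, dst))
```

The key design point is that a bad block never stops the copy and never shifts later data, which is what lets a RAID member be rebuilt from the clone afterwards.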
Re: ReadyNas 526X "Data: dead"?
Do you want the logs yourself, or do you just want me to forward them? (I'd prefer not to blindly send them logs without them knowing ahead of time why I'm sending them.)
As for having others do it for me, the fact is this is the sort of thing I do for a living, so I always prefer to do the legwork myself.
Actually witnessed a few of the SSH commands, here: https://community.netgear.com/t5/Using-your-ReadyNAS-in-Business/How-to-recreate-X-RAID-volume-after...
But.. for now, I suppose I need to wait for this clone to finish, first.
Re: ReadyNas 526X "Data: dead"?
@Shadowlore wrote:
You want the logs, yourself, or you just wanting me to forward it to them? (I'd prefer to not just blindly send them logs, without their ahead knowledge of why I'm sending it)
I think they probably can analyze them better than I can. They should get a notification since I used the @ to loop them in. So maybe wait a bit and see if they chime in.
Re: ReadyNas 526X "Data: dead"?
Cool deal.. thx Stephen. Been calmly waiting on this clone to finish. (Hardware replicators are so nice.. plug it in, start the clone, walk away.. no need to even leave the PC on. 😉)
Currently showing about 75% done.. the fact that it didn't fail, immediately, at least tells me there's something to hope for.
Honestly think after I rebuild this, I'm gonna consider a different raid level.
I've had multiple raids over the years (work mainly) that have become degraded due to failed drives.. but after 25 years of IT work, I've never had the experience of having 2 drives die at the same time.. heard of it happening, and due to that, I've always built work devices to something more than raid 5, but never in a million years did I expect this to happen at home.
Also have a 4360X, here, that I don't use very often (usually it's the backup for the primary 526X NAS.. the thing is noisy as heck).. might even consider, after this is all done, doing a much higher level of raid on that thing... but will need to look into some quieter fans.. cause.. sheesh.. the thing is louder than my vSphere environment (3x HP DL380 Gen7s).
Re: ReadyNas 526X "data dead"?
This is the part from the log that shows the problem:
md126 : active raid5 sdf4[6](S) sdd4[0] sda4[5](F) sdb4[3] sdc4[2] sde4[1]
9766874560 blocks super 1.2 level 5, 64k chunk, algorithm 2 [6/4] [UUUU__]
md127 : active raid5 sdf3[10] sda3[11] sde3[7] sdd3[8] sdc3[9] sdb3[6]
9743313920 blocks super 1.2 level 5, 64k chunk, algorithm 2 [6/6] [UUUUUU]
Those two RAID groups are where your data volume resides. The 6/4 means that you are missing that partition from two of the six drives. The sda4, sdb4, etc. means the partition that's missing is 4 (the newest one from when you expanded).
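The reading of those mdstat lines can be sketched in a few lines of Python. The parser below pulls out the `[expected/active]` counts and the members flagged `(F)` (faulty) or `(S)` (spare), then applies the RAID 5 rule that the array survives one missing member but not two. The function names and the status labels ("redundant", "degraded", "failed") are just for this illustration, and the regular expressions only cover the line shapes quoted in this thread, not every /proc/mdstat variant.

```python
import re

# The two data RAID groups, copied from the mdstat output quoted above.
MDSTAT = """\
md126 : active raid5 sdf4[6](S) sdd4[0] sda4[5](F) sdb4[3] sdc4[2] sde4[1]
9766874560 blocks super 1.2 level 5, 64k chunk, algorithm 2 [6/4] [UUUU__]
md127 : active raid5 sdf3[10] sda3[11] sde3[7] sdd3[8] sdc3[9] sdb3[6]
9743313920 blocks super 1.2 level 5, 64k chunk, algorithm 2 [6/6] [UUUUUU]
"""

HEADER = re.compile(r"^(md\d+) : (\w+) (raid\d+) (.+)$")
MEMBER = re.compile(r"(\w+)\[\d+\](\([FS]\))?")
COUNTS = re.compile(r"\[(\d+)/(\d+)\]")


def parse_mdstat(text):
    """Return one dict per array with member flags and expected/active counts."""
    arrays = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            name, _state, level, members = header.groups()
            current = {"name": name, "level": level, "faulty": [],
                       "spare": [], "expected": None, "active": None}
            for dev, flag in MEMBER.findall(members):
                if flag == "(F)":
                    current["faulty"].append(dev)
                elif flag == "(S)":
                    current["spare"].append(dev)
            arrays.append(current)
            continue
        counts = COUNTS.search(line)
        if counts and current is not None and current["expected"] is None:
            current["expected"] = int(counts.group(1))
            current["active"] = int(counts.group(2))
    return arrays


def raid5_status(array):
    """RAID 5 tolerates exactly one missing member."""
    missing = array["expected"] - array["active"]
    if missing == 0:
        return "redundant"
    return "degraded" if missing == 1 else "failed"


if __name__ == "__main__":
    for a in parse_mdstat(MDSTAT):
        print(a["name"], f'{a["active"]}/{a["expected"]} active,',
              "faulty:", a["faulty"], "spare:", a["spare"],
              "->", raid5_status(a))
```

For md126 this reports sda4 as faulty and sdf4 as a spare that never became an active member, which leaves only four of six members and puts the group past what RAID 5 can tolerate.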
The data volume you see is the one the NAS still "remembers", but can't assemble. The data-0 and data-1 are remnants it "found" but also can't assemble, so it doesn't recognize them as being the correct "data".
It appears you have fallen victim to something that happens a lot, especially when all drives are the same age. The stress of re-syncing a new drive has taken down another drive before the sync completed. That's why a complete backup before a drive replacement is always a good idea.
Another RAID level can be good. But doing a better job with your backups would have put you a lot farther from the door of that doghouse. If it had been something other than a bad drive, a higher RAID level might still not have saved you.
Re: ReadyNas 526X "data dead"?
@Sandshark wrote:
This is the part from the log that shows the problem:
md126 : active raid5 sdf4[6](S) sdd4[0] sda4[5](F) sdb4[3] sdc4[2] sde4[1]
9766874560 blocks super 1.2 level 5, 64k chunk, algorithm 2 [6/4] [UUUU__]
Yes, it looks like sda failed during the resync of sdf. So cloning sda and/or the original sdf might help.
Though it always seems odd to me that mdadm doesn't mark a failed disk in all the partitions.
@Shadowlore wrote:
md0 : active raid1 sdf1[10] sda1[11] sde1[7] sdd1[8] sdc1[9] sdb1[6]
4190208 blocks super 1.2 [7/6] [UUUUUU_]
This part looks puzzling though (suggesting 7 disks, though the system only has 6 bays).
Re: ReadyNas 526X "data dead"?
@StephenB wrote:
@Sandshark wrote:
This is the part from the log that shows the problem:
md126 : active raid5 sdf4[6](S) sdd4[0] sda4[5](F) sdb4[3] sdc4[2] sde4[1]
9766874560 blocks super 1.2 level 5, 64k chunk, algorithm 2 [6/4] [UUUU__]
Yes, it looks like sda failed during the resync of sdf. So cloning sda and/or the original sdf might help.
Though it always seems odd to me that mdadm doesn't mark a failed disk in all the partitions.
@Shadowlore wrote:
md0 : active raid1 sdf1[10] sda1[11] sde1[7] sdd1[8] sdc1[9] sdb1[6]
4190208 blocks super 1.2 [7/6] [UUUUUU_]
This part looks puzzling though (suggesting 7 disks, though the system only has 6 bays).
When looking through the log files, this is exactly what I was wondering.. how on EARTH the thing determined there were 7 disks.. I have no idea.
The clone for Disk #1(of 6) finished last night.. now to see if I can do anything with this to pull ANY data at all. *sigh*
Re: ReadyNas 526X "Data: dead"?
@StephenB wrote:
I suggest cloning the disks with SMART errors (using a utility that does sector by sector cloning). Probably Netgear support is your best pathway to get the volume to mount (and if needed do data recovery).
Alright, I give.. how on earth do I contact support? The website is kind of awful.
Clicking on the support option, takes me to the place where they want me to purchase a support contract (fine) but then clicking that, brings me to a page where it says there's no support options available.
(This is where I miss some of my previous NG contacts, wouldn't have to jump through these hoops.. would just get a support person)
Re: ReadyNas 526X "Data: dead"?
Found the 'chat' option... got in touch with a 'Yvan'... who said he didn't handle that, so he xferred me to another department.. and then I was disconnected.
(This is actually the main reason I hate dealing with tech support from large companies... they all want to move to a 'chat' option, which always seem to have a number of tech issues)
Now to find an actual phone number.
Wow... So I finally got ahold of tech support, and Rene with tech support got the issue sorted out.
They actually discovered another issue with the unit.. here's the full info:
"There was some disputes on the superblock of /dev/sda so we adjusted it to sync with the other working drives. However /dev/sdb has not synced fully so I suggested to remove it from the array. Try to format /dev/sdb and hotplug it to the unit to sync again properly. Once you sync the new formatted drive, the array will be complete again and should not be degraded anymore."
Holy. Cow. My weekend is saved.
Now, then... to start this backup, before I issue a resync. 🙂
Re: ReadyNas 526X "Data: dead"?
@Shadowlore wrote:
Holy. Cow. My weekend is saved.
Thx for the update (and I'm glad to hear that everything was sorted out).
Any idea on the details of what Rene did to adjust /dev/sda?