RN316 - Rejoin drive to dead volume?

Question

I've been using my RN316 (OS 6.10.7) for many years.

X-Raid with 4TB+4TB +4TB+6TB drives.

I got a new 10 TB drive I was going to use for backup. I had trouble with the RN detecting it via esata or in a usb dock so I plugged it into bay 6 of the nas to see if it would detect. It was in there for about 15 seconds, then I pulled it out.

The array said degraded, but I didn't see how those 15 seconds would have caused any significant changes.

I continued copying data from my RN to other targets using teracopy to verify all files without any issue.

When I started to run out of space, I did something stupid. I thought the raid should be able to continue with one drive failure, so I pulled the 6TB drive. It was only out for about 30 seconds before I saw the errors in the web console and LCD. I quickly reinserted the drive.

Now the data volume says "Volume is inactive or dead" but I can still access it. I see nearly everything. Some folders are empty and some return I/O error when I attempt to ls them in ssh.

The volumes page has a new "data-0" volume (14.54 TB) and "data-1" volume (1.82 TB) with the original data volume as inactive.

I know it's a long shot, but is there any linux/btrfs magic where I could rejoin that 6TB drive to the volume?

I'm linux savvy.

I have a log bundle.

StephenB · Answer

nifter&nbsp;wrote:
&nbsp;
I got a new 10 TB drive I was going to use for backup. I had trouble with the RN detecting it via esata or in a usb dock so I plugged it into bay 6 of the nas to see if it would detect. It was in there for about 15 seconds, then I pulled it out.
The array said degraded, but I didn't see how those 15 seconds would have caused any significant changes.
&nbsp;

Bad idea if you were running XRAID.&nbsp; The NAS would have tried to automatically add the 10 TB drive to the array. Once the NAS thinks the drive is part of the array, then the volume will be flagged as degraded if it is missing.
&nbsp;
nifter&nbsp;wrote:
&nbsp;
When I started to run out of space, I did something stupid. I thought the raid should be able to continue with one drive failure, so I pulled the 6TB drive. It was only out for about 30 seconds before I saw the errors in the web console and LCD. I quickly reinserted the drive.
&nbsp;

If the volume was already identified as degraded when you did this, then the NAS likely saw this as a two-drive failure, and therefore marked the volume as dead.
&nbsp;
nifter&nbsp;wrote:
&nbsp;
Now the data volume says "Volume is inactive or dead" but I can still access it. I see nearly everything. Some folders are empty and some return I/O error when I attempt to ls them in ssh.
The volumes page has a new "data-0" volume (14.54 TB) and "data-1" volume (1.82 TB) with the original data volume as inactive.
&nbsp;
I know it's a long shot, but is there any linux/btrfs magic where I could rejoin that 6TB drive to the volume?
I'm linux savvy.
I have a log bundle.
&nbsp;

You could try manually removing the 10 TB drive from the array with mdadm.&nbsp; That should restore the volume to degraded status, which would allow the 6 TB drive to be resynced.
&nbsp;
Sandshark&nbsp;has played with these commands more than I have, so hopefully he will chime in.
&nbsp;

Sandshark · Answer

The NAS has labeled two parts of the former "data" array as data-0 and data-1 because it can't have multiple volumes named "data". Those are also the names given to the MDADM RAID groups of a two-group volume named "data".

I don't know if anything I know of can fix your situation, as the commands normally need to operate on the un-mounted "data" volume. If you had come here before you removed that second drive, I could have helped. But look at my post Reducing-RAID-size-removing-drives-WITHOUT-DATA-LOSS-is-possible and post the results of commands like I use in the beginning to verify the configuration. You'll need at least these:

cat /proc/mdstat

btrfs filesystem show /data

btrfs filesystem show /data-0

btrfs filesystem show /data-1

mdadm --detail /dev/md127

mdadm --detail /dev/md126 (if md126 shows up in the btrfs command responses)

and the same for any other RAID volumes that show up.

From there, I'll see if it looks like something is possible, but I don't have high hopes and whatever you try will not be something I've tried before.

nifter · Answer

Sandshark&nbsp;Thanks for having a look.admin@Wingnut:/$ cat /proc/mdstatPersonalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]md1 : active raid10 sdf2[3] sdd2[2] sdc2[1] sdb2[0]1044480 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]md126 : active raid1 sde4[1](F) sda4[0]1953373888 blocks super 1.2 [2/1] [U_]md127 : active raid5 sdc3[0] sdb3[2] sdd3[1]15608667136 blocks super 1.2 level 5, 64k chunk, algorithm 2 [5/3] [UUU__]md0 : active raid1 sdf1[5] sdc1[0] sdb1[2] sdd1[1]4190208 blocks super 1.2 [5/4] [UUUU_]unused devices: &lt;none&gt;root@Wingnut:/# btrfs filesystem show /dataLabel: '43f65d04:data' uuid: 893ff390-861c-441d-b39c-8e6c707e0e1dTotal devices 2 FS bytes used 8.14TiBdevid 1 size 14.54TiB used 9.80TiB path /dev/md127devid 2 size 1.82TiB used 1.00GiB path /dev/md126root@Wingnut:/# btrfs filesystem show /data-0ERROR: not a valid btrfs filesystem: /data-0root@Wingnut:/# btrfs filesystem show /data-1ERROR: not a valid btrfs filesystem: /data-1root@Wingnut:/# mdadm --detail /dev/md127/dev/md127:Version : 1.2Creation Time : Fri Jul 3 15:58:26 2015Raid Level : raid5Array Size : 15608667136 (14885.58 GiB 15983.28 GB)Used Dev Size : 3902166784 (3721.40 GiB 3995.82 GB)Raid Devices : 5Total Devices : 3Persistence : Superblock is persistentUpdate Time : Wed May 25 20:33:37 2022State : clean, FAILEDActive Devices : 3Working Devices : 3Failed Devices : 0Spare Devices : 0Layout : left-symmetricChunk Size : 64KConsistency Policy : unknownName : 43f65d04:data-0 (local to host 43f65d04)UUID : e8efdcdc:ce0dd1ae:d5321061:c45a5391Events : 44167Number Major Minor RaidDevice State0 8 35 0 active sync /dev/sdc31 8 51 1 active sync /dev/sdd32 8 19 2 active sync /dev/sdb3- 0 0 3 removed- 0 0 4 removedroot@Wingnut:/# mdadm --detail /dev/md126/dev/md126:Version : 1.2Creation Time : Sat May 14 16:01:02 2022Raid Level : raid1Array Size : 1953373888 (1862.88 GiB 2000.25 GB)Used Dev Size : 1953373888 (1862.88 GiB 2000.25 GB)Raid Devices : 2Total Devices : 2Persistence : Superblock is persistentUpdate Time : Wed May 25 13:56:57 2022State : clean, degradedActive Devices : 1Working Devices : 1Failed Devices : 1Spare Devices : 0Consistency Policy : unknownNumber Major Minor RaidDevice State0 8 4 0 active sync- 0 0 1 removed1 8 68 - faulty&nbsp;

Sandshark · Answer

OK, from that I can see you have four drives installed. They are sdb, sdc, sde, and sdf. sdf is the one that's not a part of the array (along with one that's currently not installed) I think you pulled drive 1, which was sda at the time, then you re-inserted it and it became sdf (thast's normal for a removed and replaced/exchanged drive). But to be sure which drive is which, look at the results of get_disk_info.&nbsp; Note that the channels start at 0, not 1.&nbsp; So the device on channel 0 is bay 1.&nbsp; If my assumption above is wrong, then post the results and I'll modify the commands below.&nbsp;You have two MDADM RAID groups. md127 is the main one: RAID5 of what should be 5 drives but has only 3. md126 is the second layer: RAID1 with one of the two drives.&nbsp;So md126 is OK for now: one of two RAID drives is recoverable, and that's enough for RAID1. But since the BTRFS volume concatenates both RAID groups, you can't get anything from it by itself.&nbsp;But md127 is missing two components, so the RAID5 is dead.&nbsp;Your best bet at this point is to contact Netgear support, as I believe they can recover at least most of the volume (there may be errors in some files.&nbsp;If that's not an option and you want to try yourself, recognizing that anything you do could make matters worse instead of better, is this:&nbsp;mdadm&nbsp; --assemble&nbsp; --force&nbsp; /dev/md127&nbsp; /dev/sdb3&nbsp; /dev/sdc3&nbsp; /dev/sde3&nbsp; /dev/sdf3&nbsp;If it doesn't work, cycle power and see if cat /proc/mdstat now shows the contents of md1 (the OS partition) to be md1 : active raid10 sda2[0] sdb2[1] sdc2[2] sdd2[3]. If it does, then try the assemble command again with the updated device names:&nbsp;mdadm&nbsp; --assemble&nbsp; --force&nbsp; /dev/md127&nbsp; /dev/sda3&nbsp; /dev/sdb3 /dev/sce3&nbsp; /dev/sdd3&nbsp;If that doesn't work, and the drives are now in normal order, then try&nbsp;mdadm&nbsp;&nbsp;/dev/md127&nbsp; --re-add&nbsp; /dev/sda3&nbsp;And if it says that's not possible:mdadm&nbsp; /dev/md127&nbsp; --add&nbsp; /dev/sda3&nbsp;If this works, you'll have a degraded, but accessible volume from which you can copy the files (possibly with a few errors).&nbsp; From there, the best thing would be to factory default and start fresh, though there are other steps (rather complicated, like described in the post I referenced above) that might restore the volume to the state before you added the 10TB.

nifter · Answer

Sandshark&nbsp;Thanks for the help. I decided to try ReclaiMe software and it seems to be working. Data is being recovered, just very very slow.

Forum Discussion

RN316 - Rejoin drive to dead volume?

5 Replies

Related Content

ReadyNAS RN316 showing Volume Is read-only

RN316 - Volume inactive

Dead RN316

Moving complete volume from RN316 to RN316

Have to constantly forget and rejoin network

NETGEAR Academy

ProSupport for Business