RN316 - Rejoin drive to dead volume?
I've been using my RN316 (OS 6.10.7) for many years.
X-RAID with 4TB + 4TB + 4TB + 6TB drives.
I got a new 10 TB drive I was going to use for backup. I had trouble getting the RN to detect it via eSATA or in a USB dock, so I plugged it into bay 6 of the NAS to see if it would be detected. It was in there for about 15 seconds; then I pulled it out.
The array said degraded, but I didn't see how those 15 seconds would have caused any significant changes.
I continued copying data from my RN to other targets, using TeraCopy to verify all files, without any issue.
When I started to run out of space, I did something stupid. I thought the RAID should be able to continue with one drive failed, so I pulled the 6TB drive. It was only out for about 30 seconds before I saw the errors on the web console and LCD. I quickly reinserted the drive.
Now the data volume says "Volume is inactive or dead", but I can still access it. I see nearly everything, though some folders are empty and some return an I/O error when I try to ls them over SSH.
The volumes page has a new "data-0" volume (14.54 TB) and "data-1" volume (1.82 TB) with the original data volume as inactive.
I know it's a long shot, but is there any linux/btrfs magic where I could rejoin that 6TB drive to the volume?
I'm linux savvy.
I have a log bundle.
Re: RN316 - Rejoin drive to dead volume?
@nifter wrote:
I got a new 10 TB drive I was going to use for backup. I had trouble getting the RN to detect it via eSATA or in a USB dock, so I plugged it into bay 6 of the NAS to see if it would be detected. It was in there for about 15 seconds; then I pulled it out.
The array said degraded, but I didn't see how those 15 seconds would have caused any significant changes.
Bad idea if you were running XRAID. The NAS would have tried to automatically add the 10 TB drive to the array, and once the NAS thinks the drive is part of the array, the volume is flagged as degraded whenever that drive is missing.
@nifter wrote:
When I started to run out of space, I did something stupid. I thought the RAID should be able to continue with one drive failed, so I pulled the 6TB drive. It was only out for about 30 seconds before I saw the errors on the web console and LCD. I quickly reinserted the drive.
If the volume was already identified as degraded when you did this, then the NAS likely saw this as a two-drive failure, and therefore marked the volume as dead.
@nifter wrote:
Now the data volume says "Volume is inactive or dead", but I can still access it. I see nearly everything, though some folders are empty and some return an I/O error when I try to ls them over SSH.
The volumes page has a new "data-0" volume (14.54 TB) and "data-1" volume (1.82 TB) with the original data volume as inactive.
I know it's a long shot, but is there any linux/btrfs magic where I could rejoin that 6TB drive to the volume?
I'm linux savvy.
I have a log bundle.
You could try manually removing the 10 TB drive from the array with mdadm. That should restore the volume to degraded status, which would allow the 6 TB drive to be resynced.
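A hedged sketch of that removal, assuming the degraded group is /dev/md127 (per later posts in this thread) and using a placeholder /dev/sdX3 for the 10 TB drive's data partition. The commands are echoed rather than executed, so the names can be verified against /proc/mdstat before anything destructive runs:

```shell
# Dry run: print the mdadm steps instead of executing them.
# /dev/md127 and /dev/sdX3 are assumptions -- confirm the array and
# partition names with cat /proc/mdstat and mdadm --detail first.
MD=/dev/md127
PART=/dev/sdX3
echo "mdadm $MD --fail $PART"       # mark the member faulty
echo "mdadm $MD --remove $PART"     # then drop it from the array
echo "mdadm $MD --remove detached"  # variant if the drive is already unplugged
```

Note that mdadm's --remove also accepts the keywords failed and detached in place of a device name, which is useful here since the 10 TB drive is no longer physically in the chassis.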
@Sandshark has played with these commands more than I have, so hopefully he will chime in.
Re: RN316 - Rejoin drive to dead volume?
The NAS has labeled two parts of the former "data" array as data-0 and data-1 because it can't have multiple volumes named "data". Those are also the names given to the MDADM RAID groups of a two-group volume named "data".
I don't know of anything that can fix your situation, as the recovery commands normally need to operate on the unmounted "data" volume. If you had come here before you removed that second drive, I could have helped. But look at my post Reducing-RAID-size-removing-drives-WITHOUT-DATA-LOSS-is-possible and post the results of the commands I use at the beginning to verify the configuration. You'll need at least these:
cat /proc/mdstat
btrfs filesystem show /data
btrfs filesystem show /data-0
btrfs filesystem show /data-1
mdadm --detail /dev/md127
mdadm --detail /dev/md126 (if md126 shows up in the btrfs command responses)
and the same for any other RAID volumes that show up.
From there, I'll see if it looks like something is possible, but I don't have high hopes and whatever you try will not be something I've tried before.
Re: RN316 - Rejoin drive to dead volume?
@Sandshark Thanks for having a look.
admin@Wingnut:/$ cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid10 sdf2[3] sdd2[2] sdc2[1] sdb2[0]
1044480 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
md126 : active raid1 sde4[1](F) sda4[0]
1953373888 blocks super 1.2 [2/1] [U_]
md127 : active raid5 sdc3[0] sdb3[2] sdd3[1]
15608667136 blocks super 1.2 level 5, 64k chunk, algorithm 2 [5/3] [UUU__]
md0 : active raid1 sdf1[5] sdc1[0] sdb1[2] sdd1[1]
4190208 blocks super 1.2 [5/4] [UUUU_]
unused devices: <none>
root@Wingnut:/# btrfs filesystem show /data
Label: '43f65d04:data' uuid: 893ff390-861c-441d-b39c-8e6c707e0e1d
Total devices 2 FS bytes used 8.14TiB
devid 1 size 14.54TiB used 9.80TiB path /dev/md127
devid 2 size 1.82TiB used 1.00GiB path /dev/md126
root@Wingnut:/# btrfs filesystem show /data-0
ERROR: not a valid btrfs filesystem: /data-0
root@Wingnut:/# btrfs filesystem show /data-1
ERROR: not a valid btrfs filesystem: /data-1
root@Wingnut:/# mdadm --detail /dev/md127
/dev/md127:
Version : 1.2
Creation Time : Fri Jul 3 15:58:26 2015
Raid Level : raid5
Array Size : 15608667136 (14885.58 GiB 15983.28 GB)
Used Dev Size : 3902166784 (3721.40 GiB 3995.82 GB)
Raid Devices : 5
Total Devices : 3
Persistence : Superblock is persistent
Update Time : Wed May 25 20:33:37 2022
State : clean, FAILED
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
Consistency Policy : unknown
Name : 43f65d04:data-0 (local to host 43f65d04)
UUID : e8efdcdc:ce0dd1ae:d5321061:c45a5391
Events : 44167
Number Major Minor RaidDevice State
0 8 35 0 active sync /dev/sdc3
1 8 51 1 active sync /dev/sdd3
2 8 19 2 active sync /dev/sdb3
- 0 0 3 removed
- 0 0 4 removed
root@Wingnut:/# mdadm --detail /dev/md126
/dev/md126:
Version : 1.2
Creation Time : Sat May 14 16:01:02 2022
Raid Level : raid1
Array Size : 1953373888 (1862.88 GiB 2000.25 GB)
Used Dev Size : 1953373888 (1862.88 GiB 2000.25 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Update Time : Wed May 25 13:56:57 2022
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Consistency Policy : unknown
Number Major Minor RaidDevice State
0 8 4 0 active sync
- 0 0 1 removed
1 8 68 - faulty
Re: RN316 - Rejoin drive to dead volume?
OK, from that I can see you have four drives installed: sdb, sdc, sdd, and sdf. sdf is the one that's not part of the data array (along with one that's currently not installed). I think you pulled drive 1, which was sda at the time; when you re-inserted it, it became sdf (that's normal for a removed and replaced/exchanged drive). But to be sure which drive is which, look at the results of get_disk_info. Note that the channels start at 0, not 1, so the device on channel 0 is bay 1. If my assumption above is wrong, post the results and I'll modify the commands below.
You have two MDADM RAID groups. md127 is the main one: RAID5 of what should be 5 drives but has only 3. md126 is the second layer: RAID1 with one of the two drives.
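As a sanity check on that reading, the mdadm --detail /dev/md127 numbers posted above fit a 5-member RAID5 exactly: usable capacity is (n - 1) times the per-member size. This is plain shell arithmetic on figures copied from the thread, not a new probe of the NAS:

```shell
# Figures copied from the mdadm --detail /dev/md127 output above (1K blocks).
used_dev_size=3902166784
raid_devices=5
array_size=15608667136

# RAID5 spends one member's worth of space on parity:
# usable = (n - 1) * per-member size.
expected=$(( (raid_devices - 1) * used_dev_size ))
[ "$expected" -eq "$array_size" ] && echo "consistent 5-member RAID5"
```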
So md126 is OK for now: one of two RAID drives is recoverable, and that's enough for RAID1. But since the BTRFS volume concatenates both RAID groups, you can't get anything from it by itself.
But md127 is missing two components, so the RAID5 is dead.
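The member counts can also be read mechanically from the [slots/active] pairs in /proc/mdstat. As a sketch, this uses standard awk, fed the mdstat lines posted earlier in this thread; on the NAS itself you would pipe in /proc/mdstat instead:

```shell
# Flag md arrays whose active-member count is below the slot count.
# The [x/y] pair on each "blocks" line is [slots/active].
awk '
/blocks/ {
    if (match($0, /\[[0-9]+\/[0-9]+\]/)) {
        split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
        if (n[1] != n[2]) print name, "is missing", n[1] - n[2], "member(s)"
    }
}
/^md/ { name = $1 }
' <<'EOF'
md1 : active raid10 sdf2[3] sdd2[2] sdc2[1] sdb2[0]
      1044480 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
md126 : active raid1 sde4[1](F) sda4[0]
      1953373888 blocks super 1.2 [2/1] [U_]
md127 : active raid5 sdc3[0] sdb3[2] sdd3[1]
      15608667136 blocks super 1.2 level 5, 64k chunk, algorithm 2 [5/3] [UUU__]
EOF
```

Run against the posted mdstat, this reports md126 short one member and md127 short two, matching the analysis above.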
Your best bet at this point is to contact Netgear support, as I believe they can recover at least most of the volume (there may be errors in some files).
If that's not an option and you want to try it yourself, recognizing that anything you do could make matters worse instead of better, start with this:
mdadm --assemble --force /dev/md127 /dev/sdb3 /dev/sdc3 /dev/sde3 /dev/sdf3
If it doesn't work, cycle power and see if cat /proc/mdstat now shows the contents of md1 (the OS partition) to be md1 : active raid10 sda2[0] sdb2[1] sdc2[2] sdd2[3]. If it does, then try the assemble command again with the updated device names:
mdadm --assemble --force /dev/md127 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3
If that doesn't work, and the drives are now in normal order, then try
mdadm /dev/md127 --re-add /dev/sda3
And if it says that's not possible:
mdadm /dev/md127 --add /dev/sda3
If this works, you'll have a degraded, but accessible volume from which you can copy the files (possibly with a few errors). From there, the best thing would be to factory default and start fresh, though there are other steps (rather complicated, like described in the post I referenced above) that might restore the volume to the state before you added the 10TB.
Re: RN316 - Rejoin drive to dead volume?
@Sandshark Thanks for the help. I decided to try the ReclaiMe software, and it seems to be working. Data is being recovered, just very, very slowly.