Forum Discussion
borgchick
Mar 07, 2018 Aspirant
RN104 says inactive volume
I found my RN104 frozen a few days ago. The last email I saw from it said that a balance had started. When I found it frozen, it was unreachable via ssh and did not even respond to ping. Power button had no...
mdgm-ntgr
Mar 07, 2018 NETGEAR Employee Retired
I can see you're on old firmware (don't update the firmware with the NAS in this state).
I can also see this in your dmesg.log
[ 17.575851] BTRFS: device label 0e343298:data devid 2 transid 146384 /dev/md126
[ 18.381179] BTRFS (device md126): bad tree block start 11637558493067324903 17973882290176
[ 18.382106] BTRFS (device md126): bad tree block start 11637558493067324903 17973882290176
[ 18.382157] BTRFS: failed to read tree root on md126
[ 18.430349] BTRFS: open_ctree failed
Have you tried booting into volume read-only mode?
Is your backup up to date?
A factory default (wipes all data, settings, everything) is fine to do if you have a good up to date backup that you can restore from afterwards.
If not, you may wish to contact support if booting into volume read-only mode doesn't resolve the problem.
borgchick
Mar 10, 2018 Aspirant
So I tried the readonly mode, no effect. Same error in dmesg.
I did a bit more poking around. Each disk has 4 partitions, and if I use mdadm --examine on each set, there appear to be 4 RAID setups:
/dev/sd*1 is raid 1
/dev/sd*2 is raid 6
/dev/sd*3 is raid 5
/dev/sd*4 is raid 5
If I look at the raid states:
root@nas:~# mdadm --examine /dev/sd*1|grep "State"
State : clean
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
State : clean
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
State : clean
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
State : clean
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
root@nas:~# mdadm --examine /dev/sd*2|grep "State"
State : clean
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
State : clean
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
State : clean
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
State : clean
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
root@nas:~# mdadm --examine /dev/sd*3|grep "State"
State : clean
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
State : clean
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
State : clean
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
State : clean
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
root@nas:~# mdadm --examine /dev/sd*4|grep "State"
State : clean
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
State : clean
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
State : clean
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
State : clean
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
So everything is clean.
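A note on reading this output: a clean state from mdadm describes only the RAID layer, not the btrfs file system stored on top of it, so the two can disagree. The four repeated checks could be condensed with a small helper like the sketch below. The function name count_unclean_members is hypothetical, and treating both "clean" and "active" as healthy is an assumption of this sketch rather than anything stated in the thread.

```shell
#!/bin/sh
# Hypothetical helper: reads `mdadm --examine` output on stdin and prints
# how many member superblocks report a State other than clean or active.
# "Array State" lines are skipped because their first field is "Array".
count_unclean_members() {
    awk '$1 == "State" && $2 == ":" && $3 != "clean" && $3 != "active" { n++ }
         END { print n + 0 }'
}
```

Piping a check into it, for example mdadm --examine /dev/sd*1 | count_unclean_members, should print 0 when every member reports clean, as in the output shown here.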
But if I try to mount the file system, even in degraded mode, it fails:
root@nas:~# mount -t btrfs -o degraded /dev/md127 /var/recover/
mount: wrong fs type, bad option, bad superblock on /dev/md127,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
root@nas:~# mount -t btrfs /dev/md127 /var/recover/
mount: wrong fs type, bad option, bad superblock on /dev/md127,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
If I look at dmesg, it is the btrfs ctree error:
[ 3847.552053] BTRFS (device md126): bad tree block start 11637558493067324903 17973882290176
[ 3847.552446] BTRFS (device md126): bad tree block start 11637558493067324903 17973882290176
[ 3847.552481] BTRFS: failed to read tree root on md126
[ 3847.580455] BTRFS: open_ctree failed
[ 3852.027911] BTRFS (device md126): bad tree block start 11637558493067324903 17973882290176
[ 3852.028306] BTRFS (device md126): bad tree block start 11637558493067324903 17973882290176
[ 3852.028342] BTRFS: failed to read tree root on md126
[ 3852.070223] BTRFS: open_ctree failed
So, it seems to me that this is some sort of btrfs corruption issue.
btrfs fi show
Label: '0e343298:data' uuid: 92ecd2e0-45da-425b-87a8-0f626e5ef889
Total devices 2 FS bytes used 5.04TiB
devid 1 size 2.71TiB used 2.53TiB path /dev/md127
devid 2 size 2.73TiB used 2.54TiB path /dev/md126
Now I don't know why there are two md RAID setups. From other digging (a dry run), I see that md127 thinks device 2 is missing, whatever that is. Is that the second drive? The second RAID set (/dev/md126)?
root@nas:~# btrfs restore -F -i -D -v /dev/md127 /dev/null
checksum verify failed on 17973882290176 found 28166D04 wanted 8B8262A2
checksum verify failed on 17973882290176 found 28166D04 wanted 8B8262A2
checksum verify failed on 17973882290176 found 46A24E6E wanted 51485848
checksum verify failed on 17973882290176 found 46A24E6E wanted 51485848
bytenr mismatch, want=17973882290176, have=3707086136388172086
Couldn't read tree root
Could not open root, trying backup super
warning, device 2 is missing
checksum verify failed on 17973882290176 found 46A24E6E wanted 51485848
checksum verify failed on 17973882290176 found 46A24E6E wanted 51485848
bytenr mismatch, want=17973882290176, have=3707086136388172086
Couldn't read tree root
Could not open root, trying backup super
warning, device 2 is missing
checksum verify failed on 17973882290176 found 46A24E6E wanted 51485848
checksum verify failed on 17973882290176 found 46A24E6E wanted 51485848
bytenr mismatch, want=17973882290176, have=3707086136388172086
Couldn't read tree root
Could not open root, trying backup super
Any suggestions?
I do have a backup, but I really would like to take this as a learning opportunity.
Thanks!
- mdgm-ntgr
Mar 10, 2018 NETGEAR Employee Retired
As it's on multiple arrays a
# btrfs device scan
is needed before attempting to mount the data volume. However with the system booted normally the system would have already run that command.
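For a manual attempt from a rescue shell, the scan-then-mount order described above might look like the following sketch. The names run, mount_btrfs_ro and DRY_RUN are hypothetical, and the read-only option is an assumption chosen for caution. This only registers the member devices and attempts a mount; it does not repair a damaged tree root.

```shell
#!/bin/sh
# Hypothetical wrapper: with DRY_RUN set, commands are printed instead of run.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Register all btrfs member devices with the kernel, then mount read-only.
# Any one member of a multi-device file system can be named in the mount.
mount_btrfs_ro() {
    dev=$1
    mnt=$2
    run btrfs device scan && run mount -t btrfs -o ro "$dev" "$mnt"
}
```

With DRY_RUN=1 set, calling mount_btrfs_ro /dev/md127 /var/recover only prints the two commands, which is a safe way to check them before running anything against the volume.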
If your backup is up to date I'd follow the suggestion I gave for that scenario in my previous post.
- borgchick
Mar 11, 2018 Aspirant
I did a btrfs device scan --all-devices, it returns nothing, just says:
Scanning for Btrfs filesystems
No additional errors in dmesg from the scan.
I am working on making a backup of my backup, just in case something fails.
Again, I reiterate my desire to learn more about this situation, because there is no guarantee that this won't happen again a week after I restore from backup. Since the drives appear fine, and the RAID arrays appear fine, I'd rather learn how to work with the btrfs tools to perhaps remount the setup, if for nothing more than knowing what I can do in the worst-case scenario.
Can you explain the purpose of the md127 and md126 arrays? Is this how the XRaid feature is implemented?
- StephenB
Mar 12, 2018 Guru - Experienced User
borgchick wrote:
Can you explain the purpose of the md127 and md126 arrays? Is this how the XRaid feature is implemented?
Your volume has two RAID groups, that are both created with mdadm. One is md126, the second is md127. The two RAID groups are joined together into your volume.
That is how XRAID creates arrays with mixed disk sizes. If you ever vertically expanded your RAID array (by upgrading to larger disks), then the system created a new RAID group to use the additional space. Once created, that group is kept until the volume is destroyed (even if all the disks eventually are the same size).
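To see which md arrays make up the volume, the device list printed by btrfs filesystem show can be reduced to just the member paths. The sketch below is a minimal illustration; the function name btrfs_member_paths is hypothetical, and it assumes the "devid ... path /dev/..." line layout seen in the output posted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: reads `btrfs filesystem show` output on stdin and
# prints the path of each member device, one per line.
btrfs_member_paths() {
    awk '$1 == "devid" {
             for (i = 2; i < NF; i++)
                 if ($i == "path") print $(i + 1)
         }'
}
```

Running btrfs filesystem show | btrfs_member_paths on this NAS should list /dev/md127 and /dev/md126. Each path can then be inspected with mdadm --detail, and cat /proc/mdstat shows which disk partitions belong to each RAID group.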