borgchick
Aspirant
Mar 07, 2018

RN104 says inactive volume

I found my RN104 frozen a few days ago.  Last email I saw from it was that a balance had started. 

When I found it frozen, it was unreachable via ssh and did not even respond to ping.  The power button had no effect, so I had to pull the plug.

Upon restart, it froze again, during boot, at 39%.  I had to pull the plug again.

Next restart, it finishes booting, but the raid array is gone.  It sees all 4 drives, all are reported as healthy.

When I log in to the admin page, it says "Remove inactive volumes to use the disk. Disk #1, 2, 3, 4."

What do I do?  I see some posts suggesting factory defaulting it, but I just don't want to do anything that might make the situation worse.

 

dmesg log at https://dumptext.com/6gt64u7v/raw/

 

Thanks in advance!

 

 

5 Replies

  • mdgm-ntgr
    NETGEAR Employee Retired

    I can see you're on old firmware (don't update the firmware with the NAS in this state).

     

    I can also see this in your dmesg.log

     

    [   17.575851] BTRFS: device label 0e343298:data devid 2 transid 146384 /dev/md126
    [   18.381179] BTRFS (device md126): bad tree block start 11637558493067324903 17973882290176
    [   18.382106] BTRFS (device md126): bad tree block start 11637558493067324903 17973882290176
    [   18.382157] BTRFS: failed to read tree root on md126
    [   18.430349] BTRFS: open_ctree failed

    Have you tried booting into volume read-only mode? 

     

    Is your backup up to date?

     

    A factory default (which wipes all data, settings, everything) is fine to do if you have a good, up-to-date backup that you can restore from afterwards.

     

    If not, you may wish to contact support if booting into volume read-only mode doesn't resolve the problem.

    • borgchick
      Aspirant

      So I tried the read-only mode, with no effect.  Same error in dmesg.

       

      I did a bit more poking around.  Each disk has 4 partitions, and if I use mdadm --examine on each set, there appear to be 4 RAID setups:

       

      /dev/sd*1 is raid 1
      /dev/sd*2 is raid 6
      /dev/sd*3 is raid 5
      /dev/sd*4 is raid 5

      If I look at the raid states:

      root@nas:~# mdadm --examine /dev/sd*1|grep "State"                       
                State : clean                                                  
         Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)  
                State : clean                                                  
         Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)  
                State : clean                                                  
         Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)  
                State : clean                                                  
         Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)  
      root@nas:~# mdadm --examine /dev/sd*2|grep "State"                       
                State : clean                                                  
         Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)  
                State : clean                                                  
         Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)  
                State : clean                                                  
         Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)  
                State : clean                                                  
         Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)  
      root@nas:~# mdadm --examine /dev/sd*3|grep "State"                       
                State : clean                                                  
         Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)  
                State : clean                                                  
         Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)  
                State : clean                                                  
         Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)  
                State : clean                                                  
         Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)  
      root@nas:~# mdadm --examine /dev/sd*4|grep "State"                       
                State : clean                                                  
         Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)  
                State : clean                                                  
         Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)  
                State : clean                                                  
         Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)  
                State : clean                                                  
         Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)  

      So everything is clean.
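The per-partition output above can be condensed with a small filter. This is only a sketch: `summarize_md_states` is a made-up helper name (not part of mdadm), and it assumes the `State :` / `Array State :` line format shown above. Note that a clean result only describes the md RAID layer; it says nothing about the health of the filesystem sitting on top of it.

```shell
# Tally the "State" and "Array State" lines of `mdadm --examine` output on stdin.
# summarize_md_states is a hypothetical helper name, not an mdadm feature.
summarize_md_states() {
    awk -F' : ' '
        /^[ \t]*(Array )?State :/ {
            key = $1; sub(/^[ \t]+/, "", key)      # "State" or "Array State"
            val = $2
            sub(/[ \t]*\(.*$/, "", val)            # drop the "(A == active ...)" legend
            sub(/[ \t]+$/, "", val)                # drop trailing padding
            count[key " = " val]++
        }
        END { for (k in count) print count[k], k }
    ' | LC_ALL=C sort
}
```

Usage would look like `mdadm --examine /dev/sd*3 | summarize_md_states`; any value other than `clean` or `AAAA` showing up in the tally would point at the RAID layer rather than btrfs.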

       

      But if I try to mount the file system, even in degraded mode, it fails:

      root@nas:~# mount -t btrfs -o degraded /dev/md127 /var/recover/
      mount: wrong fs type, bad option, bad superblock on /dev/md127,
             missing codepage or helper program, or other error
             In some cases useful info is found in syslog - try
             dmesg | tail or so.
      root@nas:~# mount -t btrfs /dev/md127 /var/recover/
      mount: wrong fs type, bad option, bad superblock on /dev/md127,
             missing codepage or helper program, or other error
             In some cases useful info is found in syslog - try
             dmesg | tail or so.

      If I look at dmesg, it is the btrfs ctree error:

      [ 3847.552053] BTRFS (device md126): bad tree block start 11637558493067324903 17973882290176
      [ 3847.552446] BTRFS (device md126): bad tree block start 11637558493067324903 17973882290176
      [ 3847.552481] BTRFS: failed to read tree root on md126
      [ 3847.580455] BTRFS: open_ctree failed
      [ 3852.027911] BTRFS (device md126): bad tree block start 11637558493067324903 17973882290176
      [ 3852.028306] BTRFS (device md126): bad tree block start 11637558493067324903 17973882290176
      [ 3852.028342] BTRFS: failed to read tree root on md126
      [ 3852.070223] BTRFS: open_ctree failed

      So, it seems to me that this is some sort of btrfs corruption issue.
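Since the same messages repeat on every mount attempt, it can help to collapse them into distinct lines with counts. A minimal sketch, assuming dmesg-style lines with a leading `[ seconds ]` timestamp as in the excerpt above; `btrfs_messages` is a hypothetical name chosen for illustration:

```shell
# Collapse repeated BTRFS kernel messages read from stdin (e.g. piped from dmesg).
# btrfs_messages is a hypothetical helper name.
btrfs_messages() {
    grep 'BTRFS' \
        | sed 's/^\[[ 0-9.]*\] *//' \
        | awk '{ n[$0]++ } END { for (m in n) print n[m] "x " m }' \
        | LC_ALL=C sort
}
```

Run as `dmesg | btrfs_messages`. If every line names the same device and the same block address, as here, that suggests a single damaged tree root rather than scattered errors.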

       

      btrfs fi show 

      Label: '0e343298:data'  uuid: 92ecd2e0-45da-425b-87a8-0f626e5ef889
              Total devices 2 FS bytes used 5.04TiB
              devid    1 size 2.71TiB used 2.53TiB path /dev/md127
              devid    2 size 2.73TiB used 2.54TiB path /dev/md126

      Now I don't know why there are two md RAID setups.  From other digging (the dry run below), I see that md127 thinks device 2 is missing.  Whatever that is: is it the second drive, or the second RAID set (/dev/md126)?

      root@nas:~# btrfs restore -F -i -D -v /dev/md127 /dev/null
      checksum verify failed on 17973882290176 found 28166D04 wanted 8B8262A2
      checksum verify failed on 17973882290176 found 28166D04 wanted 8B8262A2
      checksum verify failed on 17973882290176 found 46A24E6E wanted 51485848
      checksum verify failed on 17973882290176 found 46A24E6E wanted 51485848
      bytenr mismatch, want=17973882290176, have=3707086136388172086
      Couldn't read tree root
      Could not open root, trying backup super
      warning, device 2 is missing
      checksum verify failed on 17973882290176 found 46A24E6E wanted 51485848
      checksum verify failed on 17973882290176 found 46A24E6E wanted 51485848
      bytenr mismatch, want=17973882290176, have=3707086136388172086
      Couldn't read tree root
      Could not open root, trying backup super
      warning, device 2 is missing
      checksum verify failed on 17973882290176 found 46A24E6E wanted 51485848
      checksum verify failed on 17973882290176 found 46A24E6E wanted 51485848
      bytenr mismatch, want=17973882290176, have=3707086136388172086
      Couldn't read tree root
      Could not open root, trying backup super
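On the "device 2 is missing" warning: going by the `btrfs fi show` output earlier in this post, "device 2" most likely means btrfs devid 2, which is the second md array (/dev/md126), not physical disk 2. A small sketch that lists the devid-to-path mapping and compares it against the "Total devices" figure; `check_btrfs_devices` is a hypothetical helper name, and it assumes the `btrfs fi show` layout shown above:

```shell
# Read `btrfs fi show` output on stdin, list each devid with its path,
# and compare the number of devices listed with the "Total devices" figure.
# check_btrfs_devices is a hypothetical helper name.
check_btrfs_devices() {
    awk '
        /Total devices/ { expected = $3 }
        $1 == "devid"   { seen++; print "devid " $2 " -> " $NF }
        END { print "expected " (expected + 0) ", found " (seen + 0) }
    '
}
```

Usage: `btrfs fi show | check_btrfs_devices`. If "found" were lower than "expected", one of the md arrays would not be visible to btrfs at all.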

      Any suggestions?

      I do have a backup, but I really would like to take this as a learning opportunity.

       

      Thanks!

       

       

      • mdgm-ntgr
        NETGEAR Employee Retired

        As the volume is spread across multiple arrays, a

        # btrfs device scan

        is needed before attempting to mount the data volume. However, with the NAS booted normally, the system will already have run that command.

        If your backup is up to date, I'd follow the suggestion I gave for that scenario in my previous post (a factory default, then a restore from backup).
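For a manual attempt, the scan-then-mount order described above could be sketched as follows. This is an illustration, not a NETGEAR procedure: `try_readonly_mount` and the `RUN` variable are hypothetical names, and the device and mount point are the ones used earlier in the thread. By default the function only prints the commands; setting `RUN=1` executes them.

```shell
# Print (default) or run (RUN=1) a read-only mount attempt for a btrfs volume
# that spans several md arrays. try_readonly_mount and RUN are hypothetical names.
try_readonly_mount() {
    dev=$1
    mnt=$2
    # Helper: execute the command when RUN=1, otherwise just print it.
    run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi; }
    run btrfs device scan                     # register all member devices with the kernel
    run btrfs filesystem show                 # confirm every devid is listed
    run mount -t btrfs -o ro "$dev" "$mnt"    # read-only, so nothing is written
}
```

Usage: `try_readonly_mount /dev/md127 /var/recover`. Given that the tree root on md126 is unreadable, the mount may well fail in the same way even after the scan.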
