

Whompin105
Aspirant
May 16, 2021
Solved

RN104 fw:6.9.0 fail startup after RAID expansion

Had 3x 4TB drives and was getting close to hitting capacity. Added a 4th 4TB drive and waited a couple of days for the resync. At some point I saw RETRY STARTUP, and I cannot power off without pulling the plug. I've already tried removing the added drive, a read-only reboot, and an OS reinstall from the boot menu, but I still get a failed startup. RAIDar shows the device and 4 drives but says "Management Service is Offline", and I'm unable to get to the admin page. I can download the logs, but I'm unsure how to interpret them. Where should I look for clues? I guess I should add that it would be a real pain to lose the data, which also doesn't seem to be accessible in the current state. TIA

  • StephenB
    May 19, 2021

    Whompin105 wrote:

    Any ideas about what might cause the non-fresh disk being kicked from the array?


    Non-fresh means it's not in sync (meaning some writes never made it to the disk). So the real question is why it's not in sync. Was the NAS forcibly shut down before this (or did it suffer a power failure)?

     

    It is possible to force the array to assemble anyway, though being out of sync could result in some file system corruption or data loss.

14 Replies

  • It's strongly recommended to update your regular backup before expanding a data volume. No important data should be stored on just one device. When you remove a disk, the volume becomes degraded and your data is at heightened risk until the RAID array is rebuilt.

     

    A bit late for this now, but checking e.g. smart_history.log for clues as to whether any of the installed disks are failing or have failed is a good idea when choosing which disk to replace first. Disks can and do fail at any time, however.

    You'd want to check mdstat.log to see if the data volume RAID layers md126 and md127 have been started.

     

    You'd also want to check btrfs.log to see if the data volume is recognised there and also check if it's mounted e.g. in volume.log.

    If the data volume is mounted, that's a good sign. If it's not, the management service probably failed to start due to a problem with the RAID or data volume, which would suggest a data recovery situation.
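
For anyone unsure how to read mdstat.log, here is a minimal Python sketch of the check described above. It assumes the log contains text in the standard Linux /proc/mdstat layout; the function name `parse_mdstat` and the sample text are made up for illustration and are not taken from this NAS.

```python
import re

# Matches lines such as "md127 : active raid5 sda3[0] sdb3[1]"
ARRAY_LINE = re.compile(r"^(md\d+)\s*:\s*(active|inactive)\s*(.*)$")
# Matches the member summary such as "[4/3] [_UUU]" on the following line
STATUS = re.compile(r"\[(\d+)/(\d+)\]\s*\[([U_]+)\]")


def parse_mdstat(text):
    """Return {array_name: info} for every md array found in mdstat-style text."""
    arrays = {}
    current = None
    for line in text.splitlines():
        m = ARRAY_LINE.match(line.strip())
        if m:
            name, state, rest = m.groups()
            tokens = rest.split()
            # RAID level (raid1, raid5, ...) is absent for inactive arrays
            level = next((t for t in tokens if t.startswith("raid")), None)
            members = [t for t in tokens if "[" in t]
            current = {"state": state, "level": level,
                       "members": members, "slots": None}
            arrays[name] = current
            continue
        s = STATUS.search(line)
        if s and current is not None:
            # "U" means the member is up, "_" means it is missing
            current["slots"] = s.group(3)
    return arrays


# Hypothetical sample text, for illustration only
SAMPLE = """\
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md127 : active raid5 sdb3[1] sdc3[2] sdd3[3]
      11706506496 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/3] [_UUU]

md126 : inactive sda4[0](S)
      1953373184 blocks super 1.2

unused devices: <none>
"""

if __name__ == "__main__":
    for name, info in parse_mdstat(SAMPLE).items():
        print(name, info["state"], info["level"], info["slots"], info["members"])
```

An array reported as `inactive`, or one whose slot string contains `_`, is the one to look at more closely before trying anything else.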

    • Whompin105
      Aspirant

      I really appreciate the reply.  Thank you.

      Ok, so I looked at smart_history and can see that one of the drives had 8 pending sectors, 8 uncorrectable errors and 9 ATA errors, which were all logged after the 4th drive was added (horizontal expansion) and the boot failed. I also used the boot menu disk check before getting your response, and it indicated errors with disk 1. I removed disk 1 and can get the device to boot and let me into the admin page. RAIDar indicates an inactive RAID5 data-0 volume with 10.9 of 10.9 TB used (the actual data I risk losing is about 7TB) and an inactive "RAID level unknown" data volume with 0 MB of 0 MB used. The admin page says to remove inactive volumes to use disks #2, 3 and 4. I was prompted for a firmware update to 6.10.4, after which the device will now boot with all 4 drives installed, but it still shows an inactive volume. It seems the updated firmware allows it to finish booting and detects the disk errors rather than freezing up, so now the admin page shows "remove inactive volumes to use disk #1,2,3,4." Anything else I can look into from here? I'm not in a position to pursue paid data recovery, but would happily spend a bit of time trying out other options and learning some things in the process.
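
As a side note on the 10.9 TB figure: a quick arithmetic sketch in Python, assuming RAIDar reports binary units (TiB) labelled as TB and that the drives are exactly 4 × 10^12 bytes each. The function name `raid5_usable_tib` is made up for illustration.

```python
def raid5_usable_tib(disks, disk_tb):
    """Usable RAID5 capacity in TiB: one disk's worth of space goes to parity."""
    usable_bytes = (disks - 1) * disk_tb * 10**12
    return usable_bytes / 2**40


if __name__ == "__main__":
    for n in (3, 4):
        print(f"{n} x 4TB RAID5: {raid5_usable_tib(n, 4):.1f} TiB")
```

Under those assumptions, a four-disk RAID5 of 4TB drives works out to about 10.9 TiB and a three-disk one to about 7.3 TiB, so the size shown for data-0 is consistent with the four-disk layout rather than the original three-disk one. That is only a consistency check; it says nothing about how far the reshape actually got.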

      • StephenB
        Guru - Experienced User

        Can you copy/paste mdstat.log into a reply here?
