Forum Discussion
Whompin105
May 16, 2021 · Aspirant
RN104 fw:6.9.0 fail startup after RAID expansion
Had 3x 4TB drives and was getting close to hitting capacity. Added a 4th 4TB drive and waited a couple days for resync. At some point I saw RETRY STARTUP, and cannot power off without pulling plug....
- May 19, 2021
Whompin105 wrote:
Any ideas about what might cause the non-fresh disk being kicked from the array?
Non-fresh means the disk is not in sync (some writes never made it to that disk). So the real question is why it fell out of sync. Was the NAS forcibly shut down before this, or did it suffer a power failure?
It is possible to force the array to assemble anyway, though being out of sync could result in some file system corruption or data loss.
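A minimal sketch of what that forced assembly involves (the array and partition names below are assumptions, not taken from this NAS; the real names must come from mdstat.log before running anything):

```shell
# Real recovery commands, shown as comments because the device names are
# assumed -- on ReadyNAS the data-volume members are typically the third
# partition on each disk:
#   mdadm --stop /dev/md127
#   mdadm --assemble --force /dev/md127 /dev/sd[abcd]3
#
# "--force" tells mdadm to accept a member whose event counter lags the
# others; any writes that never reached that disk are simply lost.
# The Events line from `mdadm --examine` shows the counter being compared:
examine_sample='Events : 69127'   # sample line, matching the output in this thread
echo "$examine_sample" | awk -F' : ' '{print $2}'
```

After a forced assembly, a read-only file system check is prudent before mounting anything read-write.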
mdgm
May 17, 2021 · Virtuoso
It's strongly recommended to update your regular backup before expanding a data volume. No important data should be stored on the one device. When you remove a disk the volume becomes degraded and your data is at heightened risk until the RAID array is rebuilt.
A bit late for this now, but checking e.g. smart_history.log for clues as to whether any of the installed disks are failing or have failed is a good way to choose which disk to replace first. Disks can and do fail at any time, however.
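The counters that smart_history.log records can also be read live with smartctl. A hedged sketch (/dev/sda is an assumed device name, and the attribute line below is a sample, not output from this NAS):

```shell
# On the NAS itself you would run, per disk:
#   smartctl -A /dev/sda
# and watch attributes 5 (reallocated), 187, 197 (pending) and 198
# (uncorrectable). Parsing a sample attribute line the way you would
# scan smart_history.log for a failing disk:
line='197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 8'
echo "$line" | awk '{print $2, $NF}'   # attribute name and raw value
```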
You'd want to check mdstat.log to see if the data volume RAID layers md126 and md127 have been started.
You'd also want to check btrfs.log to see if the data volume is recognised there and also check if it's mounted e.g. in volume.log.
If the data volume is mounted that's a good sign. If it's not the management service probably failed to start due to a problem with the RAID or data volume, which would suggest a data recovery situation.
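The mdstat check described above can be sketched against the output posted later in this thread (on a live system you would read /proc/mdstat directly; a captured sample stands in here so the check is reproducible):

```shell
# mdstat.log from this thread shows only the OS array (md0) and swap (md1):
mdstat_sample='md1 : active raid10 sdd2[3] sdc2[2] sdb2[1] sda2[0]
md0 : active raid1 sdd1[7] sdc1[4] sda1[6] sdb1[5]'
# Count lines for the data-volume RAID layers md126/md127; zero means the
# data volume never started, matching the "inactive volume" symptom:
echo "$mdstat_sample" | awk '/^md12[67] /{n++} END{print n+0}'
```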
Whompin105
May 17, 2021 · Aspirant
I really appreciate the reply. Thank you.
OK, so I looked at smart_history and can see one of the drives had 8 pending sectors, 8 uncorrectable errors, and 9 ATA errors, all logged after the 4th drive was added (horizontal expansion) and the boot failed. I also ran the boot-menu disk check before getting your response, and it indicated errors with disk 1. I removed disk 1 and can get the device to boot and let me into the admin page. RAIDar indicates an inactive RAID5 data-0 volume with 10.9 of 10.9 TB used (the actual data I risk losing is about 7 TB) and an inactive "RAID level unknown" data volume with 0 MB of 0 MB used. The admin page says to remove inactive volumes to use disks #2, 3, and 4. I was prompted for a firmware update to 6.10.4, after which the device will now boot with all 4 drives installed, but it still shows an inactive volume. It seems the updated firmware lets it finish booting and detect the disk errors rather than freezing up, so now the admin page shows "remove inactive volumes to use disks #1, 2, 3, 4." Anything else I can look into from here? I'm not in a position to pursue paid data recovery, but I'd happily spend a bit of time trying other options and learning some things in the process.
- StephenB · May 18, 2021 · Guru - Experienced User
can you copy/paste mdstat.log into a reply here?
- Whompin105 · May 18, 2021 · Aspirant
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid10 sdd2[3] sdc2[2] sdb2[1] sda2[0]
1044480 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
md0 : active raid1 sdd1[7] sdc1[4] sda1[6] sdb1[5]
4190208 blocks super 1.2 [4/4] [UUUU]
unused devices: <none>
/dev/md/0:
Version : 1.2
Creation Time : Tue Aug 19 04:37:49 2014
Raid Level : raid1
Array Size : 4190208 (4.00 GiB 4.29 GB)
Used Dev Size : 4190208 (4.00 GiB 4.29 GB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Update Time : Mon May 17 15:46:29 2021
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Consistency Policy : unknown
Name : 0e3603c2:0 (local to host 0e3603c2)
UUID : 5fea52a3:c212a53f:5d24e9bb:6426d7ec
Events : 69127

Number Major Minor RaidDevice State
4 8 33 0 active sync /dev/sdc1
7 8 49 1 active sync /dev/sdd1
5 8 17 2 active sync /dev/sdb1
6 8 1 3 active sync /dev/sda1
- Whompin105 · May 18, 2021 · Aspirant
I have tried to, but whenever I reload the page the reply post disappears. I've tried sending it in a PM
- StephenB · May 18, 2021 · Guru - Experienced User
This tells us that the OS partition has expanded to all four disks. It's not clear why the horizontal expansion failed.
The volume is likely out of sync at this point, but there could be other things wrong.
Have you ever used the Linux command-line interface?
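Checking the sync state from the CLI can be sketched as follows (the partition names are assumptions; the idea is to compare the Events counter across members, since a lagging counter is exactly what makes a member "non-fresh"):

```shell
# On the NAS (device names assumed, verify against mdstat.log first):
#   for d in /dev/sd[abcd]3; do mdadm --examine "$d" | grep Events; done
#
# Comparing two sample counters the way you would eyeball that output:
ev_good=69127    # assumed counter from an in-sync member
ev_stale=69050   # assumed counter from the kicked, non-fresh member
if [ "$ev_stale" -lt "$ev_good" ]; then
  echo "member is $((ev_good - ev_stale)) events behind"
fi
```

A small gap can often be force-assembled with little loss; a large gap means many missed writes and a higher risk of file system damage.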
Whompin105 wrote:
I have tried to, but whenever I reload the page the reply post disappears. I've tried sending it in a PM
There is an automatic spam filter that sometimes kicks in. Periodically the mods review the quarantine queue and release false positives.
I can also release them, so you can PM me if it happens again.