
Issue Replacing Disks | RN316 automatically shut down instead of rebuilding RAID array

Platypus69
Luminary


So is this normal behaviour? Or is this a bug in the software?

 

I have an RN316 with 6 x 4TB HDDs in a RAID 5 configuration.

There is less than 5% free space left.

So I want to replace all of them with 10TB HDDs so that the RN316 automatically resizes the RAID 5  volume from 24TB to 50TB.
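
As a quick check of the capacity figures, here is a minimal sketch assuming plain single-parity RAID 5, where one disk's worth of space holds parity. The function name `raid5_usable_tb` is made up for illustration. On that assumption the current array has 24TB raw but 20TB usable, and the upgraded one has 60TB raw and 50TB usable.

```python
def raid5_usable_tb(disks: int, size_tb: int) -> int:
    """Usable capacity of a RAID 5 set of equal-sized disks, in TB.

    Assumes single parity: one disk's worth of capacity is used for parity.
    """
    return (disks - 1) * size_tb


print(raid5_usable_tb(6, 4))   # 20 (24 TB raw)
print(raid5_usable_tb(6, 10))  # 50 (60 TB raw)
```

The capacity shown in the admin web UI will differ a little from these numbers, since drives are sold in decimal TB and filesystems have their own overhead.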

 

  1. So first I upgraded the firmware to 6.9.4 (Hotfix 1).
  2. Then I powered up the RN316 and while it was running I took out the first 4TB HDD.
  3. The web app obviously said the RAID set was degraded.
  4. I did notice a message pop up in the web app saying that the NAS would be shut down, but I ignored it.
  5. I inserted the new virgin 10TB HDD.
  6. It initialised it and then started to sync/rebuild the RAID set.
  7. An hour or so later I noticed that the RN316 had automatically shut down.
  8. At first I thought it was because of my "power down policy", but I was wrong.
  9. It seems that the shutdown announced in step 4 took precedence over the disk insertion and the RAID rebuild.
  10. So I powered up the RN316.
  11. It rebuilt the RAID set in 18 hours or so.
  12. Everything is fine.

But...

 

Is this a bug that the RN316 automatically shuts down instead of continuing the rebuild of the RAID set? (Even though it did not have a hot spare.)

 

Surely it cannot be "by design"?

 

Anyway, thought I might post this here...

 

Log below:

TurnOff.png

 

 

 

 

Model: RN31600 | ReadyNAS 300 Series 6-Bay
Message 1 of 9
Retired_Member
Not applicable

Re: Issue Replacing Disks | RN316 automatically shut down instead of rebuilding RAID array

Hi @Platypus69, from your description I understand that in step 4 the NAS announced the shutdown you later experienced. So in the end everything worked as designed, according to your post. I would probably have waited until the shutdown/reboot had happened before inserting the new 10TB disk. IMHO, you are lucky that you are not in trouble now. Sometimes Netgear's NAS units are more tolerant of (what I would call unwise) user activities than one would expect.

Kind regards

Message 2 of 9
StephenB
Guru

Re: Issue Replacing Disks | RN316 automatically shut down instead of rebuilding RAID array


@Platypus69 wrote:

 

Is this a bug that the RN316 automatically shuts down instead of continuing the rebuild of the RAID set? 

 

 

 


Please look on system->settings->alerts in the admin web ui, and let us know if the checkbox next to "Shut down the system when a disk fails or no longer responds" is checked.

 

It does sound like there is a bug (the disk insertion/RAID rebuild should have canceled the shutdown). I suggest clearing that particular setting, at least while you are doing the upgrades.

 

 

Note that the expansion time will get much longer as you proceed (since every sector in the volume is either read or written during each resync). Your data is more at risk when the volume is degraded, so I do recommend making sure you have an up-to-date backup. FWIW, it would be faster to do a factory reset, rebuild the NAS, and then restore the data from a local backup, since you would only build the data volume once.

Message 3 of 9
Platypus69
Luminary

Re: Issue Replacing Disks | RN316 automatically shut down instead of rebuilding RAID array

Thanks!

 

You are correct, the "Shut down the system when a disk fails or no longer responds" is checked.

 

Although, similar to your reply, it seems to me like a bug as it should cancel the shutdown if a replacement HDD is inserted within the 30 minute grace period.

 

Not sure really why it should take longer to rebuild/resync the RAID as I add each disk. I would have thought that it would be roughly the same, as each time you have to read the other 5 x 4TB worth of stripes to calculate the missing stripe. And then expand the volume after the final HDD is synced. (There will be no data modifications during this entire process.)

 

I plan to wait 3 days between each disk replacement. And might force a scrub after each one, just to be sure.

Message 4 of 9
Platypus69
Luminary

Re: Issue Replacing Disks | RN316 automatically shut down instead of rebuilding RAID array

Surely the alert is for the use case where you are not present and cannot replace the HDD relatively quickly. In that case you could argue for shutting down the NAS, as another HDD could fail while it keeps running (so a higher risk than shutting the disks down and hoping they come back up OK).

 

In other words I have no idea!!!

 

Cat Happy

Message 5 of 9
Platypus69
Luminary

Re: Issue Replacing Disks | RN316 automatically shut down instead of rebuilding RAID array

Thanks for your reply, although I do not understand it. Cat Surprised

 

The recommendation seems to be to always replace a disk while the system is hot, as opposed to shutting down the NAS, replacing the HDD and booting it up.

 

Why would I wait for a shutdown? It was not a reboot.

 

Can you please elaborate on why I am lucky, or what I did that was unwise, as I seem to have followed best and/or recommended practice.

 

 

Message 6 of 9
StephenB
Guru

Re: Issue Replacing Disks | RN316 automatically shut down instead of rebuilding RAID array


@Platypus69 wrote:

 

Not sure really why it should take longer to rebuild/resync the RAID as I add each disk. I would have thought that it would be roughly the same, as each time you have to read the other 5 x 4TB worth of stripes to calculate the missing stripe. And then expand the volume after the final HDD is synced. (There will be no data modifications during this entire process.)

 


The volume expansion isn't deferred until the end.  It will begin with the second disk insertion.   BTW, you likely will need to reboot after the second disk syncs - and make sure the volume is fully expanded before you proceed to the third disk. 

 

So the first disk replacement requires 24 TB of disk i/o.  The last one will require 60 TB of disk i/o.  The time goes up simply because the volume is getting bigger as you proceed.  The total process will need 264 TB of i/o to complete. 

 

A factory reset would have built the volume once - requiring 60 TB of disk i/o, and another ~24 TB to restore the data.  So it would have taken approximately 30% of the time (assuming you already had an up-to-date backup).  You could have also used RAID-6 - many people prefer the extra protection with very large volumes.
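
The arithmetic behind those totals can be sketched as below. This is only one rough model that happens to reproduce the 24 TB, 60 TB and 264 TB figures: it assumes each resync touches the full in-use capacity of every disk, and that expansion (so full use of the 10TB disks) starts with the second swap. The intermediate per-swap values and the name `resync_io_tb` are assumptions for illustration, not output from the NAS.

```python
OLD_TB, NEW_TB, BAYS = 4, 10, 6
RESTORE_TB = 24  # approximate I/O to restore the data after a factory reset


def resync_io_tb(swapped: int) -> int:
    """Approximate resync I/O (TB) once `swapped` disks have been replaced."""
    if swapped == 1:
        # Expansion has not started yet, so only 4 TB per disk is in use.
        return BAYS * OLD_TB
    # From the second swap on, new disks are used in full, old ones at 4 TB.
    return swapped * NEW_TB + (BAYS - swapped) * OLD_TB


per_swap = [resync_io_tb(k) for k in range(1, BAYS + 1)]
total_swap_io = sum(per_swap)
factory_reset_io = BAYS * NEW_TB + RESTORE_TB
ratio = factory_reset_io / total_swap_io

print(per_swap)          # [24, 36, 42, 48, 54, 60]
print(total_swap_io)     # 264
print(factory_reset_io)  # 84
print(round(ratio, 2))   # 0.32
```

On these assumptions the factory reset route needs about 84 TB of i/o against 264 TB for six one-at-a-time swaps, which is where the "approximately 30%" comes from.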

 


@Platypus69 wrote:

And might force a scrub after each one, just to be sure.


Don't do that - the RAID scrub will just double the i/o (and the time).  Then there would be a BTRFS filesystem scrub on top of that.  

 


@Platypus69 wrote:

Surely the alert is for the use case where you are not present and cannot replace the HDD relatively quickly. In that case you could argue for shutting down the NAS, as another HDD could fail while it keeps running (so a higher risk than shutting the disks down and hoping they come back up OK).

Likely that is the use case.  Of course a disk failure can look like a removal, so I understand why the timer would start then.  However, the insertion should cancel it.  

Message 7 of 9
Retired_Member
Not applicable

Re: Issue Replacing Disks | RN316 automatically shut down instead of rebuilding RAID array

@Platypus69 wrote: "Can you please elaborate on why I am lucky, or what I did that was unwise, as I seem to have followed best and/or recommended practice."

 

In my opinion it was unwise (let's call it "very brave", if you prefer that synonym) not to let a system that was already in degraded status complete an announced shutdown peacefully, and instead to confuse it during its preparation for shutdown by changing the environment (pushing in the new disk).

In my opinion you were lucky, because you made an assumption about your system's behaviour which turned out not to be true, and you did not suffer any consequences.

 

By the way, you might want to enable email notification in the future, which would alert you about this kind of issue (disk failure).

I think you ended up with the best of all possible worlds, fortunately 🙂

Kind regards

Message 8 of 9
Platypus69
Luminary

Re: Issue Replacing Disks | RN316 automatically shut down instead of rebuilding RAID array

Thanks again.

 

I plan to restart after each disk is added and synced. Just to make sure.

 

My idea around doing a scrub was simply to do something during the "burn in period" of 3 days between disk swaps. If it gets too long I will not do it.

 

Message 9 of 9