nickjames (Luminary)
Feb 04, 2017
ReadyNAS 516, OS 6.6.1, Inconsistent boot behavior, snapshot restore, corrupt OS?
Hello,
I ran into an interesting scenario yesterday. I was trying to restore a snapshot, specifically 3 or 4 folders within it totaling about 100-200GB of data. The restore failed and locked the NAS up almost instantly. According to the System > Performance stats, the volume's operations per second dropped off and, at the same time, the network throughput dropped, which isn't normal (I've attached some screenshots taken at the time).
The overall temperature, according to the performance charts, was also increasing and never decreased, peaking at 62°C. However, the device was cool to the touch, the fans were not whirring excessively, and it wasn't exhausting hot air either; things seemed normal temperature-wise.
While this was happening, the shares were either inaccessible or so slow that the team could no longer be productive. I tried to restart the device from the WebUI, and also to shut it down from the WebUI, but had no luck. The LCD on the NAS just said, "Rebooting, see you soon". During this time the device was pingable, but it never gracefully shut down. Finally, I had no choice but to ungracefully shut the device down.
Upon boot up, it would say "Booting...", I could see "Checking Root FS" flash quickly across the screen, and it would start loading the OS. Here is where things get strange: sometimes the boot process would hang at random percentages. The only workaround is to ungracefully shut down the device and try again. We would have to do this 1-3 times before the device would actually boot and be accessible through the WebUI. When it's accessible through the WebUI, everything operates as normal *unless* we try to restore a large file from a snapshot (e.g. an Outlook PST file, 12.5GB), at which point the behavior described earlier starts all over again. I can restore small Excel files (200-300kB) without a problem. Most testing was done after hours, while the network was idle.
→ I tried to reinstall the OS, but it hangs at the "Checking root FS" step and just sits there.
I'm confused as to why the volume is never rebuilt, repaired, or marked degraded after the ungraceful shutdowns, as one would expect, which makes me think something is not right. It almost seems like this might be related to OS 6.6.1, but that is only an assumption, obviously. The other thing I've wondered is whether one of the disks has failed without any information being presented to me stating that it has. The volume appears as healthy, but again, after an ungraceful shutdown it never appears as degraded or unhealthy (an ungraceful shutdown never leaves a perfectly healthy volume upon reboot).
We are using Replicate to another 516 that is still on 6.6.0, though, and that device responds to shutdowns/restarts with no problem. It does not use snapshots, however. So the only differences are 6.6.0 vs. 6.6.1 and snapshots (I think, anyway).
→ Does anyone have any insight other than factory resetting the device to default settings, which would wipe the volume?
→ I have a copy of the configuration from the device. Could I load that on the Replicate device and swap the drives? What if the configuration file is corrupt as well?
→ Does any of this point to one specific thing? It sounds like the actual OS is the culprit here. Perhaps the volume is just fine? (But why would it ungracefully reboot and still be in a "Healthy" status?)
Info on my devices-
- ReadyNAS 516
- 4x4TB RAID 5
- OS 6.6.1
- Antivirus is *disabled*
-- The Replicate device is mirrored and used only for replication, but it is running 6.6.0 and virus scan is still enabled.
Many thanks in advance and thanks for reading.
Cheers,
Nick
9 Replies
- Sandshark (Sensei)
I had a RAM stick go bad that caused some random issues, including booting. Try running the memory test. It's a stab in the dark, but easy to test.
- nickjames (Luminary)
I will give that a shot. I'm having a hard time performing tests right now in the customer's environment. The more I think about it, though, the more I think something is failing or has failed. I just hope we catch the culprit sooner rather than later.
Thanks for the help, Sandshark!
The 6.7.0 FW release is expected soon and should solve those issues!