
RN10400 Lockup after starting up - load average is very high

mikeng
Aspirant

Here is what happened.

Recently I upgraded to ReadyNAS OS 6.6.1.

I believe it ran okay for approximately a week.

Starting just a couple of days ago, the box stopped responding.  I am not able to access the Admin page, and I cannot access the data through NFS.

I looked at the LCD display and it shows an Out of Memory error.  I understand that it is an error. However, I have seen this kind of error many times since at least OS 6.4. In fact, my ReadyNAS has crashed randomly many times over that period.  Luckily, every time it booted up fine after a power cycle.

 

This time, I power cycled the machine and now it is hitting a boot-up problem.  I have attempted to reboot it 5 times just today.  At first I thought it was not even able to boot.  Later I monitored the boot process and saw that the machine did boot up to the point where the Web Admin interface was available and RAIDar found the device.  Then, after a minute or so, it would die again.

 

I attempted to log in to the device through SSH.  I ran the top command and saw the load average of the device ramping up very quickly.  Once it reached 25.74 (in another attempt, it reached 39.xx), the device locked up and stopped responding.
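
For anyone trying to capture how fast the load climbs in the short window before a lockup, here is a minimal sketch in Python. It assumes a Python interpreter is available on the NAS (not confirmed anywhere in this thread) and that the log path, which is hypothetical, points at storage that survives a power cycle. It reads the standard Linux `/proc/loadavg` file and syncs every sample to disk so the last readings are not lost when the box freezes.

```python
import os
import time


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dict."""
    fields = text.split()
    # The fourth field looks like "7/182": runnable tasks / total tasks.
    runnable, total = fields[3].split("/")
    return {
        "load1": float(fields[0]),
        "load5": float(fields[1]),
        "load15": float(fields[2]),
        "runnable": int(runnable),
        "total_tasks": int(total),
    }


def log_loadavg(log_path, interval=5.0, samples=60):
    """Append timestamped load samples to log_path, syncing after each one."""
    with open(log_path, "a") as log:
        for _ in range(samples):
            with open("/proc/loadavg") as f:
                info = parse_loadavg(f.read())
            log.write("%d %.2f %d/%d\n" % (
                time.time(), info["load1"],
                info["runnable"], info["total_tasks"]))
            log.flush()
            os.fsync(log.fileno())  # push the sample to disk before a possible lockup
            time.sleep(interval)
```

Calling `log_loadavg("/path/to/loadavg.log")` (path is a placeholder) right after logging in over SSH would record one line every five seconds until the device stops responding.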

"dmesg" command shows this (before it died)

[   21.060568] md127: detected capacity change from 0 to 11987456360448
[   21.077293] systemd[1]: Started Apply Kernel Variables.
[   21.095575] systemd[1]: Starting udev Kernel Device Manager...
[   21.149891] systemd[1]: Started Load/Save Random Seed.
[   21.302498] systemd[1]: Started udev Kernel Device Manager.
[   21.549562] systemd[1]: Started Journal Service.
[   21.820687] BTRFS: device label 2fe5133c:data devid 1 transid 115540 /dev/md127
[   21.881708] systemd-journald[1029]: Received request to flush runtime journal from PID 1
[   22.051964] Adding 1047420k swap on /dev/md1.  Priority:-1 extents:1 across:1047420k
[   26.750408] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
[   31.120195] mvneta d0074000.ethernet eth1: Link is Up - 1Gbps/Full - flow control off
[   31.120244] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[   61.139039] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
[   61.660287] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[   61.726340] NFSD: starting 90-second grace period (net c097fc40)
[   65.600218] mvneta d0074000.ethernet eth1: Link is Up - 1Gbps/Full - flow control off
[   65.600266] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[  174.389392] Unable to load target_core_iblock
[  174.430562] Unable to load target_core_pscsi
[  174.447741] Unable to load target_core_user
[  182.781162] nfsd: last server has exited, flushing export cache
[  183.360943] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[  183.371541] NFSD: starting 90-second grace period (net c097fc40)

 

 

I am able to PING the ReadyNAS, but I just cannot reach the Web Admin page. 

 

----

 

So, I'd need some advice from you.  

What should I try next?  I am currently looking at the support page on the USB recovery tool.

The last thing I want to do is reset the system to defaults and reformat the drives.   Even though the data on the NAS may be recoverable from other places, it is a pain to reload so much data...

 

Also, I do want to mention this: since OS 6.4 (I think), the box has been crashing randomly, and most of the time I noticed that it complains about being Out of Memory.  I did try turning off some unnecessary apps, but apparently that didn't help much.  In this particular case, I recall the box had been on 6.6.1 for at least a few days before the problem started, so I guess the upgrade is not the direct cause.

 

I wonder what happens after the system boots up that causes such a high load.
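
One way to narrow this down: on Linux, the load average counts tasks that are runnable as well as tasks stuck in uninterruptible sleep (usually waiting on disk I/O), so a load of 25+ on a small NAS often means many tasks are blocked rather than busy. The sketch below lists processes in state R or D by reading `/proc/<pid>/stat`. It is only an illustration: it assumes Python is available on the NAS, and it inspects each process's main thread only, not every thread.

```python
import os


def parse_stat(stat_text):
    """Return (pid, command name, state) from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so locate the first '(' and the last ')'.
    open_idx = stat_text.index("(")
    close_idx = stat_text.rindex(")")
    pid = int(stat_text[:open_idx])
    comm = stat_text[open_idx + 1:close_idx]
    state = stat_text[close_idx + 1:].split()[0]
    return pid, comm, state


def load_contributors(proc_root="/proc"):
    """List processes that are runnable (R) or in uninterruptible sleep (D)."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # the process exited while we were scanning
        if state in ("R", "D"):
            found.append((pid, comm, state))
    return found
```

Running `load_contributors()` a few times while the load is climbing would show which process names keep appearing in state D.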

 

Thanks.

 

MNg

 

My setup is RN10400 + 4 x 4TB hard drives in X-RAID

 

 

 

Model: RN10400|ReadyNAS 100 Series 4- Bay (Diskless)
Message 1 of 5


All Replies
mikeng
Aspirant

Re: RN10400 Lockup after starting up - load average is very high

Forgot to mention: after it locks up, all LEDs are on, but there is no activity and no flashing.  The LCD display is blank and dark.

If I press the power button, there is no response and the LCD won't light up.   The only thing I can do is unplug it from the power and let it restart.  However, it will repeat the cycle and lock up again.

 

Please excuse me if I don't sound very polite.  I must admit that I am feeling very frustrated right now.  In the past 6 months, I have had to power cycle it almost every month or even every other week (and 5+ times today). It has reached the point where I cannot trust the ReadyNAS any more.   I trust my NV+, but this new device is making me feel insecure.

 

Thx

Message 2 of 5
mikeng
Aspirant

Re: RN10400 Lockup after starting up - load average is very high

I am replying to myself and just trying to record what I have been doing.

----

For the last few days, I have been rebooting it multiple times a day.

 

I tried booting into the USB recovery image.  It didn't say much.  It just booted and then powered off by itself.  Unfortunately, nothing changed at all.  It still crashed after booting up.

In the short window between startup and lockup, across several reboots, I managed to turn off most of the apps (only 1-2).  I was able to turn off Anti-virus, etc.

I did occasionally see a new error message on the LCD display, which says Out of memory, 360 or some number.

 

I attempted to go into Boot menu.

1. Memory test.  It ran for the whole night and still showed "0:00:01". The power LED was flashing, but the unit was not responding in any way.  After a whole day, I powered it off.  This model has only 512 MB?  A memory test shouldn't take that long, and the progress indicator only showed "0:00:01", as if it had not even started the test at all.

2. Disk Test Mode.  The 4 drive LEDs and the Power LED were flashing and that was about it.  The RAIDar app on my MacBook said the NAS was testing the disks.  It stayed like this for about a day and then went silent... saying "Out Of Memory". Great.  I cannot even test the disks and get meaningful results.

3. I also booted it into Read-Only disk mode from the Boot Menu.  Okay. That allows it to boot up and stay up for a while.  I can even connect to the Web Admin interface.  Well, it is read-only, so there is nothing I can do on it.  I can't mount the network share to copy the data out.

 

I guess I must connect a USB drive in order to copy the data out while the disks are read-only (if possible), but I haven't tried that yet.  As I said before, defaulting the whole NAS, reformatting the drives, and restoring the data is the last thing I want to try.     I don't know what kind of environment other users have. Perhaps others have a quick way to copy data.   I certainly don't want to reload multiple TB of data unless there is no choice.

BTW, is there some way that the disks can be mounted and accessible from other "computers" like a PC?  If possible, that could be faster than relying on the ARM CPU in ReadyNAS to copy data.

 

Thx.

Model: RN10400|ReadyNAS 100 Series 4- Bay (Diskless)
Message 3 of 5
StephenB
Guru

Re: RN10400 Lockup after starting up - load average is very high


@mikeng wrote:

 

BTW, is there some way that the disks can be mounted and accessible from other "computers" like a PC?  If possible, that could be faster than relying on the ARM CPU in ReadyNAS to copy data.

 


The disks use the btrfs file system, which is not supported in Windows.  They can be mounted if the PC is running Linux.  There is one RAID recovery software package that can access the data from Windows: ReclaiMe.  It is quite expensive though, and is intended for recovery, not as a general-purpose btrfs package.
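
As a rough illustration of the Linux route, the sketch below only builds and prints the commands one might run as root on a Linux PC with all four disks attached and with `mdadm` and btrfs support installed. The device name `/dev/md127` is taken from the dmesg output earlier in this thread and may well be different on another machine, and the mount point is hypothetical. Mounting read-only (`-o ro`) is a deliberate choice so nothing on the array is modified.

```python
import subprocess


def recovery_commands(md_device="/dev/md127", mount_point="/mnt/readynas"):
    """Build the commands to assemble the RAID array and mount it read-only."""
    return [
        # Let mdadm scan the attached disks and assemble any arrays it finds.
        ["mdadm", "--assemble", "--scan"],
        ["mkdir", "-p", mount_point],
        # md_device is an assumption; check /proc/mdstat for the real name.
        ["mount", "-t", "btrfs", "-o", "ro", md_device, mount_point],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False (needs root)."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


run(recovery_commands())  # dry run: prints the commands without executing them
```

Leaving `dry_run=True` makes it easy to review the exact commands first and adjust the device name before anything touches the disks.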

Message 4 of 5
mikeng
Aspirant

Re: RN10400 Lockup after starting up - load average is very high

Thanks.

 

I use Linux, so that is good news.

 

Anyway.  I came across another discussion thread.

https://community.netgear.com/t5/Using-your-ReadyNAS/RN104-immediately-quot-out-of-memory-390-quot-e...

 

I took the idea: I booted my NAS into Read-only mode and followed the suggestion to remove the old snapshots.  I didn't even know there were 40+ snapshots created so far.  I don't recall setting that up, so perhaps they are just from some automated scheduler.

 

Once I removed the old snapshots, the "top" command showed the load average at approx. 3-4.50. Even after I restarted the NAS normally, the load apparently stays at 5.00 or below.
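
This post does not say which commands were used to find the snapshots. Assuming they are ordinary btrfs snapshot subvolumes, a quick way to see how many have piled up is to save the output of `btrfs subvolume list -s <mount point>` and count the entries. The sketch below parses that text; the sample paths in the usage note are made up for illustration and are not the real ReadyNAS layout.

```python
import re
from collections import Counter

# Each line starts with "ID <n>" and ends with "path <subvolume path>".
LINE_RE = re.compile(r"^ID (\d+) .*? path (.+)$")


def parse_snapshot_list(output):
    """Return (subvolume id, path) pairs from `btrfs subvolume list` output."""
    entries = []
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            entries.append((int(match.group(1)), match.group(2)))
    return entries


def count_by_top_dir(entries):
    """Count snapshots grouped by the first component of their path."""
    return Counter(path.split("/")[0] for _, path in entries)
```

For example, feeding it three lines such as `ID 260 gen 10 top level 5 path home/snap1` (hypothetical) returns three entries, and `count_by_top_dir` then shows how many snapshots sit under each top-level directory, which helps spot a share whose snapshot schedule has run unattended.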

 

So far,  it has been up for longer than before (like 20+ min now).   I will keep it on and see if it is back to normal.  

 

However, as I mentioned earlier, the Out of Memory error does happen quite often with recent ReadyNAS OS versions.  There is definitely something not right.  Perhaps there is a memory leak, or the memory footprint is simply too large for this model.  I hope more bug fixes and optimization can be done soon.

Given the very small amount of RAM on board (and no way to add more), I do worry about the stability of the system and am really skeptical about using it for any important data.  Even with a good backup, it is a hardship to restore the data.  Yes, I understand it is probably just made as a consumer-grade NAS (IMO).  Unfortunately, I do not have confidence in it at this point.

Thanks.  I am crossing my fingers and hope it can stay up for a while.  I only use it as storage for video streaming, but I still want it to be stable.

 

 

Message 5 of 5