
ReadyNAS crash on Balance task

dojobel
Tutor


Hi guys,

 

A couple of times now my ReadyNAS Pro 6 has completely locked up during a BTRFS Balance task. It was only set up recently, so I've never actually had a balance run to completion because of this issue. I haven't tested from the CLI yet; I just wanted to post here to see if anyone has experienced this or knows which logs I could poke around in to get some answers (I can't seem to find the "usual" logs like /var/log/messages or similar).
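For anyone wanting to try the balance from the CLI, here is a minimal sketch of a staged balance using the `-dusage`/`-musage` filters of `btrfs balance start`, which only touch chunks below a given fill percentage. It is not something tested on this NAS. The mount point `/array0` is an assumption based on the volume name, and the usage steps (5, 10, 25, 50) are arbitrary. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Staged btrfs balance: start with nearly empty chunks and work upward.
# MOUNT_POINT is an assumption; substitute the path where the volume is mounted.
MOUNT_POINT="${MOUNT_POINT:-/array0}"
# DRY_RUN=1 (the default here) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

staged_balance() {
    for usage in 5 10 25 50; do
        cmd="btrfs balance start -dusage=$usage -musage=$usage $MOUNT_POINT"
        if [ "$DRY_RUN" = "1" ]; then
            echo "would run: $cmd"
        else
            echo "running: $cmd"
            # Stop at the first failing step rather than pressing on.
            $cmd || return 1
        fi
    done
}

staged_balance
```

Running the small steps first means that, if the box does hang, the step at which it happened narrows down how much work the balance was doing at the time.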

 

I've done the RN4to6 Upgrade, currently running ReadyNAS OS 6.9.4. The NAS is purely for iSCSI, running only a couple of disks from some really large VMs. The other disks for these VMs are on 15K vSAN disks that aren't affiliated with the NAS but obviously the VMs crash every time the NAS does which probably isn't doing them a lot of good.

 

At first I thought it was memory exhaustion, so I upgraded it to what seems to be the maximum (4GB). Performance improved pretty dramatically, but I'm still stuck with this balance crash issue.

 

A balance was scheduled for late last night, and I got up this morning to find it had crashed. I ran another balance task manually when I got home to see if it would crash again, and it did immediately.

 

The I/O load on this NAS is pretty low; it's largely used for big media and Linux packages, so it doesn't really get a lot of storage hits.

 

The Web UI log says nothing useful that I can see. The sequence is usually the task starting, then the NAS booting back up after I perform a hard reset:

Volume: Balance started for volume array0.

System: ReadyNASOS service or process was restarted.

 

Some other things worth mentioning:

- The volume is encrypted

- Total size is 4.69TB (RAID-10), currently 767.45GB free

- Every time the NAS crashes after a failed balance, it does a resync of the RAID

 

Any advice is appreciated!

Model: ReadyNAS RNDP6000|ReadyNAS Pro 6 Chassis only
Message 1 of 12
StephenB
Guru

Re: ReadyNAS crash on Balance task

There probably isn't much need to balance a volume with only a few iSCSI volumes, but of course it shouldn't fail.

 

I am suspecting a disk issue of some kind.

 

Have you looked at the disk SMART stats?  If not, try downloading logs before you try a balance, and then download again after it fails/restarts.  Look in disk_info.log for the stats.
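A minimal sketch of that before/after comparison, assuming both log bundles have been unpacked into separate directories (the directory names in the usage line are hypothetical). It just prints the lines of disk_info.log that changed between the two downloads.

```shell
#!/bin/sh
# Show only the lines that differ between two saved copies of disk_info.log.
compare_disk_info() {
    before="$1"
    after="$2"
    # diff exits non-zero when the files differ; that is expected here.
    # Lines starting with "<" come from the first file, ">" from the second.
    diff "$before" "$after" | grep '^[<>]' || echo "no differences"
}

# Usage (paths are hypothetical):
#   compare_disk_info logs-before/disk_info.log logs-after/disk_info.log
```

Any counter that moved between the two snapshots shows up as a `<`/`>` pair, which is easier to spot than reading both files side by side.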

 

You might also try doing a scrub, and see if that completes ok.

Message 2 of 12
dojobel
Tutor

Re: ReadyNAS crash on Balance task

Ah, good to know. I only do them once a month as a "good measure" because I know how BTRFS can be when it doesn't get the necessary TLC.

 

Great point about the disks, I'm aware of one that had ATA errors but the number hasn't changed (sitting on 10 currently). Is that a cause for concern? I've now found 2 more with ATA errors though, sitting on a count of 1 and 2. All the other counters are sitting on 0 for all disks:

Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
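
With six disks, checking these by eye gets tedious. A small sketch that prints only the counters with a non-zero value, assuming the input is plain "Name: value" lines like the ones above (the exact layout of disk_info.log may differ, and the "ATA Error Count" label in the sample is an assumption; only the value 10 comes from this post):

```shell
#!/bin/sh
# Print every "Name: value" line whose numeric value is not zero.
# Reads from stdin; lines that do not match the pattern are ignored.
nonzero_counters() {
    awk -F': *' 'NF == 2 && $2 ~ /^[0-9]+[ \t]*$/ && $2 + 0 != 0 { print }'
}

# Sample input: the counters quoted above plus one assumed ATA error line.
nonzero_counters <<'EOF'
ATA Error Count: 10
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
EOF
```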

 

Once the resync completes I'll disconnect the LUNs and try a scrub, will report back here on how it goes.

 

Thanks!

Message 3 of 12
Sandshark
Sensei

Re: ReadyNAS crash on Balance task

I've seen the readynasd task jump to 100% during a scrub on my 516, effectively locking up the UI.  Maybe a balance could do it as well, especially if it was the first (and probably longer).  If you have SSH and it should happen again, try going in that way and use top to see if something's taking all the CPU power.

Message 4 of 12
dojobel
Tutor

Re: ReadyNAS crash on Balance task

I've started the scrub now, I'll report back on the findings from that.

 


@Sandshark wrote:

I've seen the readynasd task jump to 100% during a scrub on my 516, effectively locking up the UI.  Maybe a balance could do it as well, especially if it was the first (and probably longer).  If you have SSH and it should happen again, try going in that way and use top to see if something's taking all the CPU power.

I thought that may be the case and wanted to check top, so I tried a couple of things:

- Pinging the NAS (times out)

- SSH (times out)

- Single quick-press of the power button (usually changes the LCD to display a warning that a second press will power off the NAS, but the display stayed as-is)

 

All of those things indicated to me that the OS had locked up.
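
Since SSH is unreachable once the hang occurs, anything useful has to be captured beforehand. A rough sketch of one way to do that: log a timestamped load and memory snapshot every few seconds from a second SSH session, started before the balance. The log path and the 10-second interval are assumptions, and the `top` output is only included if `top` is present.

```shell
#!/bin/sh
# Append a timestamped load/memory snapshot to a log file.
# LOG_FILE is an assumption; pick a location that survives a reboot.
LOG_FILE="${LOG_FILE:-/root/balance-watch.log}"

snapshot() {
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S')"
        [ -r /proc/loadavg ] && cat /proc/loadavg
        [ -r /proc/meminfo ] && grep -E '^(MemFree|MemAvailable|SwapFree):' /proc/meminfo
        # Top CPU consumers, if top is available (batch mode, one iteration).
        command -v top >/dev/null 2>&1 && top -b -n 1 | head -n 15
    } >> "$1"
    # Flush to disk so the last entries have a chance of surviving a hard reset.
    sync
}

# Usage (run in a second SSH session before starting the balance):
#   while true; do snapshot "$LOG_FILE"; sleep 10; done
```

After the hard reset, the last few entries in the log show what the load and free memory looked like just before the box stopped responding.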

Message 5 of 12
StephenB
Guru

Re: ReadyNAS crash on Balance task


@dojobel wrote:

 

Great point about the disks, I'm aware of one that had ATA errors but the number hasn't changed (sitting on 10 currently). Is that a cause for concern? I've now found 2 more with ATA errors though, sitting on a count of 1 and 2. All the other counters are sitting on


I'd keep an eye on the ATA errors, but I wouldn't be concerned if the counts aren't rising. 

Message 6 of 12
dojobel
Tutor

Re: ReadyNAS crash on Balance task

Alright guys, the scrub has completed and everything is OK. I'll shut down my VMs later tonight as a precaution and try another Balance. 

 

As a side note, I have another NAS (a ReadyNAS Pro 4) that is kernel panicking. I thought that was related to a lack of RAM, but it's still occurring since I upgraded to the max (2GB). I checked the logs after the most recent crash, and it had started a BTRFS Balance task just prior to the crash, the same as this one. The difference is that the Pro 4 (very cleverly) prints part of the kernel panic reason on the LCD. The message is:

_raw_spin_lock_bh+16

 

I have another thread going for that NAS, since at the time I didn't know the two problems were related:

https://community.netgear.com/t5/Using-your-ReadyNAS-in-Business/ReadyNAS-Pro-4-Kernel-Panic/m-p/167...

Message 7 of 12
bedlam1
Prodigy

Re: ReadyNAS crash on Balance task

2GB Memory is not the max for a Pro 4, I have 4GB happily installed

Message 8 of 12
dojobel
Tutor

Re: ReadyNAS crash on Balance task


@bedlam1 wrote:

2GB Memory is not the max for a Pro 4, I have 4GB happily installed


Ah, I was going off what dmidecode was telling me 🙂

 

# dmidecode --type 16
# dmidecode 2.12
SMBIOS 2.6 present.

Handle 0x0009, DMI type 16, 15 bytes
Physical Memory Array
        Location: System Board Or Motherboard
        Use: System Memory
        Error Correction Type: None
        Maximum Capacity: 2 GB
        Error Information Handle: Not Provided
        Number Of Devices: 1

Message 9 of 12
dojobel
Tutor

Re: ReadyNAS crash on Balance task

Not sure if this might be contributing to my problems, but I noticed one particular disk has a much higher await value in iostat than the others.

 

I've attached a screenshot from when the NAS is under load and what the disks look like. This disk (sdd) has 2 ATA errors, but it hasn't been increasing.

 

Will report back on the balance task.
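
For anyone who wants to check the same thing without reading the table by eye, here is a minimal sketch that picks out the device with the highest await from `iostat -x` output. It assumes a sysstat version that prints a combined `await` column; newer versions split it into `r_await` and `w_await`, in which case this prints nothing.

```shell
#!/bin/sh
# Report the device with the highest await from `iostat -x` style output on stdin.
# The column is located by header name because the layout differs between versions.
highest_await() {
    awk '
        NF == 0 { col = 0; next }          # blank line ends a device table
        /^Device/ {
            col = 0
            for (i = 1; i <= NF; i++) if ($i == "await") col = i
            next
        }
        col && NF >= col && $col + 0 > max { max = $col + 0; dev = $1 }
        END { if (dev != "") printf "%s %.2f\n", dev, max }
    '
}

# Usage (assumes the sysstat iostat is installed):
#   iostat -x 5 2 | highest_await
```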

Message 10 of 12
dojobel
Tutor

Re: ReadyNAS crash on Balance task

Sorry for the delay, guys. The Scrub task completed fine, but another scheduled Balance has since kicked off and crashed the NAS again.

Message 11 of 12
dojobel
Tutor

Re: ReadyNAS crash on Balance task

Hi everyone,

 

As with my post about the Pro 4 locking up, I've come back to do the right thing and fill you in on what I've done to get around the problem 🙂

 

It turns out I had a VM running from this NAS that I was completely unaware of, and it is very likely a highly transactional one (for those familiar with it, it's an Archive Team Warrior VM). I found it and shut it down as a test, and all of a sudden the slowness stopped and the problems with the chassis locking up went away.

 

I've now moved on to a 3200 so this poor little NAS will get a break with some easier work now. Thanks to everyone that helped!

Message 12 of 12