6.4 makes our 516 to lock up during disk balance, need to abort. URGENT!

Question

The 6.4 firmware updated continues to screw up everything, it's becoming the worst update ever from Netgear.

During a disk balance task, our 516 completely locks up. No frontview access, no SSH, no ping responses. Nothing.

After manually restarting the device, the disk balance starts all over again and will lockup the system eventually (sometimes at 40%, others at 62%, it's completely random).

I have no way to cancel this job from frontview. I have ssh access but I need furhter instructions.

guigui_bebert · Answer

Same here, but as it is not a production unit I reduced the load from other services and it eventually finished syncing.

But a few hours later it locked up (because I copied a file from one share to another) and I had to power-shutdown and guess what... the sync began again!!

EKroboter · Answer

God. Looks like someone is gonna get fired for letting the 6.4 firmware out. It's been nothing but headaches.

EKroboter · Answer

After canceling the Balance job, performance and responsiveness is back to normal. I have deleted the scheduled defrag, scrub and balance jobs as precaution.&nbsp;

btaroli · Answer

Ugh... My 516 has been similarly afflicted. This is inexcusable.&nbsp;Happend during the day today, but I'm not sure this was a scheduled balance job, as I receive no alert email indicating that such a job was starting. It's been running 6.4 for a few days already, and I recall watching the initial updates for quota running just fine. And I happen to know it WAS running a balance because I tend to keep top running in a shell at home.&nbsp;top - 09:21:38 up 5 days, 15:38,  2 users,  load average: 2.21, 2.27, 1.82
Tasks: 225 total,   6 running, 219 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us, 11.4 sy,  0.0 ni, 61.3 id, 27.0 wa,  0.0 hi,  0.2 si,  0.0 st
KiB Mem:  16324816 total, 15487560 used,   837256 free,      312 buffers
KiB Swap:  2093052 total,       16 used,  2093036 free, 13655956 cached
packet_write_wait: Connection to 192.168.23.16: Broken pipe
oscar:~ btaroli$ R  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND                                                   
oscar:~ btaroli$ 9  -1 15572 1020  880 R  41.9  0.0   1:55.63 btrfs balance start -dusage 79 -musage 79 /data           
oscar:~ btaroli$ 0   0  226m 9888 7052 D   1.3  0.1   0:03.95 /usr/sbin/afpd -d -F /etc/netatalk/afp.conf               
oscar:~ btaroli$ 0   0 3122m 328m  22m S   1.0  2.1 170:47.62 /apps/dvblink-tv-server/dvblink_server  So it appears to have lost network connection around 9:20 this morning. :( Weird thing is that none of this looks like the box was under any extreme load. The I/O wait time and CPU usage numbers look fine. So this smells like a hang or crash. (sigh)&nbsp;I'm on my second restart now, first was a "soft" down from the front panel but this time I did a full power cycle.&nbsp;The really frustrating thing is that I KNOW how to kill the balance, but I can't get in to the ssh interface to save my life. The box pings after startup but the front panel stays on "Booting...". Control pad lights up twice in the process and disk lights are blinking, so I know it's up to something at least.&nbsp;But with other services trying to fight against the balance to start up, I don't think sshd is getting it's turn. After 10-15 minutes the box stops pinging and I see how appreciable action on the disk activity lights.&nbsp;In lieu of being able to ssh in to shut this crap down, will it actually finish once it gets into this state or is it actually hanging? If I do a FP OS reinstall will this subvert the resumption of the balance at restart? Knowing btrfs from using it on Fedora for a while, I suspet not. But this is lunacy.&nbsp;Do we all need to start opening support tickets for this, or are people on it and adding to the pile isn't going to help?

btaroli · Answer

OK. Managed to get in. Tried just killing the balance, but it hung almost instantly (and stopped pinging). Rebooted again, waiting for email alert about volume usage over 70% -- never thought I'd find that useful -- and then first stopped all installed apps, waited a bit then THEN killed the balance, and after few minutes it finally gave up and stopped. Balance and defrag jobs have been disabled.

I now understand why it died around 9:20, too... The balance job kicked off at 9AM (since I"m never home then on weekdays), but then my Macbook began a Time Machine backup at 9:20. The combination of these two seems to have been what fried it. I can accept /slow/ performance during a BTRFS job, but outright hanging the kernel is a bit much. I'll leave those btrfs jobs disabled for now and hope for a fix to that in the near future.

Forum Discussion

6.4 makes our 516 to lock up during disk balance, need to abort. URGENT!

10 Replies

Related Content

ReadyNAS 316 out of memory during Balance

Volume balance

XS728T - LAG Load balancing

Lost Password During Restore

Packet loss during gaming

NETGEAR Academy

ProSupport for Business