EKroboter
Oct 19, 2015 - Apprentice
6.4 makes our 516 lock up during disk balance, need to abort. URGENT!
The 6.4 firmware update continues to screw up everything, it's becoming the worst update ever from Netgear. During a disk balance task, our 516 completely locks up. No frontview access, no SSH, no ...
btaroli
Oct 20, 2015 - Prodigy
Ugh... My 516 has been similarly afflicted. This is inexcusable.
Happened during the day today, but I'm not sure this was a scheduled balance job, as I received no alert email indicating that such a job was starting. It's been running 6.4 for a few days already, and I recall watching the initial updates for quota running just fine. And I happen to know it WAS running a balance because I tend to keep top running in a shell at home.
top - 09:21:38 up 5 days, 15:38, 2 users, load average: 2.21, 2.27, 1.82
Tasks: 225 total, 6 running, 219 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.3 us, 11.4 sy, 0.0 ni, 61.3 id, 27.0 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem: 16324816 total, 15487560 used, 837256 free, 312 buffers
KiB Swap: 2093052 total, 16 used, 2093036 free, 13655956 cached
packet_write_wait: Connection to 192.168.23.16: Broken pipe
oscar:~ btaroli$ R NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
oscar:~ btaroli$ 9 -1 15572 1020 880 R 41.9 0.0 1:55.63 btrfs balance start -dusage 79 -musage 79 /data
oscar:~ btaroli$ 0 0 226m 9888 7052 D 1.3 0.1 0:03.95 /usr/sbin/afpd -d -F /etc/netatalk/afp.conf
oscar:~ btaroli$ 0 0 3122m 328m 22m S 1.0 2.1 170:47.62 /apps/dvblink-tv-server/dvblink_server
So it appears to have lost network connection around 9:20 this morning. :( Weird thing is that none of this looks like the box was under any extreme load. The I/O wait time and CPU usage numbers look fine. So this smells like a hang or crash. (sigh)
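For anyone who wants to check the same numbers on their own box, here is a rough Python sketch that pulls the fields out of the `%Cpu(s)` summary line from the top output above. The function name `parse_cpu_line` is made up for illustration, and it assumes the procps-style layout shown in that paste (a value followed by a two-letter field name); other top versions may format this line differently.

```python
import re


def parse_cpu_line(line):
    """Parse a top '%Cpu(s):' summary line into {field_name: percent}.

    Assumes each field looks like '27.0 wa' (number, space, two letters).
    """
    pairs = re.findall(r"([\d.]+)\s+([a-z]{2})", line)
    return {name: float(value) for value, name in pairs}


# The summary line copied from the top output in this thread.
line = "%Cpu(s): 0.3 us, 11.4 sy, 0.0 ni, 61.3 id, 27.0 wa, 0.0 hi, 0.2 si, 0.0 st"
cpu = parse_cpu_line(line)

# Everything that is neither idle nor waiting on I/O counts as "busy" here.
busy = round(100.0 - cpu["id"] - cpu["wa"], 1)
print("iowait:", cpu["wa"], "idle:", cpu["id"], "busy:", busy)
```

This only separates I/O wait from actual CPU work; it does not say anything about whether the kernel is hung.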
I'm on my second restart now, first was a "soft" down from the front panel but this time I did a full power cycle.
The really frustrating thing is that I KNOW how to kill the balance, but I can't get in to the ssh interface to save my life. The box pings after startup but the front panel stays on "Booting...". Control pad lights up twice in the process and disk lights are blinking, so I know it's up to something at least.
But with other services trying to fight against the balance to start up, I don't think sshd is getting its turn. After 10-15 minutes the box stops pinging and I see no appreciable action on the disk activity lights.
In lieu of being able to ssh in to shut this crap down, will it actually finish once it gets into this state or is it actually hanging? If I do a FP OS reinstall will this subvert the resumption of the balance at restart? Knowing btrfs from using it on Fedora for a while, I suspect not. But this is lunacy.
Do we all need to start opening support tickets for this, or are people already on it, so that adding to the pile isn't going to help?
btaroli
Oct 20, 2015 - Prodigy
OK. Managed to get in. Tried just killing the balance, but it hung almost instantly (and stopped pinging). Rebooted again, waited for the email alert about volume usage over 70% (never thought I'd find that useful), then first stopped all installed apps, waited a bit, and THEN killed the balance. After a few minutes it finally gave up and stopped. Balance and defrag jobs have been disabled.
I now understand why it died around 9:20, too... The balance job kicked off at 9AM (since I'm never home then on weekdays), but then my MacBook began a Time Machine backup at 9:20. The combination of these two seems to have been what fried it. I can accept /slow/ performance during a BTRFS job, but outright hanging the kernel is a bit much. I'll leave those btrfs jobs disabled for now and hope for a fix to that in the near future.
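For reference, a minimal sketch of the "check, cancel, check again" sequence for a running balance, written as a Python dry run that only prints the commands. `btrfs balance status` and `btrfs balance cancel` are standard btrfs-progs subcommands; the `/data` mount point is taken from the top output earlier in the thread, and the helper name `balance_cancel_plan` is hypothetical. Stopping the installed apps first is left as a manual step, since how that is done on the NAS is not stated here.

```python
import shlex


def balance_cancel_plan(mount_point="/data"):
    """Return the btrfs commands, in order, to inspect and cancel a balance.

    Nothing is executed; the caller decides whether to run them over SSH.
    """
    return [
        ["btrfs", "balance", "status", mount_point],  # confirm a balance is running
        ["btrfs", "balance", "cancel", mount_point],  # ask it to stop
        ["btrfs", "balance", "status", mount_point],  # verify it has stopped
    ]


if __name__ == "__main__":
    # Dry run: print each command so it can be pasted into an SSH session.
    for cmd in balance_cancel_plan():
        print(shlex.join(cmd))
```

As described above, the cancel may take a few minutes to return, so it is probably worth stopping other heavy I/O before running it.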
- EKroboter, Oct 20, 2015 - Apprentice
Sounds like we both experienced the same issue. I was lucky enough to be able to cancel the balance at the first try.
I also disabled scheduled jobs and snapshot creation. These two mundane tasks for the NAS used to be trivial and lightweight, never ever hanging the unit. The only thing I complained about before 6.4 was that performance dropped a bit during backup jobs (which is understandable), but now I celebrate if the unit does not freeze for eight hours straight.
My advice to everyone is to enable ssh access immediately. I don't know how I could have fixed it without it.