Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Hello everyone,
I updated my RN104 last week, and I have been plagued by the unit hanging randomly ever since.
I have searched and read a lot of posts here, and after 5 hard reboots and re-syncs, I may have found a solution.
At least, the unit has not locked up for 1 day now, so I'm crossing my fingers.
The culprit, I think, might be snapshot creation.
I had 3 shares protected by snapshots: Documents, CalDAV/CardDAV hosting and my ebook collection.
All those shares had a daily snapshot policy.
After reading the "FAQs on upgrading ReadyNAS firmware to 6.4.0" post linked in here, the last chapter caught my eye:
btrfs-cleaner is commonly invoked after Smart Snapshot Management prunes older snapshots. ReadyNAS commonly prunes older snapshots based on its snapshot schedule.
Now, my NAS is almost 80% full, and the snapshots were taking around 4 GB of space (out of a 4-disk RAID 5 array with 6 TB usable).
As the unit locks up completely, I cannot SSH in to check how much CPU the btrfs-cleaner process takes, of course, and there is nothing in the syslog apart from a bunch of "@" characters at the time the hang happened.
Oct 13 21:48:05 nas kernel: [ 6591.520831] md: delaying resync of md126 until md127 has finished (they share one or more physical units)
Oct 13 21:48:06 nas readydropd[4685]: DEBUG:readydropd.c:897 Shares.conf has been changed
Oct 13 21:48:07 nas readydropd[4685]: DEBUG:readydropd.c:652 Reload Share configs
Oct 13 21:48:08 nas kernel: [ 6593.661751] md: delaying resync of md126 until md127 has finished (they share one or more physical units)
Oct 13 21:48:08 nas kernel: [ 6594.415415] md: delaying resync of md126 until md127 has finished (they share one or more physical units)
Oct 13 21:48:09 nas kernel: [ 6594.927980] md: delaying resync of md126 until md127 has finished (they share one or more physical units)
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@Oct 13 21:53:01 nas kernel : imklog 5.8.11, log source = /proc/kmsg started.
Oct 13 21:53:01 nas rsyslogd: [origin software="rsyslogd" swVersion="5.8.11" x-pid="2837" x-info="http://www.rsyslog.com"] start
Oct 13 21:53:01 nas kernel: Booting Linux on physical CPU 0x0
Oct 13 21:53:01 nas kernel: Initializing cgroup subsys cpuset
Oct 13 21:53:01 nas kernel: Initializing cgroup subsys cpu
Oct 13 21:53:01 nas kernel: Initializing cgroup subsys cpuacct
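For anyone trying to find the same pattern in their own logs, here is a rough sketch that prints the few lines written just before each boot marker, which is where the last activity before a hang would show up. It assumes the "Booting Linux on physical CPU" kernel line seen in the excerpt above, and that the "@" runs are NUL bytes in the file (they are stripped first); the function name and the default of 5 lines are made up for illustration.

```shell
# Print the N lines that precede each boot marker in a syslog file.
# Assumptions: the marker text matches the excerpt above; NUL bytes
# (often displayed as "@") are removed before the file is scanned.
lines_before_boot() {
    log_file=$1
    count=${2:-5}
    tr -d '\000' < "$log_file" | awk -v n="$count" '
        /Booting Linux on physical CPU/ {
            print "--- boot at line " NR " ---"
            # replay the ring buffer, oldest line first
            for (i = NR - n; i < NR; i++) if (i > 0) print buf[i % n]
        }
        { buf[NR % n] = $0 }   # remember the last n lines
    '
}
```

Usage would be something like `lines_before_boot /var/log/syslog 10`; the log path depends on where your unit keeps its syslog.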
After reading the FAQ, I decided to remove the snapshot schedules from my shares and delete all the existing snapshots.
Since then, I have had no more issues.
By the way, NETGEAR, I'd really like to know of a way other than the timeline to clean up old snapshots.
Removing 108 snapshots manually, one after the other, was not fun...
The "emergency" button in Shares/Browse is nowhere to be found in my web GUI today, so that only leaves the timeline, AFAIK...
It might not be relevant, but maybe the combination of low disk space, numerous snapshots and high CPU usage was the issue (the array was resyncing after the last hard reboot, and the load reported by "top" was around 12~15 at that time).
To be really thorough, I have even deactivated the DLNA, AFP and SMB services, leaving only NFS, as I'm using Linux boxes as clients anyway.
DLNA re-scanning the shares seems to be heavy while the disks are resyncing.
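Since the unit cannot be reached over SSH once it hangs, one option is to record the load while it is still alive and read the file after the reboot. A minimal sketch, assuming the standard Linux /proc/loadavg layout; the function name and the log path in the comment are hypothetical.

```shell
# Emit one timestamped load-average line per call.
# Assumption: /proc/loadavg starts with the 1, 5 and 15 minute averages.
log_load() {
    loadavg_file=${1:-/proc/loadavg}
    read -r one five fifteen _rest < "$loadavg_file" || return 1
    printf '%s load=%s,%s,%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$one" "$five" "$fifteen"
}

# Example loop (hypothetical log path), started in the background:
#   while sleep 60; do log_load >> /data/load.log; sync; done &
```

The `sync` in the example loop is there because entries written shortly before a hard lock-up may otherwise never reach the disk; even with it, the last minute or so can be lost.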
Maybe this can help other people I've seen here who have unresponsive units after bootup too.
I'll come back and comment here if the unit locks up again, but I am confident.
Regards.
Thierry.
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Hi tripy,
Welcome to the community!
Thanks for sharing this very informative and detailed post. It will surely help a lot of community members who have experienced the same. By the way, have you tried checking this article on deleting snapshots?
Kind regards,
BrianL
NETGEAR Community Team
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Hello Brian, thanks for the message.
Yes, I did check that article.
In my initial post, when I referred to 'The "emergency" button in the shares/browse is nowhere to be found', I was in fact talking about the recovery mode.
Sorry about that, I did not have the article in front of me when I wrote my post.
As shown in the screenshot, the recovery button is not there.
For the moment, the NAS is still up.
Crossing my fingers it stays that way.
Regards.
Thierry.
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
I also found that deleting all snapshots fixed it for me.
Here's a script I wrote that did it for me:
share=$1
# List the snapshot IDs of the given share, then delete them one at a time
for snapshot in `rn_nml -Q snapshot:/data/$share | grep Snapshot_Name | cut -c30-39`
do
    rn_nml -d snapshot:/data/$share@$snapshot:1
done
Save this into a file and then run it, passing the share name as a parameter. But please use the above script with caution; you might want to try the commands individually first. It basically queries all snapshots for the given share, cuts out the ID and then deletes them one at a time. Saved me LOTS of time.
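In the spirit of "try the commands individually first", here is a sketch of a dry-run variant of the loop above. It relies only on the `rn_nml -Q` and `rn_nml -d` calls exactly as they appear in the script; the function names, the `DRY_RUN` switch and the optional `PAUSE` between deletions are additions for illustration, not part of any NETGEAR tool.

```shell
# Dry-run wrapper around the snapshot cleanup loop above.
# Assumption: rn_nml -Q / -d behave as in the original script, and the
# snapshot ID sits in columns 30-39 of the Snapshot_Name lines.
list_snapshot_ids() {
    rn_nml -Q "snapshot:/data/$1" | grep Snapshot_Name | cut -c30-39
}

delete_snapshots() {
    share=$1
    if [ -z "$share" ]; then
        echo "usage: delete_snapshots SHARE" >&2
        return 1
    fi
    for snapshot in $(list_snapshot_ids "$share"); do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # default: only show what would be deleted
            echo "rn_nml -d snapshot:/data/${share}@${snapshot}:1"
        else
            rn_nml -d "snapshot:/data/${share}@${snapshot}:1"
            sleep "${PAUSE:-0}"   # optional breather between deletions
        fi
    done
}
```

Calling `delete_snapshots Documents` prints the delete commands without running them; `DRY_RUN=0 PAUSE=30 delete_snapshots Documents` would actually delete, waiting 30 seconds between snapshots to keep the load down.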
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Thanks TonyKL,
I'll be keeping this handy.
Had I known where to look, I would have gone that way.
It would have saved me 2 hours of clicky clicky to get rid of the snapshots.
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
The article linked to by BrianL above has been corrected now. It is useful in that it allows you to delete multiple snapshots at once. Note that deleting multiple snapshots at once can be an intensive task and may impact performance on a NAS.
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Hi,
I have to report that it wasn't the solution.
As long as the nas is left alone, it's fine.
As soon as some activity is done on the disk, it freezes.
It happened yesterday when downloading a Linux ISO image through Transmission.
It happened again today, when I was copying a 6 GB file to the NAS.
It froze mid-copy.
No reaction to the power button, LCD stays black, nothing happens...
But it does answer pings.
No way to open an SSH session though.
Is there a way for someone who only has OS X and Linux machines (i.e. no computer running Windows) to downgrade the NAS to 6.2.5?
It was running perfectly fine until I did the update.
I'm pretty annoyed by this whole update mess, I have to admit.
I wish I had read this forum before...
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Hi tripy,
We may have to investigate further the issue that you've been experiencing. Kindly attach the logs so we can have a thorough look at what's going on with your ReadyNAS unit. When copying large or small files over a direct connection between your ReadyNAS and PC, does it freeze or does the copy complete? With regard to your downgrade question, I apologize but it won't be possible to go back to 6.2.4 or 6.2.5.
Looking forward to your response.
Kind regards,
BrianL
NETGEAR Community Team
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
It's completely possible to go back to 6.2.4 but you will have to back up your data and factory reset the unit.
First you download the 6.4.0 beta 1 or 2 (can't remember which one it is)
http://www.readynas.com/download/beta/readynasos/6.4.0/ReadyNASOS-6.4.0-T112-arm.img
Then you download the 6.2.4 download for arm processors:
http://www.downloads.netgear.com/files/GDC/READYNAS-100/ReadyNASOS-6.2.4_arm.zip - unzip this one to get the img file.
Manually install the first 6.4.0 image, then manually install the 6.2.4 image. Then factory reset. Boom, problem fixed.
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Hi Brian,
I am currently doing a backup of the nas content, copying it to usb disks attached to the unit.
Once everything important is backed up, I will retry and upload the logs when it happens again.
As for your question, I think my connection was as direct as possible.
Both the nas and my pc are connected through Gigabit ethernet on the same switch.
As I'm running Linux, I am using NFS to reach the shares. NFS 4 is not enabled in the nas config.
For the logs, am I supposed to send them through a ticket, as explained there: http://kb.netgear.com/app/answers/detail/a_id/25625 or should I just give a public link to the archive?
Thanks.
Thierry.
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Thanks for the info, IfixDevices !
I'll keep that handy, if everything else fails.
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
@tripy wrote:
As for your question, I think my connection was as direct as possible.
Both the nas and my pc are connected through Gigabit ethernet on the same switch.
A Direct Connection means connecting the NAS to the PC w/o a switch. http://kb.netgear.com/app/answers/detail/a_id/21414/~/how-do-i-direct-connect-between-readynas-and-p...
@tripy wrote:
For the logs, am I supposed to send them through a ticket, as explained there: http://kb.netgear.com/app/answers/detail/a_id/25625 or should I just give a public link to the archive?
Don't post a public link. Use this method: http://kb.netgear.com/app/answers/detail/a_id/21543/~/how-do-i-send-all-logs-to-a-readynas-forum-mod...
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Hi Brian,
I finished backing up the content of my NAS, and tried transferring a 10 GB text file from and to the NAS yesterday (Sunday) evening.
I used NFS and SMB for the tests, both with a direct connection and through a switch.
There were no issues, everything went OK.
I copied it several times back and forth between my computer and the NAS.
But this evening I have found my unit frozen when I came back from work.
As far as I know, it wasn't used.
I have sent the mail with 2 logs.
The first (System_log-nas-20151018-201435), is the one after my tests that did not crash the nas.
The second (System_log-nas-20151019-195757.zip) is the one fetched right now, after restarting the nas.
Thanks.
Thierry.
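For anyone who wants to repeat this kind of back-and-forth copy test, a small sketch that copies a file into a directory (for example a mounted NFS or SMB share) and compares checksums afterwards. The function name is made up; it uses only `cp`, `sync` and `cksum`. Note that reading the copy back right away may be served from the client's cache rather than from the NAS, so a mismatch is meaningful but a match is weaker evidence.

```shell
# Copy a file into a directory and verify the copy with cksum.
# Assumption: dest_dir is an already mounted share or a local directory.
copy_and_verify() {
    src=$1
    dest_dir=$2
    dest="$dest_dir/$(basename "$src")"
    cp "$src" "$dest" || return 1
    sync
    # cksum on stdin prints "CRC SIZE" without a file name
    src_sum=$(cksum < "$src")
    dest_sum=$(cksum < "$dest")
    if [ "$src_sum" = "$dest_sum" ]; then
        echo "OK $dest"
    else
        echo "MISMATCH $dest" >&2
        return 1
    fi
}
```

If the NAS freezes mid-copy, as described earlier in the thread, `cp` will simply hang or fail, and the time of the last "OK" line gives a rough idea of when the unit was last responsive.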
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Hi tripy,
Can you confirm whether the lock-up occurs when an external USB storage device is plugged into your ReadyNAS? Also, where did you send or attach the logs?
I look forward to your response.
Kind regards,
BrianL
NETGEAR Community Team
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Hi Brian,
Yes, there was a device plugged into the USB 3 port at the back the whole time, an ext4 1 TB USB 3 drive.
And I've sent the logs to the email address specified in the article StephenB linked to: http://kb.netgear.com/app/answers/detail/a_id/21543/~/how-do-i-send-all-logs-to-a-readynas-forum-mod...
Thanks.
Thierry.
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Hi again Brian,
A kind of live update: I have just "ejected" the USB drive from the web interface, and the unit locked up again.
Don't know if this is relevant to you.
Regards.
Thierry.
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Hi tripy,
I have the logs handy. Let me have a further look and get back to you as soon as possible.
Kind regards,
BrianL
NETGEAR Community Team
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
I have been having the random lock-up under load problem since 6.4.0. I do not use snapshots and have no external USB devices connected to the unit. I have 2 RN104s and only one of them is having this issue. The only difference is that the one that's working was upgraded to 6.4.0 without issues (checksum mismatch) and the one that's freezing needed the installation of the workaround app in order to upgrade to 6.4.0.
Please help.
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Hi Calder,
Welcome to the community!
Could you confirm whether it locks up again if you run a balance on your volume? Kindly send us a copy of your device logs so we can have a further look at your unit.
Kind regards,
BrianL
NETGEAR Community Team
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Thanks BrianL. I spoke too soon, now both RN104s are freezing under load. I have a scheduled task that takes a system image of 2 servers and stores them on the two RN104s, and as soon as the imaging starts they freeze and require pulling the power cable.
One of them I started from scratch and it is rebuilding the RAID, so I won't run a balance on that one. I will run a balance on the other one and see what happens.
Wish I could go back to 6.2.5 but no such luck. I'll send the logs for both units.
Thanks!
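Before starting something heavy such as a balance or an image backup, it can help to check whether an array is still resyncing, since several reports in this thread involve load during a rebuild. A small sketch that pulls the resync/recovery lines out of the standard Linux /proc/mdstat; the function name is hypothetical and the optional argument only exists so it can be pointed at a saved copy of the file.

```shell
# Show resync / recovery / reshape status per md array.
# Assumption: standard Linux md layout, where each array block starts
# with "mdNNN :" and progress appears on an indented line below it.
md_resync_status() {
    awk '
        /^md[0-9]+ :/ { array = $1 }
        /resync|recovery|reshape/ && !/^md/ {
            sub(/^[ \t]+/, "")      # drop the leading indentation
            print array ": " $0
        }
    ' "${1:-/proc/mdstat}"
}
```

No output means none of the arrays reports a resync, recovery or reshape in progress or pending.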
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
BTW, the imaging software I am using is Acronis and it's saving to an SMB share on the two RN104s; the image file is 13 GB in size. No problem with 6.2.5.
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Just discovered another new issue with 6.4.0.
When I restart my Windows servers that are connected to both RN104 running OS 6.4.0 using iSCSI, the shares on the RN104 won't be shared - I have to restart the server service for the shares to reappear.
If there is a way to go back to 6.2.5 even if it means wiping everything, spending 7 days to rebuild the RAID, selling (or just giving away) all my kids or whatever it takes please let me know.
Thanks.
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
Look on the first page, message n° 9.
IfixDevices gave a way, but I haven't tried it yet.
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
There are some changes made when updating to 6.4.0 that cannot be reversed using the method described in that post. In my view it would be better to troubleshoot and resolve the issues you are having on 6.4.0.
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups
What changes specifically? As long as you do a factory reset after downgrading back to 6.2.4 there shouldn't be any problem. Unless it updated the BIOS or something of that nature that can't be changed back, it seems to me this is more a case of "we don't want to admit 6.4.0 got out in the wild before everything was proper" and not wanting people to jump down a software version.
The betas were incredibly buggy even up to the last beta available for 6.4.0 so that's why I was so shocked to see the update on my machines. I should have known from testing the betas that it would have been a bad idea.
I also want to mention that I don't just own legacy devices and do have current model ReadyNAS units at my place of business and at customers' businesses. The reason I don't have an RN316 or RN516 or even RN716 is because I feel the software is far too buggy at this point to spend that kind of money. If it was better I wouldn't be putting it on legacy units hoping that it works and seeing all of the unfortunate people who have purchased the expensive units now having trouble.