
Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

tripy
Aspirant

rn104 updated to 6.4.0 and possible solution to random lock-ups

Hello everyone,

I updated my RN104 last week, and have been plagued by the unit hanging randomly ever since.
I have searched and read a lot of posts here, and after 5 hard reboots and re-syncs, I may have found a solution.
At least, the unit has not locked up for 1 day now, so I'm crossing my fingers.

The culprit, I think, might be snapshot creation.
I had 3 shares protected by snapshots: Documents, CalDAV/CardDAV hosting and my ebook collection.
All those shares had a daily snapshot policy.

After reading the "FAQs on upgrading ReadyNAS firmware to 6.4.0" post linked in here, the last chapter caught my eye:
     btrfs-cleaner is commonly invoked after Smart Snapshot Management prunes older snapshots. ReadyNAS commonly prunes older snapshots based on its snapshot schedule.

Now, my NAS is almost 80% full, and the snapshots were taking around 4 GB of space (out of a 4-disk RAID 5 array with 6 TB usable).
As the unit locks up completely, there is no way to SSH in and check how much CPU the btrfs-cleaner process takes, of course, and there is nothing in the syslog apart from a bunch of "@" at the time the hang happened.

Oct 13 21:48:05 nas kernel: [ 6591.520831] md: delaying resync of md126 until md127 has finished (they share one or more physical units)
Oct 13 21:48:06 nas readydropd[4685]: DEBUG:readydropd.c:897 Shares.conf has been changed
Oct 13 21:48:07 nas readydropd[4685]: DEBUG:readydropd.c:652 Reload Share configs
Oct 13 21:48:08 nas kernel: [ 6593.661751] md: delaying resync of md126 until md127 has finished (they share one or more physical units)
Oct 13 21:48:08 nas kernel: [ 6594.415415] md: delaying resync of md126 until md127 has finished (they share one or more physical units)
Oct 13 21:48:09 nas kernel: [ 6594.927980] md: delaying resync of md126 until md127 has finished (they share one or more physical units)
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@Oct 13 21:53:01 nas kernel
: imklog 5.8.11, log source = /proc/kmsg started.
Oct 13 21:53:01 nas rsyslogd: [origin software="rsyslogd" swVersion="5.8.11" x-pid="2837" x-info="http://www.rsyslog.com"] start
Oct 13 21:53:01 nas kernel: Booting Linux on physical CPU 0x0
Oct 13 21:53:01 nas kernel: Initializing cgroup subsys cpuset
Oct 13 21:53:01 nas kernel: Initializing cgroup subsys cpu
Oct 13 21:53:01 nas kernel: Initializing cgroup subsys cpuacct
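
For anyone who wants to locate such gaps in a longer log, here is a minimal sketch. It assumes the "@" runs are stored in the file as literal "@" characters, as pasted above; if they are really NUL bytes, grep would need `-a` and a different pattern. The function name `find_hang_markers` and the threshold of 20 characters are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the line numbers of syslog lines that contain
# a long run of "@" (assumed here to mark an unclean stop).
find_hang_markers() {
    logfile=$1
    # '@\{20,\}' is a basic regex: 20 or more consecutive "@" characters.
    # grep -n prefixes each match with "<line number>:", cut keeps the number.
    grep -n '@\{20,\}' "$logfile" | cut -d: -f1
}

# Example: find_hang_markers /var/log/syslog
```

The last timestamped line before each reported line number gives a rough idea of when the unit stopped logging.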


Reading the FAQ, I thought I would try removing the snapshot schedules from my shares and deleting all the existing snapshots.
Since then, I have had no more issues.
By the way, Netgear, I'd really like to know of a way to clean up old snapshots other than the timeline.

Removing 108 snapshots manually, one after the other, was not fun...

The "emergency" button in shares/browse is nowhere to be found in my web GUI today, so that only leaves the timeline, afaik...


It might not be relevant, but maybe the combination of low disk space, numerous snapshots and high CPU usage was the issue (the array was resyncing after the last hard reboot, and the load reported by "top" was around 12~15 at that time).
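
Since the unit cannot be reached once it hangs, one option is to record load and disk usage to a file beforehand so the last values survive the lock-up. This is only a sketch: the function name `log_state`, the log path and the `/data` mount point are assumptions, and it relies on the Linux-only `/proc/loadavg`.

```shell
#!/bin/sh
# Print one line with a timestamp, the 1-minute load average and the
# percentage of space used on the given mount point.
log_state() {
    mountpoint=$1
    # The first field of /proc/loadavg is the 1-minute load average (Linux only).
    load=$(cut -d' ' -f1 /proc/loadavg)
    # df -P prints one POSIX-format line per filesystem; column 5 is the capacity used.
    used=$(df -P "$mountpoint" | awk 'NR==2 {print $5}')
    printf '%s load=%s used=%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$load" "$used"
}

# Example (hypothetical log path and mount point):
# while true; do log_state /data >> /data/state.log; sync; sleep 60; done
```

The `sync` after each line is there so the entry is flushed to disk before the next hang rather than lost in a write cache.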

To be really thorough, I have even deactivated the DLNA, AFP and SMB services, leaving only NFS, as I'm using Linux boxes as clients anyway.
DLNA re-scanning the shares seems to be heavy while the disks are resyncing.

Maybe this can help other people I've seen here who also have unresponsive units after boot.

I'll come back and comment here if the unit locks up again, but I am confident.

Regards.
Thierry.

Message 1 of 93

Accepted Solutions
michaelarnauts
Aspirant

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

6.4.1 is released. Who is taking the risk? 🙂

 

-> http://kb.netgear.com/app/answers/detail/a_id/30110


Message 82 of 93

All Replies
BrianL2
NETGEAR Employee Retired

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

Hi tripy,

 

Welcome to the community!

 

Thanks for sharing this very informative and detailed post. It will surely help a lot of community members who have experienced the same. By the way, have you tried checking this article on deleting snapshots?

 


Kind regards,

 

BrianL
NETGEAR Community Team

Message 2 of 93
tripy
Aspirant

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

Hello Brian, thanks for the message.

 

Yes, I did check that article.

In my initial post, when I referred to "The "emergency" button in the shares/browse is nowhere to be found", I was in fact talking about the recovery mode.

Sorry about that, I didn't have the article in front of me when I wrote my post.

readynas.PNG

As shown in the screenshot, the recovery button (icon_recovery mode.PNG) is not there.

 

For the moment, the NAS is still up.

Crossing my fingers it stays that way.

 

Regards.

Thierry.

Message 3 of 93
TonyKL
Guide

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

I also found that deleting all snapshots fixed it for me.

Here's a script I wrote that did it for me:

 

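#!/bin/sh
# Usage: pass the share name as the only parameter.
share=$1

# Query all snapshots of the share, cut the ID out of each Snapshot_Name
# line, and delete the snapshots one at a time.
for snapshot in $(rn_nml -Q "snapshot:/data/$share" | grep Snapshot_Name | cut -c30-39)
do
  rn_nml -d "snapshot:/data/$share@$snapshot:1"
done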

Save this into a file, then run it with the share name as its parameter.  But please use the above script with caution; you might want to try the commands individually first.  It basically queries for all snapshots of the given share, cuts out the ID and then deletes them one at a time.  Saved me LOTS of time.
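
One cautious way to try the commands individually first is a dry run that only prints the delete commands. The sketch below assumes the same `rn_nml -d snapshot:/data/<share>@<id>:1` form used in the script above; the function name `print_delete_cmds` and the share name `Documents` are illustrative only.

```shell
#!/bin/sh
# Read snapshot IDs from standard input, one per line, and print the
# rn_nml delete command for each one instead of running it.
print_delete_cmds() {
    share=$1
    while IFS= read -r snapshot; do
        # Skip empty lines so a stray blank line does not yield a bogus command.
        [ -n "$snapshot" ] || continue
        printf 'rn_nml -d snapshot:/data/%s@%s:1\n' "$share" "$snapshot"
    done
}

# Example, reusing the query pipeline from the script above:
# rn_nml -Q snapshot:/data/Documents | grep Snapshot_Name | cut -c30-39 | print_delete_cmds Documents
```

Once the printed commands look right, they can be pasted one at a time, or the original script can be run as is.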

Message 4 of 93
tripy
Aspirant

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

Thanks TonyKL,

 

I'll be keeping this handy.

Had I known where to look, I would have gone that way.

It would have saved me 2 hours of clicky clicky to get rid of the snapshots.

Message 5 of 93
BrendanM
NETGEAR Expert

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

The article linked to by BrianL above has been corrected now. It is useful in that it allows you to delete multiple snapshots at once. Note that deleting multiple snapshots at once can be an intensive task and may impact performance on a NAS.

Message 6 of 93
tripy
Aspirant

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

Hi,

 

I have to report that it wasn't the solution.

As long as the nas is left alone, it's fine.

 

As soon as some activity is done on the disk, it freezes.

It happened yesterday while downloading a Linux ISO image through Transmission.

 

It happened again today, when I was copying a 6 GB file to the NAS.

It froze mid-copy.

No reaction to the power button, lcd stays black, nothing happens...

But it does answer pings.

No way to open an SSH session, though.
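
The "answers pings but no SSH" state can be recorded from a Linux client with a small probe, which helps pin down when the freeze started. This is a rough sketch: the function names are made up, `ping -W 2` is the Linux iputils timeout syntax, and a TCP connect to port 22 is only a proxy, because a frozen sshd may still complete the handshake.

```shell
#!/bin/sh
# Map two probe results (0 = success, anything else = failure) to a state label.
classify_state() {
    ping_rc=$1
    ssh_rc=$2
    if [ "$ping_rc" -eq 0 ] && [ "$ssh_rc" -eq 0 ]; then
        echo "up"
    elif [ "$ping_rc" -eq 0 ]; then
        echo "hung"      # answers pings, but the SSH port does not accept connections
    else
        echo "down"
    fi
}

# Probe a host once and print a timestamped state line.
probe_once() {
    host=$1
    ping -c 1 -W 2 "$host" >/dev/null 2>&1; ping_rc=$?
    nc -z -w 2 "$host" 22 >/dev/null 2>&1; ssh_rc=$?
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(classify_state "$ping_rc" "$ssh_rc")"
}

# Example (hypothetical host name): while true; do probe_once nas; sleep 30; done
```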

 

Is there a way for someone having only osx and linux machines (ie: no computer running windows) to downgrade the nas to 6.2.5?

It was running perfectly fine until I did the update.

 

I'm pretty annoyed by this whole update mess, I have to admit.

I wish I had read this forum before...

Message 7 of 93
BrianL2
NETGEAR Employee Retired

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

Hi tripy,

 

We may have to investigate this issue further. Kindly attach the logs so we can have a thorough look at what's going on with your ReadyNAS unit. When copying large or small files over a direct connection between your ReadyNAS and PC, does the unit freeze or does the copy complete? With regard to your downgrade question, I apologize, but it won't be possible to go back to 6.2.4 or 6.2.5.

 

Looking forward to your response.

 

 

Kind regards,

 

BrianL
NETGEAR Community Team

Message 8 of 93
ifixidevices
Luminary

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

It's completely possible to go back to 6.2.4 but you will have to back up your data and factory reset the unit.

 

First you download the 6.4.0 beta 1 or 2 (can't remember which one it is)

 

http://www.readynas.com/download/beta/readynasos/6.4.0/ReadyNASOS-6.4.0-T112-arm.img

 

Then you download the 6.2.4 download for arm processors:

 

http://www.downloads.netgear.com/files/GDC/READYNAS-100/ReadyNASOS-6.2.4_arm.zip - unzip this one to get the img file.

 

Manually install the first 6.4.0 image, then manually install the 6.2.4 image. Then factory reset. Boom, problem fixed.
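
For those without a Windows machine, the two downloads above can be fetched from a Linux or OS X shell. This is a sketch only: the URLs are copied from this post and may no longer resolve, the function names are hypothetical, and it assumes `curl` and `unzip` are installed. It covers only the download step, not the install itself.

```shell
#!/bin/sh
# URLs exactly as given in the post above; they may no longer resolve.
BETA_URL="http://www.readynas.com/download/beta/readynasos/6.4.0/ReadyNASOS-6.4.0-T112-arm.img"
STABLE_URL="http://www.downloads.netgear.com/files/GDC/READYNAS-100/ReadyNASOS-6.2.4_arm.zip"

# Local file name for a URL: everything after the last slash.
local_name() {
    basename "$1"
}

# Download both files into the current directory and unpack the 6.2.4 archive.
fetch_images() {
    for url in "$BETA_URL" "$STABLE_URL"; do
        # -f: fail on HTTP errors, -L: follow redirects, -o: output file name.
        curl -fL -o "$(local_name "$url")" "$url" || return 1
    done
    unzip "$(local_name "$STABLE_URL")"
}

# Run manually once the backup is done: fetch_images
```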

Message 9 of 93
tripy
Aspirant

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

Hi Brian,

 

I am currently doing a backup of the nas content, copying it to usb disks attached to the unit.

I will retry once everything important is backed up, and upload the logs when the freeze happens again.

 

As for your question, I think my connection was as direct as possible.

Both the nas and my pc are connected through Gigabit ethernet on the same switch.

As I'm running Linux, I am using NFS to reach the shares. NFS 4 is not enabled in the nas config.

 

For the logs, am I supposed to send them through a ticket, as explained there: http://kb.netgear.com/app/answers/detail/a_id/25625 or should I just give a public link to the archive?

 

Thanks.

Thierry.

Message 10 of 93
tripy
Aspirant

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

Thanks for the info, IfixDevices !

I'll keep that handy, if everything else fails.

Message 11 of 93
StephenB
Guru

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups


@tripy wrote:

 

 

As for your question, I think my connection was as direct as possible.

Both the nas and my pc are connected through Gigabit ethernet on the same switch.

 


 

A Direct Connection means connecting the NAS to the PC w/o a switch.  http://kb.netgear.com/app/answers/detail/a_id/21414/~/how-do-i-direct-connect-between-readynas-and-p...


@tripy wrote:

  

For the logs, am I supposed to send them through a ticket, as explained there: http://kb.netgear.com/app/answers/detail/a_id/25625 or should I just give a public link to the archive?

 

 


Don't post a public link.  Use this method: http://kb.netgear.com/app/answers/detail/a_id/21543/~/how-do-i-send-all-logs-to-a-readynas-forum-mod...

Message 12 of 93
tripy
Aspirant

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

Hi Brian,


I finished backing up the content of my NAS, and tried transferring a 10 GB text file to and from the NAS yesterday (Sunday) evening.
I used NFS and SMB for the tests, both with a direct connection and through a switch.
There were no issues, everything went OK.
I copied it back and forth several times between my computer and the NAS.
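
A back-and-forth test like this can be scripted so that each round trip is also checked for corruption. The sketch below is an assumption about how one might do it, not the procedure actually used here; `roundtrip_copy` and the temporary file name are made up, and the destination is expected to be a mounted share.

```shell
#!/bin/sh
# Copy a file to a destination directory and back, then compare checksums.
# Prints "ok" if the round trip preserved the content, "mismatch" otherwise.
roundtrip_copy() {
    src=$1
    destdir=$2
    back=$(mktemp) || return 1
    cp "$src" "$destdir/roundtrip.tmp" &&
    cp "$destdir/roundtrip.tmp" "$back" || {
        rm -f "$back" "$destdir/roundtrip.tmp"
        return 1
    }
    # cksum is POSIX; reading from stdin keeps the file name out of its output.
    a=$(cksum < "$src")
    b=$(cksum < "$back")
    rm -f "$back" "$destdir/roundtrip.tmp"
    if [ "$a" = "$b" ]; then echo "ok"; else echo "mismatch"; fi
}

# Example (hypothetical paths): roundtrip_copy ~/bigfile.txt /mnt/nas/Documents
```

Note that the copy back may be served from the client's cache rather than from the NAS, so a passing check mostly exercises the write path.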

But this evening I have found my unit frozen when I came back from work.
As far as I know, it wasn't used.

I have sent the mail with 2 logs.
The first (System_log-nas-20151018-201435), is the one after my tests that did not crash the nas.
The second (System_log-nas-20151019-195757.zip) is the one fetched right now, after restarting the nas.

Thanks.
Thierry.

Message 13 of 93
BrianL2
NETGEAR Employee Retired

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

Hi tripy,

 

Can you confirm whether the lock-up occurs when an external USB storage device is plugged into your ReadyNAS? Also, where did you send or attach the logs?

 

I look forward to your response.

 

 

Kind regards,

 

BrianL
NETGEAR Community Team

Message 14 of 93
tripy
Aspirant

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

Hi Brian,

 

Yes, there was a device plugged in the usb3 port in the back the whole time, an ext4 1Tb usb3 drive.

And I've sent the logs to the email address specified in the article StephenB linked to: http://kb.netgear.com/app/answers/detail/a_id/21543/~/how-do-i-send-all-logs-to-a-readynas-forum-mod...

 

Thanks.

Thierry.

Message 15 of 93
tripy
Aspirant

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

Hi again Brian,

 

A kind of live update: I have just "ejected" the USB drive from the web interface, and the unit locked up again.

Don't know if this is relevant to you.

 

Regards.

Thierry.

Message 16 of 93
BrianL2
NETGEAR Employee Retired

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

Hi tripy,

 

I have the logs handy. Let me have a further look and get back to you as soon as possible.

 

 

Kind regards,

 

BrianL
NETGEAR Community Team

Message 17 of 93
Calder
Guide

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

I have been having the random lock-up under load problem since 6.4.0.  I do not use snapshots and have no external USB devices connected to it.  I have 2 RN104s and only one of them is having this issue.  The only difference is that the one that's working was upgraded to 6.4.0 without issues (checksum mismatch) and the one that's freezing needed the installation of the workaround app in order to upgrade to 6.4.0.

 

Please help.

 

Message 18 of 93
BrianL2
NETGEAR Employee Retired

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

Hi Calder,

 

Welcome to the community!

 

Could you confirm whether it locks up again if you run a balance on your volume? Kindly send us a copy of your device logs so we can have a further look at your unit.

 

 

Kind regards,

 

BrianL
NETGEAR Community Team

 

Message 19 of 93
Calder
Guide

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

Thanks BrianL.  I spoke too soon, now both RN104s are freezing under load.  I have a scheduled task that takes a system image of 2 servers and stores them on the two RN104s, and as soon as the imaging starts they freeze and require pulling the power cable.

 

One of them I started from scratch and is rebuilding the RAID so I won't run balance on that one.  I will run balance on the other one and see what happens. 

 

Wish I could go back to 6.2.5 but no such luck.  I'll send the logs for both units.

 

Thanks!

 

 

Message 20 of 93
Calder
Guide

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

BTW, the imaging software I am using is Acronis and it's saving to an SMB share on the two RN104s - the image file is 13GB in size.  No problem with 6.2.5.

 

 

Message 21 of 93
Calder
Guide

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

Just discovered another new issue with 6.4.0.

 

When I restart my Windows servers that are connected to both RN104 running OS 6.4.0 using iSCSI, the shares on the RN104 won't be shared - I have to restart the server service for the shares to reappear.

 

If there is a way to go back to 6.2.5 even if it means wiping everything, spending 7 days to rebuild the RAID, selling (or just giving away) all my kids or whatever it takes please let me know.

 

Thanks.

 

Message 22 of 93
tripy
Aspirant

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

Look on the first page, message n° 9.

IfixDevices gave a way, but I haven't tried it yet.

 

https://community.netgear.com/t5/Using-your-ReadyNAS/rn104-updated-to-6-4-0-and-possible-solution-to...

 

Message 23 of 93
mdgm-ntgr
NETGEAR Employee Retired

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

There are some changes made when updating to 6.4.0 that cannot be reversed using the method described in that post. In my view it would be better to troubleshoot and resolve the issues you are having on 6.4.0.

Message 24 of 93
ifixidevices
Luminary

Re: rn104 updated to 6.4.0 and possible solution to random lock-ups

What changes specifically? As long as you do a factory reset after downgrading back to 6.2.4 there shouldn't be any problem. Unless it updated the bios or something of that nature that can't be changed back it seems to me it's more of a "we don't want to admit 6.4.0 got out in the wild before everything was proper" and having people jump down a software version.

 

The betas were incredibly buggy even up to the last beta available for 6.4.0 so that's why I was so shocked to see the update on my machines. I should have known from testing the betas that it would have been a bad idea.

 

I also want to mention that I don't just own legacy devices and do have current model readynas units at my place of business and at customer's businesses. The reason I don't have an RN316 or RN516 or even RN716 is because I feel the software is far too buggy at this point to spend that kind of money. If it was better I wouldn't be putting it on legacy units hoping that it works and seeing all of the unfortunate people who have purchased the expensive units now having trouble.

Message 25 of 93