Re: ReadyNASOS 6.2.3-T1718 more stable on 2100

mmartinezv
Aspirant

ReadyNASOS 6.2.3-T1718 more stable on 2100

Hi,

About 40 or 50 days ago, we installed ReadyNASOS 6.2.0 on a ReadyNAS 2100 and an RNDP4220. After testing the new OS and updating it to 6.2.2, we began using it to back up our five-node Proxmox cluster, and that is when the problems started.

OS 6.2.2 seems to hang under heavy use of NFS v3 over TCP (on both the 2100 and the RNDP4220). We reduced the number of processes writing concurrently to the NFS shares and it seemed to work better, but the system still hung occasionally. We then downgraded to 6.2.0 and the system seemed to work well, so we went back to the original backup process (all 5 cluster nodes writing to the NFS share concurrently), and it worked fine for two weeks. After that we experienced 3 more hangs.

I installed OS 6.2.3-T1718 on Saturday the 21st and the 2100 has been working fine since then. It's too early to draw conclusions, but we thought it would be good to give some feedback about this beta.

Regards,

Manuel Martínez Valls
Message 1 of 16
tlyczko
Tutor

Re: ReadyNASOS 6.2.3-T1718 more stable on 2100

How many writing processes did you configure to make things work better with the newer 6.2.3 firmware??
I have seen similar NFS issues on a ReadyNAS 516 so I am wondering if such a change would help our backups.
Thank you, Tom
Message 2 of 16
mmartinezv
Aspirant

Re: ReadyNASOS 6.2.3-T1718 more stable on 2100

Hi Tom,

I only installed the 6.2.3-T1718 beta. I didn't change the NFS server's process count because 8 processes is enough for us. We're doing the backups the same way as always: 5 Proxmox servers writing concurrently for four or five hours. The NAS worked well all week until yesterday.

Last night the backup process finished OK. The NFS server is still working, but I have no access to the web manager or SSH. It has worked better with the 6.2.3-T1718 beta, but I hope the next beta will be the one! 😉

I'm doing a hard reset now and then I'll try the new 6.2.3-T1730 (Beta 7) firmware.

Regards,

Manuel Martínez Valls
Message 3 of 16
mdgm-ntgr
NETGEAR Employee Retired

Re: ReadyNASOS 6.2.3-T1718 more stable on 2100

An NFS thread count of 8 is quite high. If you can lower it and still have enough threads, it should help.

You could also consider using async if you have a UPS.
Message 4 of 16
tlyczko
Tutor

Re: ReadyNASOS 6.2.3-T1718 more stable on 2100

Why is 8 considered a high nfs count and what is a suggested nfs count??
What factors determine an appropriate nfs count??

We have backups running from/with a backup appliance VM that runs several backup jobs concurrently and is therefore sending a lot of data to the NAS, particularly a lot of small units of data.

The performance charts show:

1) maximum volume peaks of 469.4 (what unit?? kb?? mb?? what??????? the graph does not say) read and 696.4 (what unit?? kb?? mb?? what??????? the graph does not say) write over the past week's period, usually everything is considerably lower.

2) maximum network peaks of 60M RX and 12.7M TX over the past week's period, everything else is considerably lower

Thank you, Tom
Message 5 of 16
mmartinezv
Aspirant

Re: ReadyNASOS 6.2.3-T1718 more stable on 2100

Hi,

Bad luck with 6.2.3-T1730. Yesterday the NAS hung again while doing the backups.

Here are the logs:

Mar 2 01:22:58 NAS-Bkp kernel: CPU 2
Mar 2 01:22:58 NAS-Bkp kernel: Modules linked in: pvgpio vpd(P)
Mar 2 01:22:58 NAS-Bkp kernel:
Mar 2 01:22:58 NAS-Bkp kernel: Pid: 2496, comm: nfsd Tainted: P 3.0.101.RNx86_64.3 #1 NETGEAR ReadyNAS/ReadyNAS
Mar 2 01:22:58 NAS-Bkp kernel: RIP: 0010:[<ffffffff880ed392>] [<ffffffff880ed392>] __d_rehash+0x42/0x50
Mar 2 01:22:58 NAS-Bkp kernel: RSP: 0018:ffff88000cea3a40 EFLAGS: 00010282
Mar 2 01:22:58 NAS-Bkp kernel: RAX: 000000000000673b RBX: ffff88003bd98598 RCX: 0000000000000011
Mar 2 01:22:58 NAS-Bkp kernel: RDX: ffff88003f1af000 RSI: ffff88003f1e29d8 RDI: ffff8800218283c0
Mar 2 01:22:58 NAS-Bkp kernel: ff800e35 ffff8e34ff800e3a ffff8e66<>ff80288c 00000000ff80288c ff801290<>ff803c24 00000000ff800e38 00000000<>alTae
Mar 2 01:22:58 NAS-Bkp kernel: RBP: ffff88000cea3a40 R08: ffff88000cea39b8 R09: ffffffff882bdc86
Mar 2 01:22:58 NAS-Bkp kernel: R10: 0000000000000000 R11: 0000000000000005 R12: ffff8800218283c0
Mar 2 01:22:58 NAS-Bkp kernel: R13: ffff8800218289c0 R14: ffff8800218283c0 R15: ffff88003d163600
Mar 2 01:22:58 NAS-Bkp kernel: FS: 0000000000000000(0000) GS:ffff88003f100000(0000) knlGS:0000000000000000
Mar 2 01:22:58 NAS-Bkp kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar 2 01:22:58 NAS-Bkp kernel: CR2: 00007f2ba9845000 CR3: 000000000da95000 CR4: 00000000000007e0
Mar 2 01:22:58 NAS-Bkp kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 2 01:22:58 NAS-Bkp kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Mar 2 01:22:58 NAS-Bkp kernel: Process nfsd (pid: 2496, threadinfo ffff88000cea2000, task ffff88003d51b480)
Mar 2 01:22:58 NAS-Bkp kernel: ff80caa0ffff80de ff80caa0ffff80f1
Mar 2 01:22:58 NAS-Bkp kernel: 4 ff80123000000000 ff801290ff80288c
Mar 2 01:22:58 NAS-Bkp kernel: 4 ff8091d800000000 ff80cab000000000
Mar 2 01:22:58 NAS-Bkp kernel: 0Cl rc:<>[ffff80de> drhs+x405
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8e66]dmtraieuiu+x6040<>[ffff8212> tf_okp02/x0<>[ffff8005> _lo+x407
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8eb4]dalcadlou+x409
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8f84]?dlou+x406
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8e6f]_lou_ah0e/xa
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8e99]lou_n_e+x9010<>[ffff8158> eonc_ah014020<>[ffff8196> f_paeir..at906/x0<>[ffff8196> f_paeir..at906/x0<>[ffff815c> xotsdcd_h0e/xd
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8d04]?adpril02/x0<>[ffff8032> eciaesa+xe0f
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff86a1]?peaeces02/x7
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8d0e]?ke_ah_lo+xe0e
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8dea]f_eiy03a050<>[ffff80bf> rusfe+x506
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8e6a]ns3po_eat+xa0e
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8d61]ns_ipth011020<>[ffff888c> v_rcs+xc/x5
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8dde]ns+xe010<>[ffff815b> xotsdcd_h020020<>[ffff8032> tra+x909
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8864]kre_hedhle+x/x0<>[ffff803a> tra_okrf+x4/x4
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8860]?g_hne0b0b<>oe f1 88 64 3e e4 d5 84 5c 88 70 40 88 00 88 71 88 a0 88 60 a3 05 3<f bf 04 b0 80 5f bb 08 f2 8b 10 <>I [ffff80d9> __eah04/x0<>RP<ff80caa0

Mar 2 01:22:58 NAS-Bkp kernel: 4-- n rc 8cf6badb]-
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8e34]__eah04/x0<>[ffff80f1> _aeils_nqe08/x2
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8f85]brslou+x506
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8f44]?dalc02/x0<>[ffff8034> _lo_n_okp04/x0<>[ffff8000> _okp03/x0<>[ffff804f> _okphs+xf010<>[ffff8043> okpoeln0a/x2
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8d84]rcnetpt+xa/xa
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8d20]?_hudt.sa8pr.+x006
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8d20]?_hudt.sa8pr.+x006
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8daa]eprf_eoef+xa020<>[ffff8020> d_ata+x408
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8d8e]?datvt_lb04/x0<>[ffff80a5> rpr_rd+x1010<>[ffff8042> mmccealc0b/x0<>[ffff8197> hvrf+x0/xe
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff86c5]?gop_re05/x0<>[ffff8155> fd_rcgttr06/x0<>[ffff816a> fddsac+x0/x4
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8483]scpoes043080<>[ffff8156> fd0b/x5
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff8dc0]?eprf_eoef+xd/xd
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff86d9]khed08/x0<>[ffff88d3> enltra_epr0401
Mar 2 01:22:58 NAS-Bkp kernel: 4 <ffff86c0]?khedwre_n010010<>[ffff88d3> scag+x/x
Mar 2 01:22:58 NAS-Bkp kernel: 0Cd:0 04 b0 88 0f 88 70 88 04 94 87 44 95 84 97 04 3c 14 91 fb 60 dc 0>0 39 88 6a 17 7e c9 b4 04 80 0
Mar 2 01:22:58 NAS-Bkp kernel: 1RP <ffff8e32]_drhs+x205

I'll try reducing the number of nfs procs.

Regards,

Manuel
Message 6 of 16
tlyczko
Tutor

Re: ReadyNASOS 6.2.3-T1718 more stable on 2100

K that makes two people now who want to know more about nfs procs number 🙂 🙂
Message 7 of 16
mdgm-ntgr
NETGEAR Employee Retired

Re: ReadyNASOS 6.2.3-T1718 more stable on 2100

The recommendation is to use the lowest NFS thread count that still does what you need.
Message 8 of 16
tlyczko
Tutor

Re: ReadyNASOS 6.2.3-T1718 more stable on 2100

mdgm wrote:
The recommendation is to use the lowest NFS thread count that still does what you need.


And how would we know how to figure that out??
Please give more details on this.
The appliance is doing a fair amount of concurrent disk access.
Netgear defaults to 8, why??
Thank you, Tom
Message 9 of 16
mdgm-ntgr
NETGEAR Employee Retired

Re: ReadyNASOS 6.2.3-T1718 more stable on 2100

Trial and error can be used for that. I believe for systems reset on 6.2.3 there is a lower default thread count.

How many jobs are you running concurrently to the NAS using NFS?
Message 10 of 16
tlyczko
Tutor

Re: ReadyNASOS 6.2.3-T1718 more stable on 2100

mdgm wrote:
Trial and error can be used for that. I believe for systems reset on 6.2.3 there is a lower default thread count.

How many jobs are you running concurrently to the NAS using NFS?


OIC. The backup appliance backs up several VMs per backup job, and it may also have started some post-processing on the backups. The backup appliance can run up to 8 simultaneous backups; would each backup be considered a 'client'??

In our backups, the most VMs in a job is 8, the fewest is 4.

Our ReadyNAS is set to 8 nfs threads, I've not changed it ever since it was set up.

It is set for 8, the only remaining options are 1, 2, 4, then 12, 16, 24, 32.

Is any logging done by the ReadyNAS about how many connections it receives while the backups are happening??
If so, where do we look??
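
One rough way to check this from the NAS side, assuming SSH access and that `netstat` is available there (an assumption, nothing in this thread confirms it), is to count the distinct client addresses that hold an established TCP connection to the standard NFS port, 2049. The function name `count_nfs_clients` is made up for this sketch; it reads `netstat -tn` output on standard input.

```shell
# Hypothetical helper: count distinct remote hosts with an ESTABLISHED
# TCP connection to local port 2049 (the standard NFS port).
# Reads the output of `netstat -tn` on standard input.
count_nfs_clients() {
    awk '$4 ~ /:2049$/ && $6 == "ESTABLISHED" {
             host = $5
             sub(/:[0-9]+$/, "", host)          # strip the remote port
             if (!(host in seen)) { seen[host] = 1; n++ }
         }
         END { print n + 0 }'
}
```

It could be run during the backup window as `netstat -tn | count_nfs_clients`. A host is counted once however many connections it holds, so the result is a number of machines, not a number of backup jobs.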

Thank you, Tom
Message 11 of 16
mmartinezv
Aspirant

Re: ReadyNASOS 6.2.3-T1718 more stable on 2100

Sorry, I'll reduce the count to 6 and reboot tomorrow. The NAS is rebuilding the volume right now and I prefer to wait 14 hours before rebooting...

In my case I have 6 concurrent nfs backup jobs...

Regards,

Manuel.
Message 12 of 16
mdgm-ntgr
NETGEAR Employee Retired

Re: ReadyNASOS 6.2.3-T1718 more stable on 2100

This command shows how heavily the threads are being used:

# watch 'grep ^th /proc/net/rpc/nfsd'

It is best to run this while NFS is under the heaviest load you expect it to handle.
Message 13 of 16
tlyczko
Tutor

Re: ReadyNASOS 6.2.3-T1718 more stable on 2100

Must I leave this command running in a PuTTY window overnight while the backups run??
Unless you say otherwise I'm assuming the answer is yes, since the display is presently all zeroes.
Thank you, Tom
Message 14 of 16
mdgm-ntgr
NETGEAR Employee Retired

Re: ReadyNASOS 6.2.3-T1718 more stable on 2100

Yes, the output is only useful when NFS is actually being used.

I guess you could record your screen.
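
An alternative to keeping a terminal open or recording the screen would be to append the `th` line to a file at intervals and read it the next morning. This is a minimal sketch, assuming a POSIX shell on the NAS. The function names, the `STATS` variable and the log path in the usage note are made up for illustration and are not part of ReadyNAS OS.

```shell
#!/bin/sh
# Sketch: record the nfsd "th" line with a timestamp at a fixed interval.
# STATS, sample_th and log_th are illustrative names, not ReadyNAS tooling.

STATS="${STATS:-/proc/net/rpc/nfsd}"

# Print the current "th" line prefixed with the local date and time.
sample_th() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(grep '^th ' "$STATS")"
}

# log_th FILE INTERVAL COUNT
# Append COUNT samples to FILE, sleeping INTERVAL seconds between samples.
log_th() {
    out=$1
    interval=$2
    count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        sample_th >> "$out"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, with `log_th /tmp/nfsd-th.log 60 480` added as the last line, the script would record one sample per minute for about eight hours. Starting it in the background before the backup window (for instance with `nohup`, if that is available on the NAS) would let the SSH session be closed.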
Message 15 of 16
mmartinezv
Aspirant

Re: ReadyNASOS 6.2.3-T1718 more stable on 2100

Hi,

The system hung again a couple of days ago, so I installed the latest 6.2.3 beta (8). This morning we found it frozen again...

I'm installing the 6.3.3 beta 8 now to see if it works better.

Regards,

Manuel
Message 16 of 16