
Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

tonitheitgirl
Aspirant

ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface) #25509387

I have an RN2120 running 6.2.4 with four 2TB disks. (It is one of 5 that we have running in our estate; I'm not experiencing this on any of the others.) This unit is 2 years old.

 

I started to see this behaviour earlier this month, with no configuration change or anything else changing. 

 

Every 12 hours, the device stops responding to all traffic. The lights on the box appear normal. Once I have power cycled the box, it is rebuilding the array.  I have factory reset this device 3 times and it is behaving in the same way every time. The original firmware was 6.0.8 which was behaving in exactly the same way (I upgraded to 6.2.4 in investigating this)

 

The box is configured for RAID 5, with a single iSCSI lun of 5TB. 

 

I am being told by the webchat folk that this behaviour is 'normal' because it is above 95% usage, but this doesn't make sense: 4x2TB = 8TB, which in RAID 5 gives 6TB, and I'm using the maximum that the box will allow me to use, which is 5TB.

 

I am also being told that to escalate or to even get any clarification on this subject will require a paid contract.

 

I am currently personally suspecting that the device is borked and needs replacement, but would love to know if anyone has experienced similar before and has resolved the issue or if you know what might be causing this issue. 

 

Thanks. 

 

Toni. 

Message 1 of 18
OOM-9
NETGEAR Expert

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

How are you using the iSCSI? Is it a Windows file share, VMs or backup?
How it is being used may determine the recommendations.
Message 2 of 18
tonitheitgirl
Aspirant

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

It's one big iSCSI LUN being used for Veeam Backup & Replication. There is a single Veeam proxy machine (Windows 2012) connected to it.

Message 3 of 18
OOM-9
NETGEAR Expert

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

It would typically be recommended to keep the volume closer to 80% of volume capacity. There are general performance and reliability issues that may come up if you exceed that threshold.

Reducing usage is normally not the best option if data is already on the iSCSI LUN. Other settings that you can look at changing in these cases: disable the checksum under [System/Volumes/Volume Properties/Settings/Summary], and disable bit rot protection under [Shares/{share} Settings/Properties].


There are more things we could go into, but these are the more common fixes for issues similar to what you have described.
Message 4 of 18
mdgm-ntgr
NETGEAR Employee Retired

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

Can you send your logs in (see the Sending Logs link in my sig)?

Message 5 of 18
mdgm-ntgr
NETGEAR Employee Retired

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

I can see from the logs that you last factory defaulted this unit on 6.0.8 in late 2013

 

We have better default settings when creating LUNs now.

 

You used continuous protection on the LUN, though with the volume this full this protection would no longer be in effect even if it were still enabled.

 

Did you try OOM-9's suggestions?

Message 6 of 18
tonitheitgirl
Aspirant

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

okay, so if I factory reset again, then create it? 

 

Can you see any reason why it might be freezing?

Message 7 of 18
mdgm-ntgr
NETGEAR Employee Retired

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

It is freezing because the volume is so full.

 

If you do another factory reset and create a thick LUN using no more than 80% of the 5.4TB volume capacity, with bit rot protection and snapshots disabled, you should find it works much better.

 

You may also wish to disable checksums as OOM-9 mentioned.

 

Note 6*1000^4/1024^4 is approximately 5.4

 

Like most computers we use a base of 1024 whereas disk manufacturers measure using a base of 1000. Same amount of space but the larger base gives a smaller number.

 

Your volume usage of 95% is a percentage of the volume capacity after redundancy.
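
The capacity arithmetic in this post can be checked with a short Python sketch. It assumes four 2TB disks (decimal terabytes) in RAID 5, as described in the first post. The function name and constants are illustrative only and are not part of any ReadyNAS tooling.

```python
# Sketch of the capacity arithmetic discussed in this thread.
# Assumption: disk sizes are quoted in decimal terabytes (1000**4 bytes),
# while the NAS reports capacity using a base of 1024.

TB_DECIMAL = 1000 ** 4  # terabyte as used by disk manufacturers
TB_BINARY = 1024 ** 4   # terabyte as counted with a base of 1024


def raid5_usable_bytes(disk_count: int, disk_bytes: int) -> int:
    """RAID 5 spends one disk's worth of space on parity: usable = (n - 1) disks."""
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (disk_count - 1) * disk_bytes


usable = raid5_usable_bytes(4, 2 * TB_DECIMAL)
print(f"{usable / TB_DECIMAL:.2f} TB (base 1000)")  # 6.00
print(f"{usable / TB_BINARY:.2f} TB (base 1024)")   # 5.46
```

The base-1024 result of about 5.46 is in line with the roughly 5.4TB volume capacity quoted above. File system overhead is not modeled here, so the figure the GUI reports may be slightly lower.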

Message 8 of 18
tonitheitgirl
Aspirant

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

Okay, so is this the case for all readynas OS6 devices? 

Message 9 of 18
tonitheitgirl
Aspirant

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

Re: the usable space, in a standard RAID 5 the usable space is 5700GB (5.56TB). The size of the LUN is 5.1TB, which is the maximum that the GUI would allow me to set (without a warning message).

 

There is no documentation to suggest to use no more than 80% (which gives me usable space of about 4.4TB (5.56TB-1.1TB)) which to be honest, seems a little absurd? 

Message 10 of 18
mdgm-ntgr
NETGEAR Employee Retired

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

With any PC it is not advisable to fill a volume.

 

On OS6 we start alerting about volume usage when it exceeds 70% to allow some time to expand the volume or free up some space before the volume gets very full.

 

You want to allow plenty of space for metadata especially if bitrot protection and/or snapshots are used.

 

With ProSupport Installation service contracts we would set things up in an optimal way for the use case scenario.

Message 11 of 18
tonitheitgirl
Aspirant

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

this is a business level device with a business level OS not a PC. None of the equipment used in this configuration is 'personal' in any way.

 

All I want to do is use the max available space on it as an iSCSI target. I don't mean to come across as rude here, but I'm now in a situation where, through no fault of our own, we're being told that the 7 ReadyNAS devices we have (we also have devices from LenovoEMC/QNAP which do not have this problem) can't be used in the way that we have always previously used them, i.e. just using all the available space as an iSCSI target.

 

Worse still, this device completely locks up/freezes when a configurable option is set from within the GUI, which makes me feel that this is not a business level product. It is unacceptable and unfair to have to pay for installation services because a bug exists in the software and is undocumented.

 

Even worse, I'm being told to pay for a support contract to be told the same. I'm losing 1.5TB of space on this device because the OS is restricting the use of 300GB and then I'm being told that I can't use 20% of it. This is undocumented in the product technical details for this device.

 

I have no use for metadata. I do not need snapshots. Essentially I want to use this device as a storage header for our backup solution, Veeam Backup & Replication. Everyone keeps jumping on the "80%" usage bandwagon, but this device worked fine in this configuration for 18 months before the freezing issue occurred.

 

So, on my entire readynas fleet, am I now going to have to go back retrospectively and change everything? 

 

 

 

 

Message 12 of 18
OOM-9
NETGEAR Expert

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

I can understand that the situation you are in is frustrating. The 80% figure is a general recommendation, made for a variety of reasons. There is overhead on any file system and hard drive over time. Whether it is a requirement depends on usage. In past cases with these storage units there has been a significant performance and reliability impact. I have not seen anything saying that it is recommended to run the volumes at 95% data usage or higher.

 

In either case, we found volumes in the iSCSI scenario to be more stable with no more than 90% of the volume consumed, and the firmware now implements this by not allowing creation of an iSCSI LUN past 90%. There have been noticeable stability improvements with that change.

Some of the suggestions that mdgm and I made involve disabling features that were originally enabled by default on the older firmware. These options are now disabled by default on our rackmount units, to suit the way most businesses use rackmount storage. If a rackmount user needs these additional features, the option to enable them is still available.

 

I looked at the logs and I did see the size disparity that you mentioned. I did not see exactly 300GB in my calculation, so I think there may be some overhead affecting the sizing. (This is separate from the RAID calculation noted earlier in this thread.) A good deal of metadata is used that is not easily displayed. Running a defrag and balance with BTRFS helps manage the metadata size and has been known to help in the past.

 

 

Some possible actions (with your setup):

  • Disable checksum
  • Disable bit-rot-protection
  • Disable snapshots
  • Run a defrag and a balance (there is a schedule option under settings)

 

I hope these options help get the unit working again without formatting.

Message 13 of 18
mdgm-ntgr
NETGEAR Employee Retired

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

Please try OOM-9's suggestions.

What I meant with Installation service contracts is not for now, but when you purchase a new ReadyNAS if unsure how to configure it, it is a good option for getting things configured right for your use case the first time. See http://prosupport.netgear.com/installation_service.html

In the past you had snapshots enabled. With these enabled a lot of metadata can be generated. Metadata is data about data. With snapshots you have data from a number of different points in time so a lot of metadata would have been required to keep track of it all. When the snapshots were deleted when volume usage exceeded 95% the space no longer used by the removed metadata remained allocated to it (unable to be used for data). A balance can be used to return allocated space to unallocated space.

Message 14 of 18
tonitheitgirl
Aspirant

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

Followed all of the suggestions but the box was still crashing. So I've ended up reformatting. 

 

Can you tell me what the actual figure is for this? I've now heard 80, 85 and 90%. 

 

Also, does this apply to the 4220x?

Message 15 of 18
OOM-9
NETGEAR Expert

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

As mentioned before, I would recommend the 80% limit for the best-performing and longest-lasting setups.

 

Since you are pressed for storage and are running backups, 85% to 90% should be okay; a volume schedule with a defrag and balance will help lower the overhead. You originally formatted on 6.0.x, which had a few default settings that in most cases were not required for these units. These options are disabled by default on 6.2.x to help reduce the overhead concerns that you experienced on the RN2120.

 

I suspect that keeping the volume maintained will help with its longevity.
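
To put numbers on the percentages mentioned in this thread, here is a minimal Python sketch. It assumes the 5.4TB volume capacity quoted earlier for this unit. The thresholds are simply the figures posters have mentioned (70% alerting, 80% recommended, 90% LUN creation limit, 95% as the level this unit was running at) and are not an official specification; the function names are hypothetical.

```python
# Sketch only: threshold values are the figures quoted by posters in this
# thread, not an official specification.

VOLUME_CAPACITY_TB = 5.4  # capacity quoted for this 4 x 2TB RAID 5 volume


def lun_budget(capacity: float, percent: int) -> float:
    """Largest LUN size that keeps usage at or below `percent` of capacity."""
    return capacity * percent / 100


def usage_level(percent_used: float) -> str:
    """Classify volume usage against the levels mentioned in the thread."""
    if percent_used > 95:
        return "above 95%: the level at which this unit was freezing"
    if percent_used > 90:
        return "above 90%: past the firmware's LUN creation limit"
    if percent_used > 80:
        return "above 80%: past the general recommendation"
    if percent_used > 70:
        return "above 70%: OS6 starts alerting"
    return "at or below 70%: no alert"


for percent in (80, 85, 90):
    budget = lun_budget(VOLUME_CAPACITY_TB, percent)
    print(f"{percent}% of {VOLUME_CAPACITY_TB}TB = {budget:.2f}TB")
```

On a 5.4TB volume this gives LUN sizes of about 4.32TB, 4.59TB and 4.86TB for 80%, 85% and 90% respectively.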

Message 16 of 18
WingDog
Guide

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

@tonitheitgirl

Also, does this apply to the 4220x?

 

4220 halting under iSCSI load

https://community.netgear.com/t5/ReadyNAS-in-Business/MPIO-slow-speed/td-p/964959/jump-to/first-unre...

 

be aware!

 

 

 

 

Message 17 of 18
mobocracy
Aspirant

Re: ReadyNAS Netgear 2120 Stops responding to all traffic (iSCSI / Web interface)

Have been running into similar problems across a mix of 2100, 2120, and 3220 units since November 2014.  Units will hard freeze under heavy I/O (also from Veeam backup) and need to be power cycled to recover.

 

We have mostly stabilized the units by making sure that none have any kind of teaming configuration and all have jumbo frames disabled (which is a noticeable performance hit, especially with Veeam!). One 2100 never stabilized and was replaced with a Dell R530 stuffed with SATA disk.

 

With the exception of the 3220, all devices had 1 or more years of stable usage with no lockups (in one case it was over 4 years).

 

The oldest device (and the one eventually replaced with other hardware) was swapped out by support around November of 2014 due to a perception that the NIC may have gone bad, based on symptoms. The swap didn't help, which is when we figured out that disabling any teaming and jumbo frames seemed to help with stability. This network was already properly configured for jumbo frames and was running an EqualLogic SAN without issues, so we knew that the network itself wasn't a problem.

 

I don't have an answer for you and share your frustration. My best guess is that a bug was introduced in the software released around November 2014 and that under heavy networking loads the units are crashing the network stack. I think it has persisted as long as it has because the units do poor fault reporting and the overwhelming majority of the customer base probably doesn't load them very hard.

Message 18 of 18