
Re: btrfs corruption

ronlaws86
Guide

Is there a way to downgrade the firmware to before the BTRFS Switchover? BTRFS is an extremely fickle filesystem that is prone to randomly going bad on these devices. I also noticed the ext4 support is broken; I get a lot of kernel panics on my device when using it for external storage backups.

 

I'm currently in the process of attempting to repair one of the NAS drives we use at the office because of BTRFS being quirky. Both units seem to take turns playing up every couple of months and making me waste a day fixing them. I'm losing count now; I think it's maybe the 5th time?

 

 

 

 

Model: RN214D41|ReadyNAS 214 Series 4-Bay
Message 1 of 9
mdgm-ntgr
NETGEAR Employee Retired


@ronlaws86 wrote:

Is there a way to downgrade the firmware to before the BTRFS Switchover?


Only if you’re using a legacy x86 ReadyNAS that shipped with RAIDiator-x86. Such a downgrade does involve a factory reset.


@ronlaws86 wrote:

BTRFS is an extremely fickle filesystem that is prone to randomly going bad on these devices.


I disagree. I’ve found it to be highly reliable. It sounds like you’re using your device wrong in some way e.g. filling it too full.


@ronlaws86 wrote:

I also noticed the ext4 support is broken; I get a lot of kernel panics on my device when using it for external storage backups.


Sounds like you should send in your logs (see the Sending Logs link in my sig).


@ronlaws86 wrote:

I'm currently in the process of attempting to repair one of the NAS drives we use at the office because of BTRFS being quirky. Both units seem to take turns playing up every couple of months and making me waste a day fixing them. I'm losing count now; I think it's maybe the 5th time?


Definitely sounds like you’re doing something wrong. More details and logs would be helpful.

 

BTRFS is a much better filesystem than EXT4 and has better tools for dealing with problems should they arise.

Message 2 of 9
StephenB
Guru


@ronlaws86 wrote:

Is there a way to downgrade the firmware to before the BTRFS Switchover?

 


OS 6 has always used BTRFS, and there is no path to install OS 5 on your RN214.  So the answer to this is no.

 

Like @mdgm-ntgr, my own experience has been different from yours.  I've been using OS6 since 2013, and I haven't had any issues with BTRFS.

 

You do need to be thoughtful on the use of snapshots, and I recommend scheduling the maintenance functions on the volume settings.  I run each of them once every three months.

Message 3 of 9
ronlaws86
Guide

My experience has been mixed with OS 6. I have a 104 at home that has never gone wrong, ever, but the 214s I have here in the office seem to do quirky things every couple of months.

 

At this point they don't do anything mission critical; they are used to back up the servers in case of corruption or disaster, so the loss of data is more of an annoyance than a problem. That said, how does filling them up cause problems? It seems like poor filesystem design to me if you must avoid using all available space to keep the filesystem from suddenly dying. Yes, the one I'm doing a resync on at present was close to maximum capacity. Why is that a problem, and how does one fix it? They are being used as incremental backup storage for servers.


EDIT: I also have weekly scrubs scheduled; the last thing I saw in the log was that this had failed. I noticed the backups were stuck too, so I rebooted the NAS and bam, no volume.

Message 4 of 9
ronlaws86
Guide

I have emailed the logs and included a link to this post (Very meta) 

The problem, as best I can figure out:

 

The NAS drives are used as backups for servers on the network. One NAS pulls the data off the servers; the other NAS (physically located in another building) pulls the data from the first NAS and has a USB HDD that I swap and take off site every day.

 

I came in this morning to check on the backups and so forth, and found jobs stuck in a queue that I could not cancel or re-run, so I told it to reboot instead. When it came back up, the volume was gone.

 

I have had a small poke around to try to get an idea of what went wrong. As best I can tell, the btrfs partition on md127 has gone bad and btrfsck cannot recover it.

 

I am waiting for a resync to complete, which I started manually with echo check > /sys/block/md127/md/sync_action

So as far as I can tell, there is nothing wrong with the RAID5 array; according to mdadm it is functioning fine and reports healthy. It seems to be the partition within that's gone up its own rear end.
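
For anyone following along, here is a minimal sketch of how the progress and result of that md check could be read back from sysfs. Writing check to sync_action starts a consistency check of the array's redundancy, and the kernel exposes the current action and a mismatch counter next to it. The md127 name comes from the post above; the MD_DEV and MD_SYSFS variables and the md_check_status function are made up for illustration.

```shell
#!/bin/sh
# Sketch: report the state of an md consistency check.
# MD_DEV defaults to md127 as in the post; override it for other arrays.
MD_DEV="${MD_DEV:-md127}"
MD_SYSFS="${MD_SYSFS:-/sys/block/$MD_DEV/md}"

md_check_status() {
    # $1 = sysfs md directory of the array
    dir="$1"
    if [ ! -r "$dir/sync_action" ]; then
        # Array not present (or not readable) on this machine.
        echo "unavailable"
        return 0
    fi
    # sync_action reads "check" while the check runs and "idle" once it is done.
    action=$(cat "$dir/sync_action")
    # mismatch_cnt counts sectors found inconsistent by the last check.
    mismatches=$(cat "$dir/mismatch_cnt" 2>/dev/null || echo "unknown")
    echo "action=$action mismatch_cnt=$mismatches"
}

md_check_status "$MD_SYSFS"
```

Note that this only says something about the RAID layer. A clean result here is consistent with the array being healthy while the btrfs filesystem on top of it is still damaged.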

Message 5 of 9
mdgm-ntgr
NETGEAR Employee Retired

The data volume was over 95% full.

Filling any data volume extremely full regardless of the filesystem used is not a good idea and may lead to problems.

We do provide plenty of warning starting when the data volume exceeds 70% full.

You should try to keep the data volume under 85% full.
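
As a rough illustration of those thresholds, a small script along these lines could be run from cron to flag a volume before it gets too full. This is only a sketch: the /data mount point is an assumption about where the data volume is mounted, the function names are made up, and df is only an approximation on btrfs because data and metadata space are allocated separately.

```shell
#!/bin/sh
# Sketch: classify volume usage against the 70% / 85% figures mentioned above.

classify_usage() {
    # $1 = used percentage as an integer
    pct="$1"
    if [ "$pct" -ge 85 ]; then
        echo "high"      # at or above the level to stay under
    elif [ "$pct" -gt 70 ]; then
        echo "warn"      # past the point where warnings start
    else
        echo "ok"
    fi
}

volume_percent() {
    # $1 = mount point; prints the used percentage reported by df, without the % sign
    df -P "$1" 2>/dev/null | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

VOLUME="${VOLUME:-/data}"   # assumption: mount point of the data volume
pct=$(volume_percent "$VOLUME")
if [ -n "$pct" ]; then
    echo "$VOLUME: ${pct}% used ($(classify_usage "$pct"))"
else
    echo "$VOLUME: not mounted or not readable"
fi
```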

 

btrfsck is useful to check the state of the filesystem, but is only useful for repairing certain problems. Using btrfsck to attempt to repair a filesystem inappropriately may lead to further problems.
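
To illustrate the difference between checking and repairing, a cautious sketch might look like the following. It only builds a read-only check command and prints it unless RUN=1 is set. The /dev/md127 device name is taken from earlier in the thread, and the DEVICE and RUN variables and the check_cmd function are made up for this example. The filesystem should be unmounted before it is checked.

```shell
#!/bin/sh
# Sketch: read-only btrfs check; prints the command by default, runs it only when RUN=1.
DEVICE="${DEVICE:-/dev/md127}"   # assumption: device name taken from the thread

check_cmd() {
    # --readonly asks btrfs check to inspect the filesystem without writing to it.
    echo "btrfs check --readonly $1"
}

cmd=$(check_cmd "$DEVICE")
if [ "${RUN:-0}" = "1" ]; then
    # Intentionally unquoted so the command string splits into words.
    $cmd
else
    echo "would run: $cmd"
fi
```

The sketch deliberately leaves out any repair option, in line with the warning above.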

 

Using SSH commands is at your own risk and may make problems worse.

Message 6 of 9
ronlaws86
Guide

I've been using Linux for 13+ years, so the only time I use SSH on these things is to perform tasks that the web admin does not allow (e.g. taming the recent implementation of ClamAV that flags everything as a broken executable when pulling in a root fs from a Linux server). I also correctly implemented a secure incremental rsync backup in cron, since this also fails to work properly with the normal backup method (it copies the folders and stops, with no suitable customisation options or useful logs). I'm not at risk of damaging the system.

 

If this was an EXT4 or ZFS partition, recovery options would probably be better. btrfs is still a very young fs and many people have issues with it. Still, a filesystem should not self-destruct when utilisation is at 95+%; there should be some robustness to prevent this.

 

Message 7 of 9
mdgm-ntgr
NETGEAR Employee Retired

EXT4 is also far more likely to run into problems when it's nearly full than when it has plenty of free space.

 

Your PC will get sluggish if you fill it very full as well.

 

It's very easy to blame the filesystem for something that's not the fault of the filesystem at all.

 


One also can't rule out that your modifications using SSH may have contributed to problems.

 

Does your remote PC use the same version of Rsync as us or an older version?

Message 8 of 9
ronlaws86
Guide

It's ReadyNAS to ReadyNAS, so I should hope so... As I pointed out, running it through the normal backup scheduler in the admin console did not work at all. It did not copy the data and reported complete after a few seconds of running. The devices in question do not share the volumes for security reasons. Sharing would be the easy route with the somewhat limiting backup utility on RNOS, but running rsync over an SSH session with key pairs to authenticate is much more secure.

I don't see how my modifications would cause issues here. The rsync job was set to delete files if they were removed from the source, in an effort to avoid the disk filling up with heaps of files! Something else must have gone wrong. I noted on a few occasions after scrubs that snapshots would suddenly show as used space rather than as snapshots, or vanish entirely. So I wonder if there is a bug in the kernel fuse driver, since, as I also mentioned, using EXT* on any external HDDs caused the NAS to crash with "mb_cache_entry_get+68" on the LCD display.
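
For readers wondering what such a job looks like, here is a minimal sketch of a pull-style rsync over SSH with key authentication and deletion of files removed from the source. It is not the actual job from this thread: the host name, paths, key file, and the pull_backup function are all hypothetical, and the command is only printed unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: pull a directory over SSH with rsync, mirroring deletions from the source.
# All values below are hypothetical placeholders.
SRC="${SRC:-backup@nas1.example:/data/servers/}"
DEST="${DEST:-/data/servers/}"
KEY="${KEY:-/root/.ssh/backup_key}"

pull_backup() {
    # $1 = source, $2 = destination, $3 = SSH private key path
    # -a        archive mode (recursive, preserves permissions and times)
    # --delete  remove files from the destination that no longer exist on the source
    # -e        remote shell to use; BatchMode=yes makes ssh fail instead of prompting
    set -- rsync -a --delete -e "ssh -i $3 -o BatchMode=yes" "$1" "$2"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

pull_backup "$SRC" "$DEST" "$KEY"

# Hypothetical crontab entry to run a script like this nightly at 02:00:
# 0 2 * * * RUN=1 /root/bin/pull_backup.sh
```

One thing worth keeping in mind with this approach: deleting files on the destination does not necessarily free space straight away if snapshots of the share still reference them.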

 

Message 9 of 9