ale2
Jul 28, 2021 Aspirant
ReadyNAS 424 BTRFS error
Hi all, a couple of days ago, when trying to re-add a Private Time Machine share for a user after deleting it, I got an error that it couldn't be written. After trying a few times, I restarted the ...
rn_enthusiast
Jul 29, 2021 Virtuoso
Hi ale2,
Thanks for the logs.
The issue started on July 25 at 01:11 AM:
Jul 25 01:11:15 nas2012 kernel: BTRFS error (device dm-0): parent transid verify failed on 8379516911616 wanted 2728973 found 2735258
Jul 25 01:11:15 nas2012 kernel: BTRFS error (device dm-0): parent transid verify failed on 8379516911616 wanted 2728973 found 2735258
Jul 25 01:11:15 nas2012 kernel: BTRFS warning (device dm-0): Skipping commit of aborted transaction.
Jul 25 01:11:15 nas2012 kernel: BTRFS: error (device dm-0) in cleanup_transaction:1864: errno=-5 IO failure
Jul 25 01:11:15 nas2012 kernel: BTRFS info (device dm-0): forced readonly
An I/O error occurred somewhere, and that is typically disk related. However, I cannot see your NAS complaining about its disks at any time. I trawled the kernel logs and found no clear explanation for why this happened.
Still, it is clear that the filesystem ran into an I/O issue, which also left its metadata inconsistent (hence the "parent transid verify failed" messages: a metadata block carries a different transaction ID than its parent expects).
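If you want to pull these messages out of a larger kernel log yourself, a rough Python sketch like the one below would do. The regex and the function name are my own assumptions based on the message format quoted above, not anything shipped with the NAS.

```python
import re

# Sample lines copied from the kernel log quoted above.
LOG = """\
Jul 25 01:11:15 nas2012 kernel: BTRFS error (device dm-0): parent transid verify failed on 8379516911616 wanted 2728973 found 2735258
Jul 25 01:11:15 nas2012 kernel: BTRFS info (device dm-0): forced readonly
"""

# Assumption: this pattern only matches the message format seen in this log.
TRANSID_RE = re.compile(
    r"BTRFS error \(device (?P<dev>[^)]+)\): parent transid verify failed "
    r"on (?P<block>\d+) wanted (?P<wanted>\d+) found (?P<found>\d+)"
)


def find_transid_failures(text):
    """Return one dict per 'parent transid verify failed' message in text."""
    failures = []
    for m in TRANSID_RE.finditer(text):
        failures.append({
            "device": m.group("dev"),
            "block": int(m.group("block")),
            "wanted": int(m.group("wanted")),
            "found": int(m.group("found")),
        })
    return failures


failures = find_transid_failures(LOG)
for f in failures:
    # The gap is how far apart the expected and on-disk transaction IDs are.
    print(f["device"], f["block"], "transid gap:", f["found"] - f["wanted"])
```

For the lines above, the wanted and found transaction IDs differ by 6285.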
You are using 8TB WDC WD80EFAX disks, and as far as I know these are still CMR drives. StephenB and Sandshark can confirm this, as they know more about drive hardware than I do. If they are actually SMR drives, they should be replaced, as SMR is not suitable for a NAS.
I also notice you have spindown enabled, and your disks are spinning up and down a lot, often after just a few seconds. Below is a random day:
Jul 22 00:42:42 nas2012 noflushd[3402]: Spinning up disk 3 (/dev/sda) after 0:00:08.
Jul 22 00:42:42 nas2012 noflushd[3402]: Spinning up disk 1 (/dev/sdc) after 0:00:05.
Jul 22 01:45:32 nas2012 noflushd[3402]: Spinning up disk 1 (/dev/sdc) after 0:00:05.
Jul 22 04:02:36 nas2012 noflushd[3402]: Spinning up disk 3 (/dev/sda) after 0:00:11.
Jul 22 04:02:36 nas2012 noflushd[3402]: Spinning up disk 2 (/dev/sdb) after 0:00:08.
Jul 22 04:02:36 nas2012 noflushd[3402]: Spinning up disk 1 (/dev/sdc) after 0:00:05.
Jul 22 04:47:57 nas2012 noflushd[3402]: Spinning up disk 2 (/dev/sdb) after 0:00:08.
Jul 22 04:47:57 nas2012 noflushd[3402]: Spinning up disk 1 (/dev/sdc) after 0:00:05.
Jul 22 04:49:05 nas2012 noflushd[3402]: Spinning up disk 3 (/dev/sda) after 0:00:05.
Jul 22 05:33:34 nas2012 noflushd[3402]: Spinning up disk 2 (/dev/sdb) after 0:00:08.
Jul 22 05:33:34 nas2012 noflushd[3402]: Spinning up disk 1 (/dev/sdc) after 0:00:05.
Jul 22 06:18:51 nas2012 noflushd[3402]: Spinning up disk 1 (/dev/sdc) after 0:00:05.
Jul 22 06:48:17 nas2012 noflushd[3402]: Spinning up disk 2 (/dev/sdb) after 0:00:05.
Jul 22 07:04:25 nas2012 noflushd[3402]: Spinning up disk 1 (/dev/sdc) after 0:00:05.
Jul 22 07:49:43 nas2012 noflushd[3402]: Spinning up disk 3 (/dev/sda) after 0:00:05.
Jul 22 08:49:15 nas2012 noflushd[3402]: Spinning up disk 3 (/dev/sda) after 0:00:11.
Jul 22 08:49:15 nas2012 noflushd[3402]: Spinning up disk 2 (/dev/sdb) after 0:00:08.
Jul 22 08:49:15 nas2012 noflushd[3402]: Spinning up disk 1 (/dev/sdc) after 0:00:05.
Jul 22 17:25:06 nas2012 noflushd[3402]: Spinning up disk 3 (/dev/sda) after 0:00:05.
Jul 22 17:26:12 nas2012 noflushd[3402]: Spinning up disk 2 (/dev/sdb) after 0:00:05.
Jul 22 18:55:46 nas2012 noflushd[3402]: Spinning up disk 3 (/dev/sda) after 0:00:11.
Jul 22 18:55:46 nas2012 noflushd[3402]: Spinning up disk 2 (/dev/sdb) after 0:00:08.
Jul 22 18:55:46 nas2012 noflushd[3402]: Spinning up disk 1 (/dev/sdc) after 0:00:05.
Jul 22 21:17:00 nas2012 noflushd[3402]: Spinning up disk 3 (/dev/sda) after 0:00:08.
Jul 22 21:17:00 nas2012 noflushd[3402]: Spinning up disk 2 (/dev/sdb) after 0:00:05.
Jul 22 22:03:13 nas2012 noflushd[3402]: Spinning up disk 2 (/dev/sdb) after 0:00:08.
Jul 22 22:03:13 nas2012 noflushd[3402]: Spinning up disk 1 (/dev/sdc) after 0:00:05.
Jul 22 23:00:14 nas2012 noflushd[3402]: Spinning up disk 3 (/dev/sda) after 0:00:08.
Jul 22 23:00:14 nas2012 noflushd[3402]: Spinning up disk 1 (/dev/sdc) after 0:00:05.
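If you want to tally these spin-ups per disk yourself, here is a minimal Python sketch. The regex is an assumption based on the noflushd lines quoted above, and `summarize_spinups` and the 60-second threshold are hypothetical names and values of mine. The sample holds only four of the lines, to keep it short.

```python
import re
from collections import Counter

# Four of the noflushd lines quoted above, used as sample input.
SAMPLE = """\
Jul 22 00:42:42 nas2012 noflushd[3402]: Spinning up disk 3 (/dev/sda) after 0:00:08.
Jul 22 00:42:42 nas2012 noflushd[3402]: Spinning up disk 1 (/dev/sdc) after 0:00:05.
Jul 22 04:02:36 nas2012 noflushd[3402]: Spinning up disk 2 (/dev/sdb) after 0:00:08.
Jul 22 08:49:15 nas2012 noflushd[3402]: Spinning up disk 3 (/dev/sda) after 0:00:11.
"""

# Assumption: matches only the "Spinning up disk N (/dev/X) after H:MM:SS." format.
SPINUP_RE = re.compile(
    r"Spinning up disk (\d+) \((/dev/\w+)\) after (\d+):(\d+):(\d+)\."
)


def summarize_spinups(text, short_secs=60):
    """Count spin-ups per device, and how many followed a spindown
    shorter than short_secs seconds."""
    per_device = Counter()
    short = 0
    for _disk, dev, h, m, s in SPINUP_RE.findall(text):
        per_device[dev] += 1
        if int(h) * 3600 + int(m) * 60 + int(s) < short_secs:
            short += 1
    return per_device, short


per_device, short = summarize_spinups(SAMPLE)
for dev, count in sorted(per_device.items()):
    print(dev, count)
print("spin-ups after less than 60 s of spindown:", short)
```

Feeding it the full day of log lines instead of the sample gives the per-disk totals for that day.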
Looking at the drive statistics, the start/stop count is very high (due to constant spin up/down).
Device: sdb
Controller: 0
Channel: 1
Model: WDC WD80EFAX-68KNBN0
Serial: VAH1WPHL
Firmware: 81.00A81W
Class: SATA
RPM: 5400
Sectors: 15628053168
Pool: data
PoolType: RAID 5
PoolState: 1
PoolHostId: a449f54
Health data
ATA Error Count: 0
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 41
Start/Stop Count: 31533
Power-On Hours: 17133
Power Cycle Count: 217
Load Cycle Count: 31609
That does not look like a good way to keep drives healthy. This many start/stops cannot be good for the drives, IMO. I don't see any disk issues, any kernel complaints about the disks, or a disk spindown happening when the issue first started, but I would still recommend disabling Disk Spindown. I don't have a good explanation for the BTRFS failure, but I suspect it is related to failed disk I/O somewhere.
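To put the start/stop count for sdb in perspective, here is the quick arithmetic as a small Python sketch. The numbers come straight from the drive statistics above; the variable names are just mine.

```python
# Values taken from the sdb drive statistics quoted above.
start_stop_count = 31533
power_on_hours = 17133
power_cycle_count = 217

# Average start/stops per hour and per day of powered-on time.
per_hour = start_stop_count / power_on_hours
per_day = per_hour * 24

# How many start/stops happened, on average, for each full power cycle of the NAS.
per_power_cycle = start_stop_count / power_cycle_count

print(f"{per_hour:.2f} start/stops per powered-on hour")
print(f"{per_day:.1f} start/stops per powered-on day")
print(f"{per_power_cycle:.0f} start/stops per power cycle")
```

That works out to roughly 1.8 start/stops per powered-on hour, or about 44 per day, averaged over the life of the drive.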
Since you can access your data, do a backup now. Once you have made sure the backup is good, you can either delete and re-create the volume and then restore from backup or, if you want to be adventurous, let btrfsck (btrfs check) attempt a repair on the volume (though this can make the problem even worse): https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-check
This should be run on an unmounted volume, so you first need to unmount the volume via the CLI (SSH to the NAS). However, back up the data before doing anything else, to be safe.
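As a rough illustration of that order of operations, the Python sketch below only builds and prints the commands; it does not run anything. The device path `/dev/dm-0` is my guess from the "device dm-0" in the kernel log, the mount point `/data` is my guess from the pool name, and `plan_btrfs_check` is a hypothetical helper. Verify both values on your own NAS first. The unmount may also fail while services still hold the volume open.

```python
def plan_btrfs_check(device, mount_point, repair=False):
    """Build (but do not run) the commands for an offline btrfs check.

    By default the check is read-only; repair=True swaps in --repair,
    which can make a damaged filesystem worse.
    """
    mode = "--repair" if repair else "--readonly"
    return [
        ["umount", mount_point],           # the check needs an unmounted volume
        ["btrfs", "check", mode, device],  # read-only unless repair was requested
    ]


# Hypothetical values: confirm the real device and mount point before use.
plan = plan_btrfs_check("/dev/dm-0", "/data")
for cmd in plan:
    print(" ".join(cmd))
```

Starting with the read-only check shows what btrfs check finds without touching the volume, which seems the safer first step before deciding on a repair.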
I understand that you chose a BTRFS filesystem for its bitrot protection and other advantages. I have a ReadyNAS myself, standing in a corner for some backups, but my main Ubuntu server uses a BTRFS RAID and all my backup drives are BTRFS formatted too. I have never had an issue with it. I would not call the filesystem fragile, but I do want to point out that Netgear is using a 3.5-year-old BTRFS version on the NAS (v4.16), which means it may not be as stable as a more modern version of the filesystem.
Cheers
ale2
Jul 31, 2021 Aspirant
Thank you rn_enthusiast and StephenB for taking the time to check the logs and for your suggestions. I will try to back up what I can (I THINK I had the most important items already backed up), and will factory reset to start fresh.
Thanks!