
edkedk
Tutor
Jun 17, 2019

RN214 file system read-only (again)

It seems I'm extremely unlucky with this device: this is not the first time a file system error has appeared (cf. https://community.netgear.com/t5/Using-your-ReadyNAS-in-Business/RN204-repeated-file-system-error/m-p/1706392).

 

RN214 (firmware v6.9.5 Hotfix1) with 4x8TB WD Purple HDDs, used for backing up some servers in an AD environment. About half of the volume is filled. Balance, defrag, and scrub are each scheduled weekly.

 

Then the file system went read-only. The web interface shows this:

 

Jun 17, 2019 04:00:01 AM Volume: Scrub started for volume data.
Jun 17, 2019 03:00:01 AM Volume: Defragmentation complete for volume data.
Jun 17, 2019 03:00:01 AM Volume: Defragmentation started for volume data.
Jun 17, 2019 02:00:01 AM Volume: Balance complete for volume data.
Jun 17, 2019 02:00:01 AM Volume: Balance started for volume data.
Jun 16, 2019 10:13:58 PM Volume: The volume data encountered an error and was made read-only. It is recommended to backup your data.
Jun 11, 2019 10:22:24 AM Volume: Scrub completed for volume data'.
Jun 10, 2019 04:00:01 AM Volume: Scrub started for volume data.
Jun 10, 2019 03:53:36 AM Volume: Defragmentation complete for volume data.
Jun 10, 2019 03:00:01 AM Volume: Defragmentation started for volume data.

In systemd-journal.log I see many entries like this:

Jun 16 22:12:20 hubu004 kernel: ------------[ cut here ]------------
Jun 16 22:12:20 hubu004 kernel: WARNING: CPU: 3 PID: 6743 at fs/btrfs/disk-io.c:541 btree_csum_one_bio+0x94/0xd8()
Jun 16 22:12:20 hubu004 kernel: Modules linked in: vpd(PO)
Jun 16 22:12:20 hubu004 kernel: CPU: 3 PID: 6743 Comm: kworker/u8:2 Tainted: P W O 4.4.157.alpine.1 #1
Jun 16 22:12:20 hubu004 kernel: Hardware name: Annapurna Labs Alpine
Jun 16 22:12:20 hubu004 kernel: Workqueue: btrfs-worker btrfs_worker_helper
Jun 16 22:12:20 hubu004 kernel: [<c0014690>] (unwind_backtrace) from [<c0011590>] (show_stack+0x10/0x14)
Jun 16 22:12:20 hubu004 kernel: [<c0011590>] (show_stack) from [<c035d294>] (dump_stack+0x7c/0x9c)
Jun 16 22:12:20 hubu004 kernel: [<c035d294>] (dump_stack) from [<c0090d98>] (warn_slowpath_common+0x80/0xac)
Jun 16 22:12:20 hubu004 kernel: [<c0090d98>] (warn_slowpath_common) from [<c001e700>] (warn_slowpath_null+0x18/0x20)
Jun 16 22:12:20 hubu004 kernel: [<c001e700>] (warn_slowpath_null) from [<c02787e0>] (btree_csum_one_bio+0x94/0xd8)
Jun 16 22:12:20 hubu004 kernel: [<c02787e0>] (btree_csum_one_bio) from [<c0277868>] (run_one_async_start+0x34/0x44)
Jun 16 22:12:20 hubu004 kernel: [<c0277868>] (run_one_async_start) from [<c02b55ec>] (btrfs_worker_helper+0xec/0x1ac)
Jun 16 22:12:20 hubu004 kernel: [<c02b55ec>] (btrfs_worker_helper) from [<c0031c44>] (process_one_work+0x1d4/0x30c)
Jun 16 22:12:20 hubu004 kernel: [<c0031c44>] (process_one_work) from [<c0032b30>] (worker_thread+0x2cc/0x440)
Jun 16 22:12:20 hubu004 kernel: [<c0032b30>] (worker_thread) from [<c0036c44>] (kthread+0xf4/0x104)
Jun 16 22:12:20 hubu004 kernel: [<c0036c44>] (kthread) from [<c000e920>] (ret_from_fork+0x14/0x34)
Jun 16 22:12:20 hubu004 kernel: ---[ end trace f005209bdac8c6a3 ]---

Then this:

Jun 16 22:12:37 hubu004 kernel: BTRFS: error (device md127) in btrfs_commit_transaction:2241: errno=-5 IO failure (Error while writing out transaction)
Jun 16 22:12:37 hubu004 kernel: BTRFS info (device md127): forced readonly
Jun 16 22:12:37 hubu004 kernel: BTRFS warning (device md127): Skipping commit of aborted transaction.
Jun 16 22:12:37 hubu004 kernel: BTRFS: error (device md127) in cleanup_transaction:1864: errno=-5 IO failure
Jun 16 22:12:37 hubu004 kernel: BTRFS info (device md127): delayed_refs has NO entry

Finally, hundreds of lines like this:

Jun 17 08:52:36 hubu004 kernel: BTRFS critical (device md127): unable to find logical 764401909760 len 4096
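
With hundreds of such lines, a summary is easier to read than the raw journal. The following is a minimal sketch, assuming the kernel messages are available as plain text in the format quoted above; the function name `btrfs_summary` is made up for illustration. It counts the "unable to find logical" lines, counts how many distinct logical addresses they mention, and reports when the volume was forced read-only.

```shell
# Summarize BTRFS kernel messages read from stdin.
# Assumes the syslog-style text format shown in the excerpts above.
btrfs_summary() {
    awk '
        /BTRFS critical .*unable to find logical/ {
            missing++
            # The field right after the word "logical" is the logical address.
            for (i = 1; i < NF; i++)
                if ($i == "logical") addrs[$(i + 1)] = 1
        }
        # Remember the timestamp (first three fields) of the last such event.
        /BTRFS.*forced readonly/ { readonly = $1 " " $2 " " $3 }
        END {
            n = 0
            for (a in addrs) n++
            printf "unable-to-find-logical lines: %d\n", missing
            printf "distinct logical addresses: %d\n", n
            if (readonly != "") printf "forced read-only at: %s\n", readonly
        }'
}
```

It could be fed from the downloaded log file (`btrfs_summary < systemd-journal.log`) or, over SSH, from `journalctl -k | btrfs_summary`. A single address repeated many times would suggest one damaged metadata block rather than widespread corruption, though that is only a hint and not a diagnosis.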

This means the file system has crashed, right? Are there any options other than resetting the device and rebuilding the volume, losing all data? I do have a secondary backup, but to be honest, I am fed up with a file system crash every few months!

7 Replies

  • StephenB
    Guru - Experienced User

    Did you replace the disk that generated the errors the last time?

    • edkedk
      Tutor
      Some weeks ago I replaced a disk that started to develop bad sectors.
      • StephenB
        Guru - Experienced User

        edkedk wrote:
        Some weeks ago I replaced a disk that started to develop bad sectors.

        OK. And did that resync OK?

         

        Perhaps enable SSH, log in as root, and run

        # smartctl -x /dev/sda
        # smartctl -x /dev/sdb
        # smartctl -x /dev/sdc
        # smartctl -x /dev/sdd
        

        and look for saved errors in each drive's error log.

         

        For example, something like this:

        Error 12 [11] occurred at disk power-on lifetime: 36166 hours (1506 days + 22 hours)
          When the command that caused the error occurred, the device was active or idle.
        
          After command completion occurred, registers were:
          ER -- ST COUNT  LBA_48  LH LM LL DV DC
          -- -- -- == -- == == == -- -- -- -- --
          40 -- 51 00 00 00 00 0c 27 df 40 40 00  Error: UNC at LBA = 0x0c27df40 = 203939648
        
          Commands leading to the command that caused the error were:
          CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
          -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
          60 00 80 00 c8 00 00 0c 27 df 40 40 08  1d+08:15:46.204  READ FPDMA QUEUED
          60 00 08 00 c0 00 01 06 34 36 98 40 08  1d+08:15:46.163  READ FPDMA QUEUED
          60 00 80 00 b8 00 00 0c 27 e4 40 40 08  1d+08:15:46.146  READ FPDMA QUEUED
          60 00 80 00 b0 00 00 0c 27 e9 40 40 08  1d+08:15:46.123  READ FPDMA QUEUED
          60 00 80 00 a8 00 00 0c 27 ee 40 40 08  1d+08:15:46.094  READ FPDMA QUEUED
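
The check above can be sketched as a small filter, assuming the saved errors appear with header lines like the "Error 12 [11] occurred at ..." line in the example. The function name `saved_errors` is hypothetical, and the commented loop assumes the four drives are /dev/sda through /dev/sdd as in the commands above.

```shell
# Print only the header line of each saved error found in
# `smartctl -x` output read from stdin. Prints nothing if there are none.
saved_errors() {
    grep -E '^[[:space:]]*Error [0-9]+ (\[[0-9]+\] )?occurred at' || true
}

# Possible use on the NAS, as root (not run here):
# for d in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do
#     echo "== $d"
#     smartctl -x "$d" | saved_errors
# done
```

A drive that prints nothing has no saved errors matching that pattern. The full `smartctl -x` output is still worth reading for any drive that does, since the lines after each header show the failing command and LBA.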
