jukkaforss
Jun 27, 2019
Solved

Kernel bug on ReadyNAS OS 6.10.1, screen message __extent_writepage_io+0x1d3

Hi,

My ReadyNAS hit a kernel bug. It is running the latest code, 6.10.1.

Here is the dmesg output from after it happened. I had to power cycle the unit to get it to reboot.

[1108355.142034] ------------[ cut here ]------------
[1108355.146933] kernel BUG at fs/btrfs/extent_io.c:3400!
[1108355.152192] invalid opcode: 0000 [#1] SMP 
[1108355.301048] Modules linked in: vpd(PO)
[1108355.305129] CPU: 0 PID: 4578 Comm: nfsd Tainted: P           O    4.4.178.x86_64.1 #1
[1108355.313300] Hardware name: NETGEAR ReadyNAS 314/To be filled by O.E.M., BIOS 4.6.5 11/05/2013
[1108355.322211] task: ffff8800c3362a00 ti: ffff8800c34d8000 task.ti: ffff8800c34d8000
[1108355.330033] RIP: 0010:[<ffffffff882b5f5a>]  [<ffffffff882b5f5a>] __extent_writepage_io+0x1d3/0x398
[1108355.339383] RSP: 0018:ffff8800c34db8d8  EFLAGS: 00010206
[1108355.344991] RAX: ffff880108549870 RBX: ffffea0000d2ba80 RCX: 0000007d07005000
[1108355.352456] RDX: 0000007d07005000 RSI: 0000007d06ffd000 RDI: 0000000000000000
[1108355.359908] RBP: ffff8800c34db978 R08: 0000000000001000 R09: 0000000000000001
[1108355.367396] R10: ffffea0000f0bd40 R11: ffff8800ad6b5488 R12: 0000007d07014000
[1108355.374860] R13: ffff8800c34dbb58 R14: 0000000000000000 R15: 0000000000001000
[1108355.382324] FS:  0000000000000000(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[1108355.390783] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1108355.396857] CR2: 00007f2ad2a30000 CR3: 00000000c72e9000 CR4: 00000000000006f0
[1108355.404345] Stack:
[1108355.406591]  ffff8800c6f31ab0 ffff8800c6f31900 000000fa00000000 0000000000001000
[1108355.414442]  0000007d06ffd000 0000000000000000 ffff8800c34dbaf8 ffff8800c6f31ab0
[1108355.422276]  ffff8800c6f31900 0000000000000000 ffff8800c34db988 0000007d07014fff
[1108355.430093] Call Trace:
[1108355.432782]  [<ffffffff882b7579>] __extent_writepage+0x176/0x1db
[1108355.439119]  [<ffffffff882b7885>] extent_write_cache_pages.isra.10.constprop.26+0x2a7/0x374
[1108355.447881]  [<ffffffff8807d2ff>] ? ttwu_do_activate.constprop.23+0x57/0x5c
[1108355.455197]  [<ffffffff882b7d3f>] extent_writepages+0x47/0x58
[1108355.461274]  [<ffffffff8829bfab>] ? uncompress_inline+0x148/0x148
[1108355.467720]  [<ffffffff8829bc08>] btrfs_writepages+0x23/0x25
[1108355.473764]  [<ffffffff880e68b3>] do_writepages+0x1e/0x28
[1108355.479483]  [<ffffffff880de30e>] __filemap_fdatawrite_range+0xb2/0xca
[1108355.486349]  [<ffffffff880de3a3>] filemap_fdatawrite_range+0xe/0x10
[1108355.492969]  [<ffffffff882ae6bb>] btrfs_fdatawrite_range+0x1b/0x41
[1108355.499477]  [<ffffffff882ae71c>] start_ordered_ops+0x3b/0x5a
[1108355.505561]  [<ffffffff882ae794>] btrfs_sync_file+0x59/0x2da
[1108355.511550]  [<ffffffff883168b3>] ? security_file_open+0x79/0x80
[1108355.517877]  [<ffffffff881416d2>] vfs_fsync_range+0x86/0x95
[1108355.523780]  [<ffffffff881fb87e>] nfsd_vfs_write+0x219/0x265
[1108355.529758]  [<ffffffff881fd531>] nfsd_write+0xa6/0xc6
[1108355.535202]  [<ffffffff88202552>] nfsd3_proc_write+0x90/0xab
[1108355.541175]  [<ffffffff881f7ec3>] nfsd_dispatch+0xcd/0x189
[1108355.546991]  [<ffffffff888c825b>] svc_process+0x582/0x6b6
[1108355.552711]  [<ffffffff881f7924>] ? nfsd_destroy+0x57/0x57
[1108355.558516]  [<ffffffff881f7a19>] nfsd+0xf5/0x147
[1108355.563527]  [<ffffffff88078c4a>] kthread+0xdc/0xe4
[1108355.568693]  [<ffffffff88078b6e>] ? kthread_worker_fn+0x129/0x129
[1108355.575098]  [<ffffffff888e476f>] ret_from_fork+0x3f/0x80
[1108355.580807]  [<ffffffff88078b6e>] ? kthread_worker_fn+0x129/0x129
[1108355.587229] Code: 48 3d 01 f0 ff ff 44 0f 43 d0 45 89 d6 e9 c3 01 00 00 48 8b 70 18 48 89 f1 48 03 48 20 48 89 75 80 48 89 ca 72 07 49 39 cc 72 06 <0f> 0b 48 83 ca ff 4c 29 e2 48 8b b5 78 ff ff ff 48 8b 78 70 4c 
[1108355.608271] RIP  [<ffffffff882b5f5a>] __extent_writepage_io+0x1d3/0x398
[1108355.615266]  RSP <ffff8800c34db8d8>
[1108355.619704] ---[ end trace 6b6431a8ef19cb1c ]---
  • StephenB
    Jun 28, 2019

    Shrinking this one down ...

    jukkaforss wrote:

    sdb part 1

    root@readynas:~# smartctl -x /dev/sdb
    
    ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
      1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    159
      9 Power_On_Hours          -O--CK   028   028   000    -    52621
    
    Error 9 [8] occurred at disk power-on lifetime: 50321 hours (2096 days + 17 hours)
    Error: UNC at LBA = 0x12fdf7710 = 5098141456
    
    Error 8 [7] occurred at disk power-on lifetime: 50321 hours (2096 days + 17 hours)
    Error: WP at LBA = 0x12fdf7710 = 5098141456
    
    Error 7 [6] occurred at disk power-on lifetime: 41988 hours (1749 days + 12 hours)
    Error: UNC at LBA = 0xcf083840 = 3473422400
    
    Error 6 [5] occurred at disk power-on lifetime: 41988 hours (1749 days + 12 hours)
    Error: WP at LBA = 0xcf083840 = 3473422400
    
    Error 5 [4] occurred at disk power-on lifetime: 38618 hours (1609 days + 2 hours)
    Error: UNC at LBA = 0xe824a0c0 = 3894714560
    
    Error 4 [3] occurred at disk power-on lifetime: 38618 hours (1609 days + 2 hours)
    Error: WP at LBA = 0xe824a0b8 = 3894714552
    
    Error 3 [2] occurred at disk power-on lifetime: 38618 hours (1609 days + 2 hours)
    Error: UNC at LBA = 0xe824a0b8 = 3894714552
    
    Error 2 [1] occurred at disk power-on lifetime: 38615 hours (1608 days + 23 hours)
    Error: WP at LBA = 0x11616d4b8 = 4665562296

    I saw a similar pattern on one of my WD60EFRX drives a while ago, and when I tested it with Lifeguard it failed. A second disk with the same pattern passed Lifeguard, though. So I recommend testing this disk (and perhaps replacing it even if it does pass).

    The most recent logged error was about 2000 hours ago (~ 3 months), so that particular error didn't cause the most recent crash.  But I'm thinking that this disk likely triggered it anyway.
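
As a rough check of that timing, the gap can be computed from two numbers in the smartctl output above: the Power_On_Hours raw value (52621) and the lifetime stamp on the most recent logged error (50321 hours). A minimal Python sketch; the variable names are only for illustration:

```python
# Values copied from the smartctl -x output quoted above.
power_on_hours = 52621      # attribute 9, Power_On_Hours raw value
last_error_hours = 50321    # lifetime stamp on Error 9, the newest entry

hours_since_error = power_on_hours - last_error_hours
days_since_error = hours_since_error / 24

print(f"{hours_since_error} hours, roughly {days_since_error:.0f} days")
```

That works out to 2300 hours, or about 96 days, which is in line with the ~3 month estimate.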

     

    FWIW, I haven't seen any explanation of how to decode the raw read error rate.  But it is quite a bit higher on this drive than your other ones.

9 Replies

  • StephenB
    Guru - Experienced User

    I'd look in kernel.log and system.log for disk errors and btrfs errors.
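
One way to do that scan, sketched in Python. The keyword list and the helper names are assumptions for illustration, not anything ReadyNAS OS prescribes; adjust the pattern to whatever shows up in your own logs.

```python
import re

# Hypothetical keyword list: strings that often appear in btrfs and disk
# (libata / block layer) trouble reports. Extend it as needed.
PATTERN = re.compile(
    r"btrfs|kernel BUG|I/O error|\bata\d+|medium error", re.IGNORECASE
)


def suspicious_lines(lines):
    """Return the lines that match any keyword, in their original order."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


def scan_file(path):
    """Scan one log file; undecodable bytes are replaced instead of raising."""
    with open(path, errors="replace") as handle:
        return suspicious_lines(handle)


if __name__ == "__main__":
    # Three lines taken from the dmesg output at the top of this thread.
    sample = [
        "[1108355.142034] ------------[ cut here ]------------",
        "[1108355.146933] kernel BUG at fs/btrfs/extent_io.c:3400!",
        "[1108355.301048] Modules linked in: vpd(PO)",
    ]
    for line in suspicious_lines(sample):
        print(line)
```

On the sample, only the "kernel BUG at fs/btrfs/extent_io.c:3400!" line is reported. For a real check, call scan_file() on kernel.log and system.log wherever they are stored on your system.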

    • jukkaforss
      Tutor

      The kernel log has the same messages, but I couldn't find anything in the disk_info and btrfs logs.

      No ATA errors or any other problems with disks.

      • StephenB
        Guru - Experienced User

        If ssh is enabled then smartctl -x might also give a clue.
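
If the smartctl output is long, the error log section can be summarized with a few lines of Python. This is a sketch that assumes the entry format seen in the smartctl -x output quoted in this thread (an "Error N [M] occurred at disk power-on lifetime: H hours" line followed by an "Error: TYPE at LBA = ..." line); other drives or smartctl versions may print it differently. The function name is hypothetical.

```python
import re

HEADER = re.compile(
    r"Error (\d+) \[\d+\] occurred at disk power-on lifetime: (\d+) hours"
)
DETAIL = re.compile(r"Error: (\w+) at LBA = 0x[0-9a-fA-F]+ = (\d+)")


def parse_error_log(text):
    """Return (error_number, power_on_hours, error_type, lba) tuples."""
    entries = []
    pending = None  # header seen, still waiting for its detail line
    for line in text.splitlines():
        header = HEADER.search(line)
        if header:
            pending = (int(header.group(1)), int(header.group(2)))
            continue
        detail = DETAIL.search(line)
        if detail and pending:
            entries.append((*pending, detail.group(1), int(detail.group(2))))
            pending = None
    return entries


# Two entries copied from the output posted in this thread.
sample = """\
Error 9 [8] occurred at disk power-on lifetime: 50321 hours (2096 days + 17 hours)
Error: UNC at LBA = 0x12fdf7710 = 5098141456

Error 8 [7] occurred at disk power-on lifetime: 50321 hours (2096 days + 17 hours)
Error: WP at LBA = 0x12fdf7710 = 5098141456
"""

for number, hours, kind, lba in parse_error_log(sample):
    print(number, hours, kind, lba)
```

Comparing the newest power-on-hours stamp against the drive's current Power_On_Hours value shows how recent the last logged error is.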
