Forum Discussion

rickyl
Tutor
Feb 10, 2018
Solved

R6400 crashing overnight with high SY and high WA

My R6400 (F/W V1.0.1.36_1.0.25) has crashed overnight two nights in a row. After the first crash, I started running two telnet sessions from my laptop to the router (via Wi-Fi): one running top and the other running vmstat 1. When I checked this morning, the router had crashed (although all the LEDs were flashing or steady as normal).

The only interesting clues (for me, anyway) were:

1) top had [mtdblock3] as the highest process, at 40% CPU.
2) vmstat showed consistently high SY and high WA, roughly 75% SY and 25% WA (no idle and no user). The run queue was only 1 to 3 (except for one reading of 39), and blocked was 1 to 4 when it is normally 0.

I don't have any timestamps, so I can't tell when this happened. I have a USB drive connected that I use for automated backups and manual storage. I haven't changed anything with the backups, and it is possible that one was running when the router crashed. The clues point to some type of system/kernel activity (75% SY) and processes blocked on I/O wait (high "b" number and 25% wait on I/O).

Any help or suggestions are greatly appreciated. I had the debug log emailed to me, but I'm not smart enough to understand all of it ;-) Thanks.
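
Because the captured top and vmstat output carried no timestamps, one option is to stamp each line as it arrives on the laptop side. The following is only a minimal Python sketch, assuming the telnet session output can be piped or logged through a script on the laptop; the names stamp_lines and main are hypothetical and not part of any router tool.

```python
import sys
from datetime import datetime


def stamp_lines(lines, now=datetime.now):
    """Yield each input line prefixed with the local time it was read."""
    for line in lines:
        yield f"{now():%Y-%m-%d %H:%M:%S} {line.rstrip()}"


def main():
    # Read captured output (for example, from `vmstat 1`) line by line
    # from stdin and print it with a timestamp in front.
    for stamped in stamp_lines(sys.stdin):
        print(stamped, flush=True)
```

Calling main() at the end of the script and feeding it the session output would leave a record in which the last stamped line shows roughly when the router stopped responding.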

  • rickyl
    Feb 13, 2018

    Thanks for your research and help.  I was thinking the same - that it's the internal storage that's failing.  It is now occurring at least once a day, randomly.  I wonder if there is a way to run something like a chkdsk on the internal storage device?  I'd hate to have to trash the router for something that should be fixable.

3 Replies

  • Found some more clues, this time in the mtd17 log file:

    <3>[598396.820000] SQUASHFS error: Unable to read fragment cache entry [edcf50]
    <3>[598396.820000] SQUASHFS error: Unable to read page, block edcf50, size c43c
    <3>[598396.820000] SQUASHFS error: Unable to read fragment cache entry [edcf50]
    <3>[598396.820000] SQUASHFS error: Unable to read page, block edcf50, size c43c
    <3>[598396.820000] SQUASHFS error: Unable to read fragment cache entry [edcf50]
    <3>[598396.820000] SQUASHFS error: Unable to read page, block edcf50, size c43c
    <0>[598396.820000] Kernel panic - not syncing: Attempted to kill init!
    <4>[598396.820000] [<c00552a0>] (unwind_backtrace+0x0/0xe4) from [<c03ac418>] (panic+0x68/0x194)
    <4>[598396.830000] [<c03ac418>] (panic+0x68/0x194) from [<c007294c>] (do_exit+0x68/0x608)
    <4>[598396.840000] [<c007294c>] (do_exit+0x68/0x608) from [<c00731b0>] (do_group_exit+0x90/0xc0)
    <4>[598396.850000] [<c00731b0>] (do_group_exit+0x90/0xc0) from [<c007cfe4>] (get_signal_to_deliver+0x334/0x36c)
    <4>[598396.860000] [<c007cfe4>] (get_signal_to_deliver+0x334/0x36c) from [<c0051320>] (do_signal+0x50/0x5d4)
    <4>[598396.870000] [<c0051320>] (do_signal+0x50/0x5d4) from [<c0051da0>] (do_notify_resume+0x18/0x38)
    <4>[598396.870000] [<c0051da0>] (do_notify_resume+0x18/0x38) from [<c004eaf8>] (work_pending+0x24/0x28)
    <2>[598396.880000] CPU0: stopping
    <4>[598396.880000] [<c00552a0>] (unwind_backtrace+0x0/0xe4) from [<c004e2f0>] (do_IPI+0xfc/0x180)
    <4>[598396.880000] [<c004e2f0>] (do_IPI+0xfc/0x180) from [<c0487ca8>] (__irq_svc+0x48/0xe8)
    <4>[598396.880000] Exception stack(0xc04c9f78 to 0xc04c9fc0)
    <4>[598396.880000] 9f60:                                                       00000000 d794f000
    <4>[598396.880000] 9f80: c04c9fc0 00000000 c04c8000 c04d4bc8 c04f50a8 c04d4bc0 000260e0 413fc090
    <4>[598396.880000] 9fa0: 0000001f 00000000 c0548cb8 c04c9fc0 c004fbb0 c004fbb4 60000013 ffffffff
    <4>[598396.880000] [<c0487ca8>] (__irq_svc+0x48/0xe8) from [<c004fbb4>] (default_idle+0x24/0x28)
    <4>[598396.880000] [<c004fbb4>] (default_idle+0x24/0x28) from [<c004fd1c>] (cpu_idle+0x40/0x94)
    <4>[598396.880000] [<c004fd1c>] (cpu_idle+0x40/0x94) from [<c0008c64>] (start_kernel+0x320/0x37c)
    <4>[598396.880000] [<c0008c64>] (start_kernel+0x320/0x37c) from [<00008084>] (0x8084)
    <4>[598396.880000] NVRAM LOG 16384 23487240 23503624
    FS error: squashfs_read_data failed to read block 0xa96a94
    <3>[598396.570000] SQUASHFS error: Unable to read data cache entry [a96a94]

     

    The same log also includes this line:

    <3>[598396.620000] SQUASHFS error: xz_dec_run error, data probably corrupt

     

    So all the clues point to data corruption.  Question though, where?  How do I determine if it is on my USB attached drive or on some internal storage in the R6400 itself? 

     

    # df -h
    Filesystem                Size      Used Available Use% Mounted on
    /dev/mtdblock3           27.9M     27.9M         0 100% /
    devtmpfs                124.3M         0    124.3M   0% /dev
    devfs                   124.3M         0    124.3M   0% /dev
    /dev/mtdblock19           5.0M    396.0k      4.6M   8% /tmp/openvpn
    /dev/mtdblock18          57.0M      2.9M     54.1M   5% /tmp/media/nand
    /dev/sda2                 5.5T    555.3G      4.9T  10% /tmp/mnt/usb0/part2
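
A log like the one above can be condensed by counting which SquashFS block offsets failed to read and noting when the panic was logged. This is a rough Python sketch to run on a saved copy of the log, not on the router; the function name summarize and both regular expressions are assumptions based only on the line formats visible in this excerpt.

```python
import re
from collections import Counter

# Matches kernel log lines such as:
#   <3>[598396.820000] SQUASHFS error: Unable to read page, block edcf50, size c43c
LINE_RE = re.compile(r"<(\d)>\[\s*(\d+\.\d+)\]\s+(.*)")
# Picks the hex offset after "block " or inside "[...]" in a SQUASHFS error.
BLOCK_RE = re.compile(r"SQUASHFS error: .*?(?:block |\[)([0-9a-f]+)")


def summarize(log_text):
    """Count SQUASHFS errors per block offset and find the first panic time."""
    blocks = Counter()
    panic_at = None
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip wrapped or truncated lines
        _level, stamp, msg = m.groups()
        b = BLOCK_RE.search(msg)
        if b:
            blocks[b.group(1)] += 1
        elif "Kernel panic" in msg and panic_at is None:
            panic_at = float(stamp)
    return blocks, panic_at


if __name__ == "__main__":
    excerpt = (
        "<3>[598396.820000] SQUASHFS error: Unable to read page, block edcf50, size c43c\n"
        "<0>[598396.820000] Kernel panic - not syncing: Attempted to kill init!\n"
    )
    print(summarize(excerpt))
```

If the same few offsets recur across several crashes, that would be consistent with a fixed bad region on one device rather than a random fault, although the sketch itself cannot say which device that is.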

    • antinode
      Guru

      > <3>[598396.820000] SQUASHFS error: Unable to read fragment cache entry
      > [edcf50]

         I know nothing, but a quick Web search suggests that "Squashfs is a
      compressed read-only file system for Linux.", which suggests that you're
      looking at router firmware, not data on your external USB device(s).

            https://en.wikipedia.org/wiki/SquashFS

      > <3>[598396.620000] SQUASHFS error: xz_dec_run error, data probably
      > corrupt

         "xz" sounds like (some form of) LZMA compression.  A "mount" command
      should tell you which file system is used on each mount point ("type
      XXXX").  Might be informative.
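
The "type XXXX" check suggested above can be sketched in Python for a saved copy of the router's mount output. This assumes the usual Linux/BusyBox line format "<device> on <mount point> type <fstype> (<options>)"; the router's actual mount output is not shown in this thread, and the function names parse_mount and squashfs_devices are hypothetical.

```python
import re

# Format printed by the Linux/BusyBox `mount` command:
#   <device> on <mount point> type <fstype> (<options>)
MOUNT_RE = re.compile(r"^(\S+) on (\S+) type (\S+) \((.*)\)$")


def parse_mount(output):
    """Return {mount_point: (device, fstype)} parsed from `mount` output."""
    table = {}
    for line in output.splitlines():
        m = MOUNT_RE.match(line.strip())
        if m:
            device, mount_point, fstype, _options = m.groups()
            table[mount_point] = (device, fstype)
    return table


def squashfs_devices(table):
    """List the devices whose filesystem type is squashfs."""
    return sorted({dev for dev, fstype in table.values() if fstype == "squashfs"})
```

If squashfs_devices reported only an mtdblock device such as the /dev/mtdblock3 that df shows mounted on /, and not /dev/sda2, that would support the reading that the SQUASHFS errors come from internal flash rather than the USB drive.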
