
Forum Discussion

Roman304
Aspirant
May 27, 2021
Solved

Checksum errors in files on RAID5 Netgear ReadyNAS Pro 6 RNDP6000

Netgear ReadyNAS Pro 6 RNDP6000

RAID-5 (6 × 2TB disks)

FW 6.10.2

 

I got a checksum error message in my backup software.
I checked the disks and found many bad sectors on one of them (using the HDAT2 software).
The NETGEAR GUI showed no errors.
I removed the bad disk.
With the volume degraded (one disk missing), I created a random file via root ssh:

dd if=/dev/urandom of=Test.flie bs=64M count=32
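
(Aside: reads from /dev/urandom are capped below this 64M block size, which is why dd reports a "partial read" warning later in this thread; iflag=fullblock makes the byte count exact:)

dd if=/dev/urandom of=Test.flie bs=64M count=32 iflag=fullblock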

I checked the file immediately:

md5sum Test.flie
594eacb844ae053ab8bccadb9f3e43b4 Test.flie

I waited 10 minutes and checked again:

md5sum Test.flie
522c8afffd428e14b425d31d8b5d7f52 Test.flie

The checksums are not equal.
What happened?

 



20 Replies

  • Did you have bit-rot protection enabled or disabled on the share?

     

    The checksum of the test file changing would seem to suggest bad RAM (memory) or a loose RAM module as a possibility. Have you run the memory test boot menu option?

    • Roman304
      Aspirant

      Bit-rot Protection was disabled for all shares. The RAM was tested with MemTest86: no errors (7 passes).
      I created the file under the ReadyNAS OS directly on the mountpoint /RAID-5.

      • StephenB
        Guru - Experienced User

        Roman304 wrote:

        Bit-rot Protection was disabled for all shares.
        I created the file under the ReadyNAS OS directly on the mountpoint /RAID-5.


        I understood that from your commands.

         

        Not sure what happened; md5sum should of course have generated the same result the second time. The volume is degraded, though, so any further disk errors can't be corrected by RAID parity.

         

        So a read error or file system error could account for it, which is why I suggested looking in the log zip for any errors that occurred while the commands were being executed. You could substitute journalctl if you prefer to use ssh.
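
        For example, a minimal sketch over ssh (the grep filter is just a suggestion):

        # kernel messages from the last 15 minutes, filtered for md/BTRFS trouble
        journalctl -k --since "15 min ago" | grep -iE 'btrfs|raid|csum|error'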

         

        You might also try rebooting the NAS and see if you can reproduce the error.

  • StephenB
    Guru - Experienced User

    Roman304 wrote:

    I checked the file immediately:

    md5sum Test.flie
    594eacb844ae053ab8bccadb9f3e43b4 Test.flie

    I waited 10 minutes and checked again:

    md5sum Test.flie
    522c8afffd428e14b425d31d8b5d7f52 Test.flie

    The checksums are not equal.
    What happened?

     


    Sounds like you might have more than one bad disk. I hope you have a backup.

     

    I suggest downloading the full log zip file, and then looking for errors around the time you created and checksummed the test file.
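
    For instance, once the zip is extracted (log file names per the list later in this thread; the archive name here is hypothetical):

    unzip system-logs.zip   # hypothetical archive name
    grep -nE 'csum failed|segfault|I/O error' kernel.log system.log system-journal.log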

     

     

     

  • On boot, dmesg shows no errors for the RAID-5 array /dev/md127:

    root@NAS-2:~# dmesg | grep md0
    [   31.068901] md: md0 stopped.
    [   31.075222] md/raid1:md0: active with 6 out of 6 mirrors
    [   31.075307] md0: detected capacity change from 0 to 4290772992
    [   31.602526] BTRFS: device label 33ea55f9:root devid 1 transid 467355 /dev/md0
    [   31.603155] BTRFS info (device md0): has skinny extents
    [   33.369153] BTRFS warning (device md0): csum failed ino 117932 off 2420736 csum 3710567192 expected csum 4039208015
    [   33.378234] BTRFS warning (device md0): csum failed ino 117932 off 3203072 csum 2302637777 expected csum 2765412742
    [   39.711268] BTRFS warning (device md0): csum failed ino 26800 off 2105344 csum 3723732640 expected csum 4129019946
    root@NAS-2:~# dmesg | grep md1
    [   31.100266] md: md1 stopped.
    [   31.108850] md/raid10:md1: active with 6 out of 6 devices
    [   31.108933] md1: detected capacity change from 0 to 1604321280
    [   34.218589] md: md127 stopped.
    [   34.246979] md/raid:md127: device sda3 operational as raid disk 0
    [   34.246985] md/raid:md127: device sdf3 operational as raid disk 5
    [   34.246988] md/raid:md127: device sde3 operational as raid disk 4
    [   34.246990] md/raid:md127: device sdd3 operational as raid disk 3
    [   34.246993] md/raid:md127: device sdc3 operational as raid disk 2
    [   34.246996] md/raid:md127: device sdb3 operational as raid disk 1
    [   34.247777] md/raid:md127: allocated 6474kB
    [   34.247926] md/raid:md127: raid level 5 active with 6 out of 6 devices, algorithm 2
    [   34.248112] md127: detected capacity change from 0 to 9977158696960
    [   34.658138] Adding 1566716k swap on /dev/md1.  Priority:-1 extents:1 across:1566716k
    [   34.670593] BTRFS: device label 33ea55f9:RAID-5 devid 1 transid 27093 /dev/md127
    [   34.980123] BTRFS info (device md127): has skinny extents
    root@NAS-2:~# dmesg | grep md127
    [   34.218589] md: md127 stopped.
    [   34.246979] md/raid:md127: device sda3 operational as raid disk 0
    [   34.246985] md/raid:md127: device sdf3 operational as raid disk 5
    [   34.246988] md/raid:md127: device sde3 operational as raid disk 4
    [   34.246990] md/raid:md127: device sdd3 operational as raid disk 3
    [   34.246993] md/raid:md127: device sdc3 operational as raid disk 2
    [   34.246996] md/raid:md127: device sdb3 operational as raid disk 1
    [   34.247777] md/raid:md127: allocated 6474kB
    [   34.247926] md/raid:md127: raid level 5 active with 6 out of 6 devices, algorithm 2
    [   34.248112] md127: detected capacity change from 0 to 9977158696960
    [   34.670593] BTRFS: device label 33ea55f9:RAID-5 devid 1 transid 27093 /dev/md127
    [   34.980123] BTRFS info (device md127): has skinny extents

    Which log files do you recommend watching?

    There are a lot of them in the zip.

     

    The problem was not solved by the reboot:

    root@NAS-2:/RAID-5/TEST-FILE# dd if=/dev/urandom of=Test.flie bs=64M count=32
    dd: warning: partial read (33554431 bytes); suggest iflag=fullblock
    0+32 records in
    0+32 records out
    1073741792 bytes (1.1 GB, 1.0 GiB) copied, 104.607 s, 10.3 MB/s
    root@NAS-2:/RAID-5/TEST-FILE# md5sum Test.flie
    5c07ebd42dc2af232c0431d3b86cab7d  Test.flie
    root@NAS-2:/RAID-5/TEST-FILE# md5sum Test.flie
    5c07ebd42dc2af232c0431d3b86cab7d  Test.flie
    root@NAS-2:/RAID-5/TEST-FILE# md5sum Test.flie
    433c2099b00285f11ed88d0c7d580f32  Test.flie
    root@NAS-2:/RAID-5/TEST-FILE# md5sum Test.flie
    38b511f0129c3ac888f19a430b3938c8  Test.flie
    root@NAS-2:/RAID-5/TEST-FILE# md5sum Test.flie
    38b511f0129c3ac888f19a430b3938c8  Test.flie
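
    (One way to narrow this down, assuming root ssh: drop the page cache between reads so each md5sum re-reads the disks. If the hash is stable while cached but changes after a cache drop, the corruption is on the read path rather than in the cached copy:)

    sync
    echo 3 > /proc/sys/vm/drop_caches   # flush the page cache; the next read must hit the disks
    md5sum Test.flie
    echo 3 > /proc/sys/vm/drop_caches
    md5sum Test.flie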

    No errors appear in the GUI after the rebuild.

    /dev/md127:
               Version : 1.2
         Creation Time : Mon Mar 16 21:27:21 2020
            Raid Level : raid5
            Array Size : 9743319040 (9291.95 GiB 9977.16 GB)
         Used Dev Size : 1948663808 (1858.39 GiB 1995.43 GB)
          Raid Devices : 6
         Total Devices : 6
           Persistence : Superblock is persistent
    
           Update Time : Fri May 28 17:29:02 2021
                 State : clean
        Active Devices : 6
       Working Devices : 6
        Failed Devices : 0
         Spare Devices : 0
    
                Layout : left-symmetric
            Chunk Size : 64K
    
    Consistency Policy : unknown
    
                  Name : 33ea55f9:RAID-5-0  (local to host 33ea55f9)
                  UUID : 04d214c4:ee331e6a:74ca0a04:5e846481
                Events : 977
    
        Number   Major   Minor   RaidDevice State
           6       8        3        0      active sync   /dev/sda3
           1       8       19        1      active sync   /dev/sdb3
           2       8       35        2      active sync   /dev/sdc3
           3       8       51        3      active sync   /dev/sdd3
           4       8       67        4      active sync   /dev/sde3
           5       8       83        5      active sync   /dev/sdf3
    cat /sys/block/md127/md/mismatch_cnt 
    0

    What other tests can I do?
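
    (Two standard candidates, assuming the btrfs-progs and smartmontools binaries present in ReadyNAS OS 6:)

    btrfs scrub start -Bd /RAID-5   # re-verifies every data and metadata checksum on the volume
    smartctl -a /dev/sda | grep -iE 'reallocated|pending|uncorrectable'   # repeat for sdb..sdf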

     

    • StephenB
      Guru - Experienced User

      Roman304 wrote:

      Which log files do you recommend watching?

      system.log, kernel.log, and system-journal.log

      • Roman304
        Aspirant

        StephenB wrote:

        Roman304 wrote:

        Which log files do you recommend watching?

        system.log, kernel.log, and system-journal.log


        In kernel.log:

         

        May 30 14:35:29 NAS-2 kernel: mdcsrepair[13627]: segfault at 2325238 ip 00000000004048df sp 00007fff08ffa500 error 4 in mdcsrepair[400000+10000]
        May 30 14:35:29 NAS-2 kernel: sh (13626): drop_caches: 3
        May 30 14:35:29 NAS-2 kernel: sh (13625): drop_caches: 3
        May 30 14:35:29 NAS-2 kernel: mdcsrepair[13639]: segfault at 1cd6230 ip 00000000004048df sp 00007ffeba060c50 error 4 in mdcsrepair[400000+10000]
        May 30 14:35:29 NAS-2 kernel: mdcsrepair[13640]: segfault at bfe230 ip 00000000004048df sp 00007ffe6f442530 error 4 in mdcsrepair[400000+10000]
        May 30 14:35:30 NAS-2 kernel: sh (13647): drop_caches: 3
        May 30 14:35:30 NAS-2 kernel: sh (13648): drop_caches: 3
        May 30 17:47:16 NAS-2 kernel: BTRFS warning (device md0): csum failed ino 26800 off 6852608 csum 846329252 expected csum 429493038
        May 30 17:47:16 NAS-2 kernel: sh (9521): drop_caches: 3
        May 30 17:47:17 NAS-2 kernel: sh (9522): drop_caches: 3
        May 30 20:41:32 NAS-2 kernel: BTRFS warning (device md0): csum failed ino 26800 off 2220032 csum 4138936066 expected csum 3680012373
        May 30 20:41:32 NAS-2 kernel: BTRFS warning (device md0): csum failed ino 26800 off 5521408 csum 988873707 expected csum 286883169
        May 30 20:41:32 NAS-2 kernel: BTRFS warning (device md0): csum failed ino 26800 off 5603328 csum 2031798327 expected csum 1425106784
        May 30 20:41:32 NAS-2 kernel: sh (2563): drop_caches: 3
        May 30 20:41:32 NAS-2 kernel: BTRFS warning (device md0): csum failed ino 26800 off 6266880 csum 1389706193 expected csum 2134807686
        May 30 20:41:32 NAS-2 kernel: BTRFS warning (device md0): csum failed ino 26800 off 6754304 csum 1977957496 expected csum 1477350191
        May 30 20:41:32 NAS-2 kernel: BTRFS warning (device md0): csum failed ino 26800 off 6754304 csum 1977957496 expected csum 1477350191
        May 30 20:41:32 NAS-2 kernel: sh (2567): drop_caches: 3
        May 30 20:41:32 NAS-2 kernel: sh (2569): drop_caches: 3
        May 30 20:41:32 NAS-2 kernel: mdcsrepair[2571]: segfault at 1fd5230 ip 00000000004048df sp 00007ffe294f0c30 error 4 in mdcsrepair[400000+10000]

        What does the mdcsrepair segfault mean?

         

        I also found problems in the logs for the root device /dev/md0.

        In system-journal.log:

         

        May 30 20:41:32 NAS-2 kernel: BTRFS warning (device md0): csum failed ino 26800 off 2220032 csum 4138936066 expected csum 3680012373
        May 30 20:41:32 NAS-2 mdcsrepaird[2862]: mdcsrepaird: mdcsrepair /dev/md0 622764032 4096 fdd84c09 aa77a724 //var/readynasd/db.sq3
        May 30 20:41:32 NAS-2 kernel: BTRFS warning (device md0): csum failed ino 26800 off 5521408 csum 988873707 expected csum 286883169
        May 30 20:41:32 NAS-2 kernel: BTRFS warning (device md0): csum failed ino 26800 off 5603328 csum 2031798327 expected csum 1425106784
        May 30 20:41:32 NAS-2 kernel: sh (2563): drop_caches: 3
        May 30 20:41:32 NAS-2 kernel: BTRFS warning (device md0): csum failed ino 26800 off 6266880 csum 1389706193 expected csum 2134807686
        May 30 20:41:32 NAS-2 kernel: BTRFS warning (device md0): csum failed ino 26800 off 6754304 csum 1977957496 expected csum 1477350191
        May 30 20:41:32 NAS-2 kernel: BTRFS warning (device md0): csum failed ino 26800 off 6754304 csum 1977957496 expected csum 1477350191
        May 30 20:41:32 NAS-2 kernel: sh (2567): drop_caches: 3
        May 30 20:41:32 NAS-2 mdcsrepaird[2862]: mdcsrepair: repairing /dev/md0 @ 622764032 [//var/readynasd/db.sq3] succeeded.
        May 30 20:41:32 NAS-2 kernel: sh (2569): drop_caches: 3
        May 30 20:41:32 NAS-2 kernel: mdcsrepair[2571]: segfault at 1fd5230 ip 00000000004048df sp 00007ffe294f0c30 error 4 in mdcsrepair[400000+10000]
        May 30 20:41:32 NAS-2 mdcsrepaird[2862]: mdcsrepaird: mdcsrepair /dev/md0 626065408 4096 14fc0ec5 9e82e6ee //var/readynasd/db.sq3
        May 30 20:41:32 NAS-2 mdcsrepaird[2862]: mdcsrepaird: mdcsrepair /dev/md0 626147328 4096 c837e586 9f980eab //var/readynasd/db.sq3
        May 30 20:41:32 NAS-2 mdcsrepaird[2862]: mdcsrepaird: mdcsrepair /dev/md0 626810880 4096 2ec42aad 796bc180 //var/readynasd/db.sq3
        May 30 20:41:32 NAS-2 mdcsrepaird[2862]: mdcsrepaird: mdcsrepair /dev/md0 627298304 4096 87c31a8a d06cf1a7 //var/readynasd/db.sq3
        May 30 20:41:33 NAS-2 kernel: sh (2616): drop_caches: 3
        May 30 20:41:33 NAS-2 kernel: sh (2617): drop_caches: 3
        May 30 20:41:33 NAS-2 mdcsrepaird[2862]: mdcsrepair: designated data mismatches bad checksum /dev/md0 @ 626065408
        May 30 20:41:33 NAS-2 kernel: sh (2620): drop_caches: 3
        May 30 20:41:33 NAS-2 kernel: sh (2619): drop_caches: 3
        May 30 20:41:33 NAS-2 mdcsrepaird[2862]: mdcsrepair: designated data mismatches bad checksum /dev/md0 @ 627298304
        May 30 20:41:33 NAS-2 kernel: sh (2618): drop_caches: 3
        May 30 20:41:33 NAS-2 kernel: sh (2622): drop_caches: 3
        May 30 20:41:33 NAS-2 mdcsrepaird[2862]: mdcsrepair: designated data mismatches bad checksum /dev/md0 @ 626147328
        May 30 20:41:33 NAS-2 kernel: sh (2621): drop_caches: 3
        May 30 20:41:33 NAS-2 kernel: sh (2623): drop_caches: 3
        May 30 20:41:33 NAS-2 mdcsrepaird[2862]: mdcsrepair: designated data mismatches bad checksum /dev/md0 @ 626810880

        Can we conclude that my device is completely defective and that the operating system is not working correctly? What can I do about it?

  • I tried to trigger an md consistency repair via sysfs:

     

    echo repair > /sys/block/md127/md/sync_action
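
    (A repair like this can be monitored while it runs; a minimal sketch:)

    watch -n 5 cat /proc/mdstat            # resync progress
    cat /sys/block/md127/md/mismatch_cnt   # meaningful once the resync finishes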

     

    [ 9012.910037] md: requested-resync of RAID array md127
    [ 9012.910043] md: minimum _guaranteed_  speed: 30000 KB/sec/disk.
    [ 9012.910045] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for requested-resync.
    [ 9012.910053] md: using 128k window, over a total of 1948663808k.
    [ 9084.033911] BTRFS warning (device md0): csum failed ino 26800 off 3297280 csum 2509189606 expected csum 3194441580
    [ 9084.454904] BTRFS warning (device md0): csum failed ino 26800 off 4055040 csum 1637586726 expected csum 1282422897
    [ 9084.581585] BTRFS warning (device md0): csum failed ino 26800 off 6422528 csum 1593964658 expected csum 1961461496
    [ 9084.612492] sh (25655): drop_caches: 3
    [ 9084.614361] BTRFS warning (device md0): csum failed ino 26800 off 5677056 csum 1932001198 expected csum 1489994020
    [ 9084.615384] BTRFS warning (device md0): csum failed ino 26800 off 5677056 csum 1932001198 expected csum 1489994020
    [ 9084.622754] BTRFS warning (device md0): csum failed ino 26800 off 7094272 csum 3606228094 expected csum 4246393588
    [ 9084.695542] sh (25658): drop_caches: 3
    [ 9085.341325] sh (25699): drop_caches: 3
    [ 9085.424200] sh (25700): drop_caches: 3
    [ 9085.607585] sh (25701): drop_caches: 3
    [ 9085.704849] sh (25702): drop_caches: 3
    [ 9085.731888] sh (25704): drop_caches: 3
    [ 9085.732521] mdcsrepair[25705]: segfault at 1902230 ip 00000000004048df sp 00007ffea4e019b0 error 4 in mdcsrepair[400000+10000]
    [ 9087.009513] sh (25731): drop_caches: 3
    [ 9087.110047] sh (25732): drop_caches: 3
    [ 9087.557204] sh (25762): drop_caches: 3
    [ 9087.638523] sh (25766): drop_caches: 3
    [ 9101.700580] BTRFS warning (device md0): csum failed ino 26800 off 3297280 csum 2509189606 expected csum 3194441580
    [ 9102.140147] BTRFS warning (device md0): csum failed ino 26800 off 6729728 csum 843258588 expected csum 430662742
    [ 9102.141347] BTRFS warning (device md0): csum failed ino 26800 off 4530176 csum 4014326161 expected csum 3266211526
    [ 9102.142287] BTRFS warning (device md0): csum failed ino 26800 off 4055040 csum 1637586726 expected csum 1282422897
    [ 9102.142732] BTRFS warning (device md0): csum failed ino 26800 off 4804608 csum 2561880108 expected csum 3042484091
    [ 9103.060502] BTRFS warning (device md0): csum failed ino 26800 off 2076672 csum 188520152 expected csum 651639183
    [ 9103.276951] sh (26055): drop_caches: 3
    [ 9103.277401] sh (26056): drop_caches: 3
    [ 9103.277815] sh (26058): drop_caches: 3
    [ 9103.281850] sh (26054): drop_caches: 3
    [ 9103.282294] sh (26057): drop_caches: 3
    [ 9103.434179] sh (26062): drop_caches: 3
    [ 9103.437582] sh (26060): drop_caches: 3
    [ 9103.438571] sh (26059): drop_caches: 3
    [ 9103.465113] sh (26061): drop_caches: 3
    [ 9103.467454] sh (26063): drop_caches: 3
    [ 9103.467969] sh (26064): drop_caches: 3
    [ 9103.566172] sh (26066): drop_caches: 3
    [ 9103.567000] mdcsrepair[26070]: segfault at 1ca6238 ip 00000000004048df sp 00007fffea17d9d0 error 4 in mdcsrepair[400000+10000]
    [ 9103.568900] sh (26067): drop_caches: 3
    [ 9103.568901] sh (26065): drop_caches: 3
    [ 9103.569312] mdcsrepair[26071]: segfault at 1d29220 ip 00000000004048df sp 00007ffcf1ef8270 error 4 in mdcsrepair[400000+10000]
    [ 9103.599827] sh (26069): drop_caches: 3
    [ 9103.600311] mdcsrepair[26092]: segfault at 25a7228 ip 00000000004048df sp 00007ffd96a6c460 error 4 in mdcsrepair[400000+10000]
    [ 9103.639920] BTRFS warning (device md0): csum failed ino 26800 off 7749632 csum 976488093 expected csum 400533962
    [ 9103.640132] BTRFS warning (device md0): csum failed ino 26800 off 7749632 csum 976488093 expected csum 400533962
    [ 9105.156635] sh (26145): drop_caches: 3
    [ 9105.262331] sh (26146): drop_caches: 3
    [ 9105.338203] sh (26148): drop_caches: 3
    [ 9105.338666] mdcsrepair[26149]: segfault at 14e0228 ip 00000000004048df sp 00007ffd2601bda0 error 4 in mdcsrepair[400000+10000]
    [ 9392.955849] BTRFS warning (device md0): csum failed ino 26800 off 3297280 csum 2509189606 expected csum 3194441580
    [ 9393.396182] sh (27133): drop_caches: 3
    [ 9393.407254] BTRFS warning (device md0): csum failed ino 26800 off 5730304 csum 1721535998 expected csum 1266098857
    [ 9393.555932] sh (27134): drop_caches: 3
    [ 9393.680572] BTRFS warning (device md0): csum failed ino 26800 off 4276224 csum 2257014538 expected csum 2876039261
    [ 9393.705564] sh (27136): drop_caches: 3
    [ 9393.705994] mdcsrepair[27138]: segfault at fb1238 ip 00000000004048df sp 00007fffde04bee0 error 4 in mdcsrepair[400000+10000]
    [ 9394.203447] BTRFS warning (device md0): csum failed ino 26800 off 5115904 csum 406066205 expected csum 870083223
    [ 9394.203615] BTRFS warning (device md0): csum failed ino 26800 off 5115904 csum 406066205 expected csum 870083223
    [ 9395.340527] sh (27180): drop_caches: 3
    [ 9395.465281] sh (27182): drop_caches: 3
    [ 9395.529394] sh (27184): drop_caches: 3
    [ 9395.529871] mdcsrepair[27187]: segfault at 134d230 ip 00000000004048df sp 00007ffeebc6fb40 error 4 in mdcsrepair[400000+10000]
    [ 9395.831133] sh (27222): drop_caches: 3
    [ 9395.950588] sh (27223): drop_caches: 3
    [ 9395.951581] sh (27224): drop_caches: 3
    [ 9395.980226] sh (27226): drop_caches: 3
    [ 9396.077257] sh (27227): drop_caches: 3
    [ 9396.077683] mdcsrepair[27229]: segfault at 1b93238 ip 00000000004048df sp 00007fff9c18ac80 error 4 in mdcsrepair[400000+10000]
    [11045.200001] BTRFS warning (device md0): csum failed ino 26800 off 2174976 csum 549178347 expected csum 190035297
    [11045.294967] BTRFS warning (device md0): csum failed ino 26800 off 4857856 csum 2091937243 expected csum 1364968076
    [11045.297180] BTRFS warning (device md0): csum failed ino 26800 off 5017600 csum 455988020 expected csum 918977635
    [11045.323613] BTRFS warning (device md0): csum failed ino 26800 off 6045696 csum 3782956367 expected csum 3432053272
    [11045.324193] sh (31561): drop_caches: 3
    [11045.380741] BTRFS warning (device md0): csum failed ino 26800 off 7426048 csum 3231217454 expected csum 3983792249
    [11045.540305] sh (31563): drop_caches: 3
    [11046.101205] sh (31610): drop_caches: 3
    [11046.200518] sh (31611): drop_caches: 3
    [11046.201664] sh (31612): drop_caches: 3
    [11046.212111] sh (31613): drop_caches: 3
    [11046.212113] sh (31614): drop_caches: 3
    [11046.226738] sh (31616): drop_caches: 3
    [11046.226747] sh (31615): drop_caches: 3
    [11046.238619] sh (31618): drop_caches: 3
    [11046.239139] sh (31617): drop_caches: 3
    [11046.239189] mdcsrepair[31619]: segfault at 2179228 ip 00000000004048df sp 00007ffdd7ed7ba0 error 4 in mdcsrepair[400000+10000]
    [33839.238730] md: md127: requested-resync done.

     

    Next, cat /sys/block/md127/md/mismatch_cnt reports: 135008

    But there is no error status in the GUI.

     

    /dev/md127:
               Version : 1.2
         Creation Time : Mon Mar 16 21:27:21 2020
            Raid Level : raid5
            Array Size : 9743319040 (9291.95 GiB 9977.16 GB)
         Used Dev Size : 1948663808 (1858.39 GiB 1995.43 GB)
          Raid Devices : 6
         Total Devices : 6
           Persistence : Superblock is persistent
    
           Update Time : Sun May 30 11:49:08 2021
                 State : clean
        Active Devices : 6
       Working Devices : 6
        Failed Devices : 0
         Spare Devices : 0
    
                Layout : left-symmetric
            Chunk Size : 64K
    
    Consistency Policy : unknown
    
                  Name : 33ea55f9:RAID-5-0  (local to host 33ea55f9)
                  UUID : 04d214c4:ee331e6a:74ca0a04:5e846481
                Events : 979
    
        Number   Major   Minor   RaidDevice State
           6       8        3        0      active sync   /dev/sda3
           1       8       19        1      active sync   /dev/sdb3
           2       8       35        2      active sync   /dev/sdc3
           3       8       51        3      active sync   /dev/sdd3
           4       8       67        4      active sync   /dev/sde3
           5       8       83        5      active sync   /dev/sdf3

     

    The problem still exists:

     

    root@NAS-2:/RAID-5/TEST-FILE# dd if=/dev/urandom of=Test.flie bs=64M count=32
    dd: warning: partial read (33554431 bytes); suggest iflag=fullblock
    0+32 records in
    0+32 records out
    1073741792 bytes (1.1 GB, 1.0 GiB) copied, 103.542 s, 10.4 MB/s
    root@NAS-2:/RAID-5/TEST-FILE# md5sum Test.flie
    71b8e1ea63c2d543dd1b521698f1f40b  Test.flie
    
    after 5-10 minutes
    
    root@NAS-2:/RAID-5/TEST-FILE# md5sum Test.flie
    5952e1d1c6447efbbc4e76b13f090dbd  Test.flie

     

    • DEADDEADBEEF
      Apprentice

      It seems incredibly strange that BTRFS would let a file get corrupted like this and not throw BTRFS errors all over the place. What's in the journal would not match what's on the disk!

       

      Are you absolutely sure there's not some application out there touching/modifying the files? Can you also track the modify date of the file ($ stat <file>)? Perhaps even turn on auditing? It's a pretty major thing for a file to get changed silently like that; BTRFS should detect and report corruption as soon as you try to access the file. Unless... maybe you have turned off checksumming on your data volume?
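
      (For what it's worth, a quick way to check that with standard btrfs tooling, using the test path from earlier posts:)

      mount | grep RAID-5                  # nodatasum/nodatacow in the options would disable data checksums
      lsattr /RAID-5/TEST-FILE/Test.flie   # a 'C' (No_COW) attribute also disables checksumming for that file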

      • Roman304
        Aspirant

        DEADDEADBEEF wrote:

         

        Are you absolutely sure there's not some application out there touching/modifying the files? Can you also track the modify date of the file ($ stat <file>)? Perhaps even turn on auditing?

         


        I am not sure... but all services are at their defaults and I haven't installed anything else.

        As for $ stat <file>: the file was not modified, but the checksum changed.

         

        root@HQ-NAS-2:/RAID-5/TEST-FILE# dd if=/dev/urandom of=Test.flie bs=64M count=32
        dd: warning: partial read (33554431 bytes); suggest iflag=fullblock
        0+32 records in
        0+32 records out
        1073741792 bytes (1.1 GB, 1.0 GiB) copied, 103.885 s, 10.3 MB/s
        root@HQ-NAS-2:/RAID-5/TEST-FILE# md5sum Test.flie
        0542952ac3e7e9d494a26a37c41a6c9e  Test.flie
        root@HQ-NAS-2:/RAID-5/TEST-FILE# stat Test.flie
          File: 'Test.flie'
          Size: 1073741792      Blocks: 2097160    IO Block: 4096   regular file
        Device: 35h/53d Inode: 1049        Links: 1
        Access: (0660/-rw-rw----)  Uid: (    0/    root)   Gid: (    0/    root)
        Access: 2021-05-31 10:54:16.835309473 +0300
        Modify: 2021-05-31 10:56:00.681735098 +0300
        Change: 2021-05-31 10:56:00.681735098 +0300
         Birth: -
        root@HQ-NAS-2:/RAID-5/TEST-FILE# md5sum Test.flie
        0542952ac3e7e9d494a26a37c41a6c9e  Test.flie
        root@HQ-NAS-2:/RAID-5/TEST-FILE# md5sum Test.flie
        0efe119d6aba0648ba32fc722fd72095  Test.flie
        root@HQ-NAS-2:/RAID-5/TEST-FILE# stat Test.flie
          File: 'Test.flie'
          Size: 1073741792      Blocks: 2097152    IO Block: 4096   regular file
        Device: 35h/53d Inode: 1049        Links: 1
        Access: (0660/-rw-rw----)  Uid: (    0/    root)   Gid: (    0/    root)
        Access: 2021-05-31 10:54:16.835309473 +0300
        Modify: 2021-05-31 10:56:00.681735098 +0300
        Change: 2021-05-31 10:56:00.681735098 +0300
         Birth: -
        root@HQ-NAS-2:/RAID-5/TEST-FILE# md5sum Test.flie
        d289a229916a49bede053b9cdc778ec6  Test.flie
        root@HQ-NAS-2:/RAID-5/TEST-FILE# stat Test.flie
          File: 'Test.flie'
          Size: 1073741792      Blocks: 2097152    IO Block: 4096   regular file
        Device: 35h/53d Inode: 1049        Links: 1
        Access: (0660/-rw-rw----)  Uid: (    0/    root)   Gid: (    0/    root)
        Access: 2021-05-31 10:54:16.835309473 +0300
        Modify: 2021-05-31 10:56:00.681735098 +0300
        Change: 2021-05-31 10:56:00.681735098 +0300
         Birth: -

        DEADDEADBEEF wrote:

        Unless... maybe you have turned off checksumming on your data volume?


        Where can I turn checksumming on or off for the data volume?

        Checksum was turned on; this box is the only such setting I found:

         

        [Screenshot: the Checksum setting]

         

  • StephenB DEADDEADBEEF 

     

    I have partitioned the drives into separate single-disk JBOD RAID volumes, one per drive. RAID number 4 gets the checksum error, but there are no errors in dmesg.

     

    Every 2.0s: cat /proc/mdstat                                                             Fri Jun  4 11:30:16 2021
    
    Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
    md122 : active raid1 sdf3[0]
          1948663808 blocks super 1.2 [1/1] [U]
    
    md123 : active raid1 sde3[0]
          1948663808 blocks super 1.2 [1/1] [U]
    
    md124 : active raid1 sdd3[0]
          1948663808 blocks super 1.2 [1/1] [U]
    
    md125 : active raid1 sdc3[0]
          1948663808 blocks super 1.2 [1/1] [U]
    
    md126 : active raid1 sdb3[0]
          1948663808 blocks super 1.2 [1/1] [U]
    
    md127 : active raid1 sda3[0]
          1948663808 blocks super 1.2 [1/1] [U]
    
    md1 : active raid10 sda2[0] sdf2[5] sde2[4] sdd2[3] sdc2[2] sdb2[1]
          1566720 blocks super 1.2 512K chunks 2 near-copies [6/6] [UUUUUU]
    
    md0 : active raid1 sdf1[6] sdb1[2](S) sdc1[3](S) sdd1[4](S) sde1[5](S) sda1[1]
          4190208 blocks super 1.2 [2/2] [UU]
    
    unused devices: <none>
    
    

    The error shows on share 4, (R4)JBODshares.
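
    (For reference, a sketch of the per-volume isolation test, assuming the six JBOD volumes are mounted at hypothetical paths /R1 .. /R6:)

    for v in /R1 /R2 /R3 /R4 /R5 /R6; do
        dd if=/dev/urandom of="$v/Test.flie" bs=64M count=32 iflag=fullblock 2>/dev/null
        sync; echo 3 > /proc/sys/vm/drop_caches
        a=$(md5sum "$v/Test.flie" | cut -d' ' -f1)
        sleep 600                           # give the corruption time to appear
        echo 3 > /proc/sys/vm/drop_caches
        b=$(md5sum "$v/Test.flie" | cut -d' ' -f1)
        [ "$a" = "$b" ] && echo "$v OK" || echo "$v CHECKSUM CHANGED"
    done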

     

    • StephenB
      Guru - Experienced User

      Roman304 wrote:

      I have partitioned the drives into separate single-disk JBOD RAID volumes, one per drive. RAID number 4 gets the checksum error.

      That's a clear indication that your issue is linked either to that disk or to that slot. The next step is to figure out which.

       

      I suggest destroying RAID 1, 2, 5, and 6, and removing those disks. Then power down the NAS and swap RAID 3 and RAID 4. Power up and re-run the test on both volumes. That will tell you whether the problem follows the disk or the slot.

       

      If the problem disappears on both disks, then it could be power-related. You can confirm that by adding the removed disks back one at a time and seeing when the problem starts happening again.
