Zero Bytes after a few seconds

Guru - Experienced User

Feb 09, 2022

pdaly12 wrote:

Is there any particular log that I should concentrate on, e.g. system.log? (system.log only has today's events.)

system.log, kernel.log, systemd-journal.log

If you have SSH enabled, you can log into the NAS and run smartctl -x /dev/sdxx (where sdxx is the device name of the disk you want to check, e.g. /dev/sda).

That will show the disk's error log (including UNCs that don't seem to show up other places). https://community.netgear.com/t5/Using-your-ReadyNAS-in-Business/Disk-test-running-10-days-already/td-p/1786277

pdaly12

Aspirant

Feb 09, 2022

If I'm right, based on the SMART data, it seems disks 3 & 4 need to be replaced as they're showing high error rates, drive 4 being the worst (I can't bear to look at the raw value for Raw_Read_Error_Rate on drive 4).

I think disk 2 needs to be replaced as it's suffering from Offline_Uncorrectable issues, which suggests it's going to fail soon.

I also discovered that drive 4 is not a NAS drive; it's a normal PC drive, so I've no idea who put that in there!!! humph!

Disk 1
No Errors Logged

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    1
  3 Spin_Up_Time            POS--K   170   165   021    -    4458
  4 Start_Stop_Count        -O--CK   092   092   000    -    8389
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
  9 Power_On_Hours          -O--CK   001   001   000    -    89392
 10 Spin_Retry_Count        -O--CK   100   100   000    -    0
 11 Calibration_Retry_Count -O--CK   100   100   000    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    299
192 Power-Off_Retract_Count -O--CK   200   200   000    -    53
193 Load_Cycle_Count        -O--CK   198   198   000    -    8335
194 Temperature_Celsius     -O---K   106   098   000    -    41
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    0
198 Offline_Uncorrectable   ----CK   200   200   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    1
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    32
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

Disk 2
No Errors Logged

ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    1
  3 Spin_Up_Time            POS--K   172   169   021    -    4358
  4 Start_Stop_Count        -O--CK   092   092   000    -    8118
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
  9 Power_On_Hours          -O--CK   009   009   000    -    66510
 10 Spin_Retry_Count        -O--CK   100   100   000    -    0
 11 Calibration_Retry_Count -O--CK   100   100   000    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    135
192 Power-Off_Retract_Count -O--CK   200   200   000    -    38
193 Load_Cycle_Count        -O--CK   198   198   000    -    8079
194 Temperature_Celsius     -O---K   101   094   000    -    46
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    25
198 Offline_Uncorrectable   ----CK   200   200   000    -    24
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    1
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    33
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

Disk 3
Error count 3

ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    3229
  3 Spin_Up_Time            POS--K   171   168   021    -    4441
  4 Start_Stop_Count        -O--CK   092   092   000    -    8333
  5 Reallocated_Sector_Ct   PO--CK   198   198   140    -    60
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
  9 Power_On_Hours          -O--CK   001   001   000    -    88526
 10 Spin_Retry_Count        -O--CK   100   100   000    -    0
 11 Calibration_Retry_Count -O--CK   100   100   000    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    289
192 Power-Off_Retract_Count -O--CK   200   200   000    -    53
193 Load_Cycle_Count        -O--CK   198   198   000    -    8279
194 Temperature_Celsius     -O---K   102   097   000    -    45
196 Reallocated_Event_Count -O--CK   145   145   000    -    55
197 Current_Pending_Sector  -O--CK   200   200   000    -    0
198 Offline_Uncorrectable   ----CK   200   200   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    0

All Disk 3 errors are similar to:

40 -- 51 05 40 00 00 3f 3b 24 58 40 00  Error: UNC at LBA = 0x3f3b2458 = 1060840536
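
For anyone who wants to check a line like this by hand, here is a minimal Python sketch that pulls the LBA and the error flag out of the register row. It assumes the column layout smartctl prints above these rows (ER, a feature byte, ST, two COUNT bytes, six LBA bytes, DV, DC); the function name and the small bit table are my own, and only a few common error-register bits are listed.

```python
# Common ATA error register bits (not the full set).
ERROR_BITS = {0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}

def decode_error_line(line):
    """Decode one register row from the smartctl extended error log."""
    fields = line.split()[:13]
    # Assumed layout: ER, feature, ST, COUNT (2 bytes), LBA (6 bytes), DV, DC
    error_reg = int(fields[0], 16)
    status_reg = int(fields[2], 16)
    # The six LBA bytes are printed most significant first.
    lba = int("".join(fields[5:11]), 16)
    flags = [name for bit, name in ERROR_BITS.items() if error_reg & bit]
    return {"error_flags": flags, "status": status_reg, "lba": lba}

row = "40 -- 51 05 40 00 00 3f 3b 24 58 40 00  Error: UNC at LBA = 0x3f3b2458 = 1060840536"
info = decode_error_line(row)
print(info["error_flags"], info["lba"])  # ['UNC'] 1060840536
```

The decoded LBA matches the decimal value smartctl prints at the end of the line, which is a quick sanity check that the columns are being read in the right order.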

Disk 4
Device error count: 56

ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     PO-R--   099   099   016    -    131073
  2 Throughput_Performance  P-S---   137   137   054    -    91
  3 Spin_Up_Time            POS---   119   119   024    -    318 (Average 319)
  4 Start_Stop_Count        -O--C-   099   099   000    -    6582
  5 Reallocated_Sector_Ct   PO--CK   100   100   005    -    11
  7 Seek_Error_Rate         PO-R--   100   100   067    -    0
  8 Seek_Time_Performance   P-S---   132   132   020    -    34
  9 Power_On_Hours          -O--C-   092   092   000    -    62247
 10 Spin_Retry_Count        PO--C-   100   100   060    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    1127
192 Power-Off_Retract_Count -O--CK   095   095   000    -    6831
193 Load_Cycle_Count        -O--C-   095   095   000    -    6831
194 Temperature_Celsius     -O----   157   157   000    -    38 (Min/Max 7/49)
196 Reallocated_Event_Count -O--CK   100   100   000    -    12
197 Current_Pending_Sector  -O---K   100   100   000    -    0
198 Offline_Uncorrectable   ---R--   100   100   000    -    0
199 UDMA_CRC_Error_Count    -O-R--   200   200   000    -    54

Again, typical errors recorded are:

  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 05 40 00 00 3f 3b 24 58 40 00  Error: UNC at LBA = 0x3f3b2458 = 1060840536
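
A rough way to screen attribute tables like the ones above is to look only at the attributes whose raw value is a plain sector or event count. The sketch below is an illustration, not a health verdict: the watched list and the function name are my own choices, and Raw_Read_Error_Rate is left out on purpose because its raw encoding differs between vendors.

```python
# Attributes whose raw value is a simple count; non-zero is worth a closer look.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Reallocated_Event_Count",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}

def flag_attributes(table_text):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    flagged = {}
    for line in table_text.splitlines():
        parts = line.split()
        # Attribute rows: ID, NAME, FLAGS, VALUE, WORST, THRESH, FAIL, RAW...
        if len(parts) < 8 or not parts[0].isdigit():
            continue
        name, raw = parts[1], parts[7]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            flagged[name] = int(raw)
    return flagged

# A few rows copied from the Disk 2 table in this thread.
disk2_rows = """\
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    25
198 Offline_Uncorrectable   ----CK   200   200   000    -    24
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    1
"""
print(flag_attributes(disk2_rows))
# {'Current_Pending_Sector': 25, 'Offline_Uncorrectable': 24}
```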

StephenB
Guru - Experienced User
Feb 09, 2022
pdaly12 wrote:

  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 05 40 00 00 3f 3b 24 58 40 00  Error: UNC at LBA = 0x3f3b2458 = 1060840536

The times can be very useful here (though you have to translate them from power-on hours to actual time).
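
As a rough illustration of that translation, the Python sketch below subtracts the gap between the disk's current Power_On_Hours and the logged error's power-on hours from the current time. It assumes the NAS has been powered on continuously since the error; any downtime pushes the real date earlier. The function name is mine, and the error timestamp in the example (88500) is a made-up value, not one taken from the logs in this thread.

```python
from datetime import datetime, timedelta

def estimate_error_time(error_poh, current_poh, now):
    """Estimate the wall-clock time of an error logged at error_poh power-on hours."""
    hours_ago = current_poh - error_poh
    if hours_ago < 0:
        raise ValueError("error_poh is later than current_poh")
    # Assumes the disk has been running non-stop since the error occurred.
    return now - timedelta(hours=hours_ago)

# 88526 is Disk 3's Power_On_Hours from its table; 88500 is hypothetical.
print(estimate_error_time(88500, 88526, datetime(2022, 2, 9, 12, 0)))
# 2022-02-08 10:00:00
```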

Also, after you replace the disks, I suggest setting up a maintenance schedule (via the settings wheel of the volume). You can schedule:

defrag

balance

scrub

disk test

Personally I run one of these each month (cycling through all four, three times a year).

Definitely wait until you have healthy disks installed though.
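
To make the rotation concrete, here is a small Python sketch that maps each calendar month to one of the four tasks, so each task comes up three times a year. The order of the tasks is an assumption made for illustration; the actual schedule is set in the volume settings of the NAS, not by a script like this.

```python
# Order is an assumption; any fixed order gives each task three runs per year.
TASKS = ["defrag", "balance", "scrub", "disk test"]

def task_for_month(month):
    """Return the maintenance task for a calendar month (1-12)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be 1-12")
    return TASKS[(month - 1) % len(TASKS)]

for m in range(1, 13):
    print(m, task_for_month(m))
```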

pdaly12
Aspirant
Feb 09, 2022
Thanks for the help. The more I think about it, the more it seems that the drives are failing (slow directory listings, phantom invisible files appearing, etc.), so I've ordered some Seagate IronWolf drives.

Drive 4 had 131073 raw read errors and a few SMART "UNC at LBA" errors.

Drive 3 was similar to drive 4, but with 3229 raw read errors.

Drive 2 had zero logged errors, but its Offline_Uncorrectable count of 24 is a cause for concern.

Drive 1 is a badass just chugging away leading the pack.

Sandshark
Sensei
Feb 09, 2022
Because each re-sync puts a strain on the drives, the chance that another drive will fail during re-sync is high with those drives. Make sure your backup is up to date before you start swapping.
