

gpaolo
Luminary

Volume: The volume XXXX encountered an error and was made read-only. It is recommended to backup

Hello,

I have just received this notification:

Volume: The volume Volume1TB encountered an error and was made read-only. It is recommended to backup your data

I just checked: the volume is marked as healthy, but it has indeed been set to read-only.

 

Both disks in this volume are quite new. Disk_info.log doesn't show anything odd:

 

Device:             sdf
Controller:         0
Channel:            0
Model:              WDC WD40EFAX-68JH4N0
Serial:             WD-WX42D10FCJ36
Firmware:           82.00A82W
Class:              SATA
RPM:                5400
Sectors:            7814037168
Pool:               Volume1TB
PoolType:           RAID 1
PoolState:          1
PoolHostId:         2fe75bc5
Health data 
  ATA Error Count:                0
  Reallocated Sectors:            0
  Reallocation Events:            0
  Spin Retry Count:               0
  Current Pending Sector Count:   0
  Uncorrectable Sector Count:     0
  Temperature:                    23
  Start/Stop Count:               27749
  Power-On Hours:                 3856
  Power Cycle Count:              1
  Load Cycle Count:               27749

Device:             sda
Controller:         0
Channel:            1
Model:              WDC WD40EFRX-68N32N0
Serial:             WD-WCC7K1YJCZ2P
Firmware:           82.00A82W
Class:              SATA
RPM:                5400
Sectors:            7814037168
Pool:               Volume1TB
PoolType:           RAID 1
PoolState:          1
PoolHostId:         2fe75bc5
Health data 
  ATA Error Count:                0
  Reallocated Sectors:            0
  Reallocation Events:            0
  Spin Retry Count:               0
  Current Pending Sector Count:   0
  Uncorrectable Sector Count:     0
  Temperature:                    21
  Start/Stop Count:               23779
  Power-On Hours:                 3852
  Power Cycle Count:              1
  Load Cycle Count:               23779

...BUT here is the catch: I bought two Western Digital Red disks and got one EFAX and one EFRX. Could this be the issue?

What should I do here? I don't see anything wrong...
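
For anyone who wants to run the same check on their own Disk_info.log without reading it by eye, here is a minimal Python sketch. It assumes the log keeps the "Key: value" layout shown above; the function names and the list of counters to watch are my own choices for illustration, not anything provided by the NAS.

```python
import re

# Health counters that are expected to stay at zero on a healthy disk
# (names copied from the "Health data" block above).
ZERO_EXPECTED = (
    "ATA Error Count",
    "Reallocated Sectors",
    "Reallocation Events",
    "Spin Retry Count",
    "Current Pending Sector Count",
    "Uncorrectable Sector Count",
)

# Shortened copy of the log above, used only as sample input.
SAMPLE = """\
Device:             sdf
Model:              WDC WD40EFAX-68JH4N0
Pool:               Volume1TB
Health data
  ATA Error Count:                0
  Reallocated Sectors:            0
  Current Pending Sector Count:   0
  Uncorrectable Sector Count:     0

Device:             sda
Model:              WDC WD40EFRX-68N32N0
Pool:               Volume1TB
Health data
  ATA Error Count:                0
  Reallocated Sectors:            0
  Current Pending Sector Count:   0
  Uncorrectable Sector Count:     0
"""


def parse_disk_info(text):
    """Split the log text into one dict per 'Device:' block."""
    disks = []
    current = None
    for line in text.splitlines():
        match = re.match(r"^\s*([^:]+):\s*(.*)$", line)
        if not match:
            continue  # blank lines and the "Health data" header have no colon
        key, value = match.group(1).strip(), match.group(2).strip()
        if key == "Device":
            current = {}
            disks.append(current)
        if current is not None:
            current[key] = value
    return disks


def nonzero_counters(disk):
    """Return {counter: value} for watched counters that are not zero."""
    flagged = {}
    for name in ZERO_EXPECTED:
        value = disk.get(name, "0")
        if value.isdigit() and int(value) != 0:
            flagged[name] = int(value)
    return flagged


for disk in parse_disk_info(SAMPLE):
    result = nonzero_counters(disk) or "no non-zero error counters"
    print(disk["Device"], disk["Model"], result)
```

Clean counters only mean that the drives themselves have not recorded errors; they say nothing about the state of the filesystem on top of them.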

 

 

 

Model: RN524X|ReadyNAS 524X – Premium Performance Data Storage - 4-Bay
Message 1 of 13

Accepted Solutions
gpaolo
Luminary

Re: Volume: The volume XXXX encountered an error and was made read-only. It is recommended to backup

Well, end of the story. For posterity:

 

When it happens, do not attempt anything except a backup copy.

The files should be OK (they were all fine in my case, no data loss). As someone else said in another similar thread, Office files won't open directly from the share, but I think that is because Office tries to lock them and cannot write to the read-only share. If you copy them somewhere else, they are fine.

But don't do anything else: after I tried a reboot, the volume was completely gone, together with the data.

If this failure happens in the volume that includes the first disk, applications are lost too. Lesson learned here (it took a while): back up all configurations to a share, which is itself backed up.

My two six-month-old WD HDDs have now been moved to external USB enclosures for automatic backup, and two new Seagate IronWolf disks are in.

Rebuilding is in progress (my personal discovery: copying files from the backup USB unit to the volumes is a lot faster when done over SSH). Let's hope it will last for a while...
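
To give an idea of what "done over SSH" can look like, here is a rough Python sketch that only builds an rsync command for a local copy on the NAS. It is one possible way to do it, not necessarily the exact method used here. It assumes rsync and Python are available in the SSH session, and both paths in the example are hypothetical placeholders: check where your USB unit and share are actually mounted.

```python
import shlex


def build_rsync_command(source_dir, dest_dir, dry_run=True):
    """Build an rsync call that copies the contents of source_dir into dest_dir."""
    command = ["rsync", "--archive", "--human-readable", "--progress"]
    if dry_run:
        # Only list what would be copied, without writing anything.
        command.append("--dry-run")
    # A trailing slash on the source means "the contents of this directory".
    command.append(source_dir.rstrip("/") + "/")
    command.append(dest_dir.rstrip("/") + "/")
    return command


# Hypothetical paths, for illustration only.
example = build_rsync_command("/media/USB_HDD_1/backup", "/data/Documents")
print(" ".join(shlex.quote(part) for part in example))
```

The sketch prints the command instead of running it, so you can review it first. Keeping `dry_run=True` until the file list looks right avoids copying into the wrong share.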

Thanks again to @StephenB for the support!


Message 13 of 13

All Replies
gpaolo
Luminary

Re: Volume: The volume XXXX encountered an error and was made read-only. It is recommended to backup

This is the complete log package, in case it helps. (Huh, I can't attach a zip file now? Odd. Well, here it is, just in case: <redacted>)

 

Files seem to be ok. I'm making a new backup right now.

How can I remove the read-only status? I can't find any option to do that.

Message 2 of 13
gpaolo
Luminary

Re: Volume: The volume XXXX encountered an error and was made read-only. It is recommended to backup

Ahh, here it is, in kernel.log:

 

-- Logs begin at Tue 2020-09-01 05:50:10 GMT, end at Thu 2020-10-01 07:03:00 GMT. --
Sep 02 01:00:01 NAS4-CASA-GP kernel: md: requested-resync of RAID array md127
Sep 02 01:00:01 NAS4-CASA-GP kernel: md: minimum _guaranteed_  speed: 30000 KB/sec/disk.
Sep 02 01:00:01 NAS4-CASA-GP kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for requested-resync.
Sep 02 01:00:01 NAS4-CASA-GP kernel: md: using 128k window, over a total of 3902168832k.
Sep 02 15:08:19 NAS4-CASA-GP kernel: md: md127: requested-resync done.
Sep 22 01:00:10 NAS4-CASA-GP kernel: BTRFS info (device md127): relocating block group 3484992667648 flags data
Sep 22 01:00:13 NAS4-CASA-GP kernel: BTRFS info (device md127): found 5 extents
Sep 22 01:00:22 NAS4-CASA-GP kernel: BTRFS info (device md127): found 5 extents
Sep 22 01:00:22 NAS4-CASA-GP kernel: BTRFS info (device md127): relocating block group 3474758565888 flags metadata|dup
Sep 22 01:00:31 NAS4-CASA-GP kernel: BTRFS info (device md127): found 254 extents
Sep 22 01:00:31 NAS4-CASA-GP kernel: BTRFS info (device md127): relocating block group 3443586498560 flags system|dup
Sep 22 01:00:32 NAS4-CASA-GP kernel: BTRFS info (device md127): found 10 extents
Sep 22 01:00:32 NAS4-CASA-GP kernel: BTRFS info (device md127): relocating block group 1259528519680 flags data
Sep 22 01:00:42 NAS4-CASA-GP kernel: BTRFS info (device md127): found 161 extents
Sep 22 01:00:47 NAS4-CASA-GP kernel: BTRFS info (device md127): found 161 extents
Sep 22 01:00:48 NAS4-CASA-GP kernel: BTRFS info (device md127): relocating block group 1243422392320 flags data
Sep 22 01:00:56 NAS4-CASA-GP kernel: BTRFS info (device md127): found 137 extents
Sep 22 01:01:00 NAS4-CASA-GP kernel: BTRFS info (device md127): found 137 extents
Oct 01 01:00:08 NAS4-CASA-GP kernel: BTRFS info (device md126): relocating block group 1620943306752 flags data
Oct 01 01:00:09 NAS4-CASA-GP kernel: BTRFS info (device md126): found 186 extents
Oct 01 01:00:14 NAS4-CASA-GP kernel: BTRFS info (device md126): found 186 extents
Oct 01 01:00:15 NAS4-CASA-GP kernel: BTRFS info (device md126): relocating block group 1619869564928 flags data
Oct 01 01:00:18 NAS4-CASA-GP kernel: BTRFS info (device md126): found 1107 extents
Oct 01 01:00:24 NAS4-CASA-GP kernel: BTRFS info (device md126): found 1107 extents
Oct 01 01:00:24 NAS4-CASA-GP kernel: BTRFS info (device md126): relocating block group 1590844981248 flags system|dup
Oct 01 01:00:24 NAS4-CASA-GP kernel: BTRFS info (device md126): found 6 extents
Oct 01 01:09:06 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:09:06 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:09:06 NAS4-CASA-GP kernel: BTRFS: error (device md126) in btrfs_update_root:148: errno=-5 IO failure
Oct 01 01:09:06 NAS4-CASA-GP kernel: BTRFS info (device md126): forced readonly
Oct 01 01:09:06 NAS4-CASA-GP kernel: BTRFS warning (device md126): Skipping commit of aborted transaction.
Oct 01 01:09:06 NAS4-CASA-GP kernel: BTRFS: error (device md126) in cleanup_transaction:1864: errno=-5 IO failure
Oct 01 01:09:06 NAS4-CASA-GP kernel: BTRFS info (device md126): delayed_refs has NO entry
Oct 01 01:14:11 NAS4-CASA-GP kernel: BTRFS error (device md126): Remounting read-write after error is not allowed
Oct 01 01:14:11 NAS4-CASA-GP kernel: BTRFS error (device md126): Remounting read-write after error is not allowed
Oct 01 01:14:11 NAS4-CASA-GP kernel: BTRFS error (device md126): Remounting read-write after error is not allowed
Oct 01 01:14:11 NAS4-CASA-GP kernel: BTRFS error (device md126): Remounting read-write after error is not allowed
Oct 01 01:14:13 NAS4-CASA-GP kernel: BTRFS error (device md126): Remounting read-write after error is not allowed
Oct 01 01:14:13 NAS4-CASA-GP kernel: BTRFS error (device md126): Remounting read-write after error is not allowed
Oct 01 01:14:13 NAS4-CASA-GP kernel: BTRFS error (device md126): Remounting read-write after error is not allowed
Oct 01 01:14:13 NAS4-CASA-GP kernel: BTRFS error (device md126): Remounting read-write after error is not allowed
Oct 01 01:14:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:14:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:15:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:15:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:16:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:16:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:17:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:17:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:18:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:18:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:19:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:19:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:20:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:20:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:21:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:21:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:22:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:22:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:23:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:23:19 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120

....
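
Since the interesting lines are buried among the routine balance messages, here is a small Python sketch that pulls out, per md device, the first BTRFS critical/error line and the moment the filesystem was forced read-only. It assumes the journal-style line format shown above; the regular expression and function name are my own and may need adjusting for other kernel versions.

```python
import re

# Matches lines such as:
#   Oct 01 01:09:06 HOST kernel: BTRFS critical (device md126): corrupt leaf, ...
#   Oct 01 01:09:06 HOST kernel: BTRFS: error (device md126) in btrfs_update_root:148: ...
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"BTRFS:? (?P<level>\w+) \(device (?P<device>\w+)\)(?P<rest>.*)$"
)

# A few lines copied from the log above, used only as sample input.
SAMPLE = """\
Sep 02 01:00:01 NAS4-CASA-GP kernel: md: requested-resync of RAID array md127
Sep 22 01:00:10 NAS4-CASA-GP kernel: BTRFS info (device md127): relocating block group 3484992667648 flags data
Oct 01 01:09:06 NAS4-CASA-GP kernel: BTRFS critical (device md126): corrupt leaf, bad key order: block=730591854592, root=1, slot=120
Oct 01 01:09:06 NAS4-CASA-GP kernel: BTRFS: error (device md126) in btrfs_update_root:148: errno=-5 IO failure
Oct 01 01:09:06 NAS4-CASA-GP kernel: BTRFS info (device md126): forced readonly
"""


def summarize_btrfs(log_text):
    """Return, per device, the first critical/error message and the read-only time."""
    summary = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # not a BTRFS kernel line
        message = match.group("rest").lstrip(": ")
        entry = summary.setdefault(
            match.group("device"),
            {"first_problem": None, "forced_readonly": None},
        )
        if match.group("level") in ("critical", "error") and entry["first_problem"] is None:
            entry["first_problem"] = (match.group("stamp"), message)
        if "forced readonly" in message and entry["forced_readonly"] is None:
            entry["forced_readonly"] = match.group("stamp")
    return summary


for device, info in sorted(summarize_btrfs(SAMPLE).items()):
    print(device, info)
```

On the sample lines this reports nothing for md127 and, for md126, the "corrupt leaf, bad key order" message followed by the forced read-only at the same timestamp.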
Message 3 of 13
gpaolo
Luminary

Re: Volume: The volume XXXX encountered an error and was made read-only. It is recommended to backup

I am trying to work out which of the two disks might have a problem. Following the suggestion in another thread ( https://community.netgear.com/t5/New-ReadyNAS-Users-General/Rn314-update-to-6-10-1-made-volume-to-re... ), I ran:

 

# smartctl -x /dev/sda

Results for the two disks in the volume follow. I can't see any errors in there. (I have to split the message: I can't attach a .txt file and can't paste it all in one post...)

 

root@NAS4-CASA-GP:~# smartctl -x /dev/sda
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.4.190.x86_64.1] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68N32N0
Serial Number:    WD-WCC7K1YJCZ2P
LU WWN Device Id: 5 0014ee 211d8150b
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Oct  1 09:31:41 2020 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 247) Self-test routine in progress...
                                        70% of test remaining.
Total time to complete Offline
data collection:                (43380) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 461) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x303d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    0
  3 Spin_Up_Time            POS--K   188   162   021    -    5566
  4 Start_Stop_Count        -O--CK   077   077   000    -    23781
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
  9 Power_On_Hours          -O--CK   095   095   000    -    3855
 10 Spin_Retry_Count        -O--CK   100   100   000    -    0
 11 Calibration_Retry_Count -O--CK   100   253   000    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    1
192 Power-Off_Retract_Count -O--CK   200   200   000    -    0
193 Load_Cycle_Count        -O--CK   193   193   000    -    23781
194 Temperature_Celsius     -O---K   121   107   000    -    29
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    0
198 Offline_Uncorrectable   ----CK   100   253   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R--   100   253   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      5  Comprehensive SMART error log
0x03       GPL     R/O      6  Ext. Comprehensive SMART error log
0x04       GPL,SL  R/O      8  Device Statistics log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa0-0xa7  GPL,SL  VS      16  Device vendor specific log
0xa8-0xb6  GPL,SL  VS       1  Device vendor specific log
0xb7       GPL,SL  VS      56  Device vendor specific log
0xbd       GPL,SL  VS       1  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xc1       GPL     VS      93  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       258 (0x0102)
SCT Support Level:                   1
Device State:                        DST executing in background (3)
Current Temperature:                    29 Celsius
Power Cycle Min/Max Temperature:     20/43 Celsius
Lifetime    Min/Max Temperature:     20/43 Celsius
Under/Over Temperature Limit Count:   0/0
Vendor specific:
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:      0/65 Celsius
Min/Max Temperature Limit:           -41/85 Celsius
Temperature History Size (Index):    478 (182)

Index    Estimated Time   Temperature Celsius
 183    2020-10-01 01:34    21  **
 ...    ..( 14 skipped).    ..  **
 198    2020-10-01 01:49    21  **
 199    2020-10-01 01:50    20  *
 ...    ..( 25 skipped).    ..  *
 225    2020-10-01 02:16    20  *
 226    2020-10-01 02:17     ?  -
 227    2020-10-01 02:18    20  *
 228    2020-10-01 02:19    21  **
 ...    ..(  2 skipped).    ..  **
 231    2020-10-01 02:22    21  **
 232    2020-10-01 02:23    22  ***
 233    2020-10-01 02:24    22  ***
 234    2020-10-01 02:25    22  ***
 235    2020-10-01 02:26    23  ****
 236    2020-10-01 02:27     ?  -
 237    2020-10-01 02:28    23  ****
 ...    ..(  2 skipped).    ..  ****
 240    2020-10-01 02:31    23  ****
 241    2020-10-01 02:32    24  *****
 242    2020-10-01 02:33    24  *****
 243    2020-10-01 02:34     ?  -
 244    2020-10-01 02:35    25  ******
 ...    ..(  2 skipped).    ..  ******
 247    2020-10-01 02:38    25  ******
 248    2020-10-01 02:39    26  *******
 ...    ..(  2 skipped).    ..  *******
 251    2020-10-01 02:42    26  *******
 252    2020-10-01 02:43    27  ********
 253    2020-10-01 02:44    27  ********
 254    2020-10-01 02:45    27  ********
 255    2020-10-01 02:46    28  *********
 ...    ..( 15 skipped).    ..  *********
 271    2020-10-01 03:02    28  *********
 272    2020-10-01 03:03    29  **********
 ...    ..( 65 skipped).    ..  **********
 338    2020-10-01 04:09    29  **********
 339    2020-10-01 04:10    28  *********
 ...    ..(  9 skipped).    ..  *********
 349    2020-10-01 04:20    28  *********
 350    2020-10-01 04:21    29  **********
 351    2020-10-01 04:22    28  *********
 ...    ..( 10 skipped).    ..  *********
 362    2020-10-01 04:33    28  *********
 363    2020-10-01 04:34    29  **********
 ...    ..( 18 skipped).    ..  **********
 382    2020-10-01 04:53    29  **********
 383    2020-10-01 04:54    26  *******
 384    2020-10-01 04:55    26  *******
 385    2020-10-01 04:56    26  *******
 386    2020-10-01 04:57    25  ******
 ...    ..(  5 skipped).    ..  ******
 392    2020-10-01 05:03    25  ******
 393    2020-10-01 05:04    24  *****
 394    2020-10-01 05:05    24  *****
 395    2020-10-01 05:06     ?  -
 396    2020-10-01 05:07    24  *****
 ...    ..(  2 skipped).    ..  *****
 399    2020-10-01 05:10    24  *****
 400    2020-10-01 05:11    25  ******
 ...    ..(  7 skipped).    ..  ******
 408    2020-10-01 05:19    25  ******
 409    2020-10-01 05:20     ?  -
 410    2020-10-01 05:21    25  ******
 ...    ..(  4 skipped).    ..  ******
 415    2020-10-01 05:26    25  ******
 416    2020-10-01 05:27    26  *******
 ...    ..(  4 skipped).    ..  *******
 421    2020-10-01 05:32    26  *******
 422    2020-10-01 05:33    25  ******
 ...    ..(  5 skipped).    ..  ******
 428    2020-10-01 05:39    25  ******
 429    2020-10-01 05:40     ?  -
 430    2020-10-01 05:41    24  *****
 ...    ..(  2 skipped).    ..  *****
 433    2020-10-01 05:44    24  *****
 434    2020-10-01 05:45    25  ******
 ...    ..(  7 skipped).    ..  ******
 442    2020-10-01 05:53    25  ******
 443    2020-10-01 05:54    24  *****
 ...    ..(  6 skipped).    ..  *****
 450    2020-10-01 06:01    24  *****
 451    2020-10-01 06:02     ?  -
 452    2020-10-01 06:03    24  *****
 ...    ..(  5 skipped).    ..  *****
 458    2020-10-01 06:09    24  *****
 459    2020-10-01 06:10    25  ******
 ...    ..(  4 skipped).    ..  ******
 464    2020-10-01 06:15    25  ******
 465    2020-10-01 06:16     ?  -
 466    2020-10-01 06:17    25  ******
 ...    ..(  5 skipped).    ..  ******
 472    2020-10-01 06:23    25  ******
 473    2020-10-01 06:24    26  *******
 ...    ..(  4 skipped).    ..  *******
   0    2020-10-01 06:29    26  *******
   1    2020-10-01 06:30    27  ********
 ...    ..( 10 skipped).    ..  ********
  12    2020-10-01 06:41    27  ********
  13    2020-10-01 06:42    26  *******
 ...    ..(  4 skipped).    ..  *******
  18    2020-10-01 06:47    26  *******
  19    2020-10-01 06:48    25  ******
 ...    ..(  4 skipped).    ..  ******
  24    2020-10-01 06:53    25  ******
  25    2020-10-01 06:54    24  *****
 ...    ..(  7 skipped).    ..  *****
  33    2020-10-01 07:02    24  *****
  34    2020-10-01 07:03    23  ****
 ...    ..(  9 skipped).    ..  ****
  44    2020-10-01 07:13    23  ****
  45    2020-10-01 07:14    22  ***
 ...    ..( 23 skipped).    ..  ***
  69    2020-10-01 07:38    22  ***
  70    2020-10-01 07:39    21  **
 ...    ..( 51 skipped).    ..  **
 122    2020-10-01 08:31    21  **
 123    2020-10-01 08:32     ?  -
 124    2020-10-01 08:33    21  **
 ...    ..(  4 skipped).    ..  **
 129    2020-10-01 08:38    21  **
 130    2020-10-01 08:39    22  ***
 ...    ..( 25 skipped).    ..  ***
 156    2020-10-01 09:05    22  ***
 157    2020-10-01 09:06    21  **
 ...    ..( 24 skipped).    ..  **
 182    2020-10-01 09:31    21  **

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

Device Statistics (GP Log 0x04)
Page  Offset Size        Value Flags Description
0x01  =====  =               =  ===  == General Statistics (rev 1) ==
0x01  0x008  4               1  ---  Lifetime Power-On Resets
0x01  0x010  4            3855  ---  Power-on Hours
0x01  0x018  6     10632268361  ---  Logical Sectors Written
0x01  0x020  6        64491989  ---  Number of Write Commands
0x01  0x028  6       324623431  ---  Logical Sectors Read
0x01  0x030  6          687054  ---  Number of Read Commands
0x01  0x038  6       993098112  ---  Date and Time TimeStamp
0x03  =====  =               =  ===  == Rotating Media Statistics (rev 1) ==
0x03  0x008  4            2286  ---  Spindle Motor Power-on Hours
0x03  0x010  4            2241  ---  Head Flying Hours
0x03  0x018  4           23781  ---  Head Load Events
0x03  0x020  4               0  ---  Number of Reallocated Logical Sectors
0x03  0x028  4               0  ---  Read Recovery Attempts
0x03  0x030  4               0  ---  Number of Mechanical Start Failures
0x03  0x038  4               0  ---  Number of Realloc. Candidate Logical Sectors
0x03  0x040  4               0  ---  Number of High Priority Unload Events
0x04  =====  =               =  ===  == General Errors Statistics (rev 1) ==
0x04  0x008  4               0  ---  Number of Reported Uncorrectable Errors
0x04  0x010  4               0  ---  Resets Between Cmd Acceptance and Completion
0x05  =====  =               =  ===  == Temperature Statistics (rev 1) ==
0x05  0x008  1              29  ---  Current Temperature
0x05  0x010  1              26  ---  Average Short Term Temperature
0x05  0x018  1              28  ---  Average Long Term Temperature
0x05  0x020  1              43  ---  Highest Temperature
0x05  0x028  1              20  ---  Lowest Temperature
0x05  0x030  1              39  ---  Highest Average Short Term Temperature
0x05  0x038  1              23  ---  Lowest Average Short Term Temperature
0x05  0x040  1              32  ---  Highest Average Long Term Temperature
0x05  0x048  1              28  ---  Lowest Average Long Term Temperature
0x05  0x050  4               0  ---  Time in Over-Temperature
0x05  0x058  1              65  ---  Specified Maximum Operating Temperature
0x05  0x060  4               0  ---  Time in Under-Temperature
0x05  0x068  1               0  ---  Specified Minimum Operating Temperature
0x06  =====  =               =  ===  == Transport Statistics (rev 1) ==
0x06  0x008  4               1  ---  Number of Hardware Resets
0x06  0x010  4               0  ---  Number of ASR Events
0x06  0x018  4               0  ---  Number of Interface CRC Errors
                                |||_ C monitored condition met
                                ||__ D supports DSN
                                |___ N normalized value

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2            0  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            1  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2            0  Non-CRC errors within host-to-device FIS
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC
0x8000  4     13879345  Vendor specific
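
The attribute table is the part most worth checking, so here is a minimal Python sketch that reads a saved copy of the `smartctl -x` output and lists anything suspicious. It assumes the eight-column table layout shown above; the set of attributes to watch and the function names are my own choices, not part of smartmontools.

```python
import re

# ID, name, six-character flags, VALUE, WORST, THRESH, FAIL, leading digits of RAW_VALUE.
ROW_RE = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+([A-Z-]{6})\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(\d+)"
)

# Attributes whose raw value is normally zero on a healthy disk.
WATCHED_RAW = {
    "Reallocated_Sector_Ct",
    "Reallocated_Event_Count",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
}

# A few rows copied from the output above, used only as sample input.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    0
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  9 Power_On_Hours          -O--CK   095   095   000    -    3855
193 Load_Cycle_Count        -O--CK   193   193   000    -    23781
197 Current_Pending_Sector  -O--CK   200   200   000    -    0
198 Offline_Uncorrectable   ----CK   100   253   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
                            ||||||_ K auto-keep
"""


def parse_attributes(text):
    """Return {attribute_name: {...}} for every row of the attribute table."""
    attributes = {}
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if not match:
            continue  # header, legend, and other sections do not match
        attributes[match.group(2)] = {
            "id": int(match.group(1)),
            "value": int(match.group(4)),
            "worst": int(match.group(5)),
            "thresh": int(match.group(6)),
            "raw": int(match.group(8)),
        }
    return attributes


def findings(attributes):
    """List watched counters with non-zero raw values and values at or below threshold."""
    problems = []
    for name, attr in attributes.items():
        if name in WATCHED_RAW and attr["raw"] != 0:
            problems.append("%s raw value is %d" % (name, attr["raw"]))
        if attr["thresh"] > 0 and attr["value"] <= attr["thresh"]:
            problems.append(
                "%s value %d is at or below threshold %d"
                % (name, attr["value"], attr["thresh"])
            )
    return problems


print(findings(parse_attributes(SAMPLE)) or "nothing flagged")
```

An empty result here only reflects what the drive has logged about itself, which matches what I see by eye: no SMART errors on this disk.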

 

Message 4 of 13
gpaolo
Luminary

Re: Volume: The volume XXXX encountered an error and was made read-only. It is recommended to backup

root@NAS4-CASA-GP:~# smartctl -x /dev/sdf
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.4.190.x86_64.1] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD40EFAX-68JH4N0
Serial Number:    WD-WX42D10FCJ36
LU WWN Device Id: 5 0014ee 267aef87e
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Oct  1 09:38:40 2020 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 241) Self-test routine in progress...
                                        10% of test remaining.
Total time to complete Offline
data collection:                ( 3764) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 146) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x3039) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    0
  3 Spin_Up_Time            POS--K   207   206   021    -    2633
  4 Start_Stop_Count        -O--CK   073   073   000    -    27749
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
  9 Power_On_Hours          -O--CK   095   095   000    -    3859
 10 Spin_Retry_Count        -O--CK   100   100   000    -    0
 11 Calibration_Retry_Count -O--CK   100   253   000    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    1
192 Power-Off_Retract_Count -O--CK   200   200   000    -    0
193 Load_Cycle_Count        -O--CK   191   191   000    -    27749
194 Temperature_Celsius     -O---K   121   104   000    -    26
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    0
198 Offline_Uncorrectable   ----CK   100   253   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R--   100   253   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      5  Comprehensive SMART error log
0x03       GPL     R/O      6  Ext. Comprehensive SMART error log
0x04       GPL     R/O    256  Device Statistics log
0x04       SL      R/O      8  Device Statistics log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x0c       GPL     R/O   2048  Pending Defects log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x24       GPL     R/O    294  Current Device Internal Status Data log
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa0-0xa7  GPL,SL  VS      16  Device vendor specific log
0xa8-0xb6  GPL,SL  VS       1  Device vendor specific log
0xb7       GPL,SL  VS      78  Device vendor specific log
0xb9       GPL,SL  VS       4  Device vendor specific log
0xbd       GPL,SL  VS       1  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xc1       GPL     VS      93  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       258 (0x0102)
SCT Support Level:                   1
Device State:                        DST executing in background (3)
Current Temperature:                    26 Celsius
Power Cycle Min/Max Temperature:     20/43 Celsius
Lifetime    Min/Max Temperature:     20/43 Celsius
Under/Over Temperature Limit Count:   0/0
Vendor specific:
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:      0/65 Celsius
Min/Max Temperature Limit:           -41/85 Celsius
Temperature History Size (Index):    478 (7)

Index    Estimated Time   Temperature Celsius
   8    2020-10-01 01:41    21  **
   9    2020-10-01 01:42    20  *
 ...    ..( 17 skipped).    ..  *
  27    2020-10-01 02:00    20  *
  28    2020-10-01 02:01     ?  -
  29    2020-10-01 02:02    20  *
  30    2020-10-01 02:03    20  *
  31    2020-10-01 02:04    20  *
  32    2020-10-01 02:05    21  **
  33    2020-10-01 02:06    21  **
  34    2020-10-01 02:07    21  **
  35    2020-10-01 02:08    22  ***
  36    2020-10-01 02:09    22  ***
  37    2020-10-01 02:10     ?  -
  38    2020-10-01 02:11    22  ***
 ...    ..(  4 skipped).    ..  ***
  43    2020-10-01 02:16    22  ***
  44    2020-10-01 02:17    23  ****
 ...    ..(  2 skipped).    ..  ****
  47    2020-10-01 02:20    23  ****
  48    2020-10-01 02:21     ?  -
  49    2020-10-01 02:22    23  ****
  50    2020-10-01 02:23    22  ***
  51    2020-10-01 02:24    22  ***
  52    2020-10-01 02:25    22  ***
  53    2020-10-01 02:26    23  ****
 ...    ..(  2 skipped).    ..  ****
  56    2020-10-01 02:29    23  ****
  57    2020-10-01 02:30    24  *****
 ...    ..(  4 skipped).    ..  *****
  62    2020-10-01 02:35    24  *****
  63    2020-10-01 02:36    25  ******
 ...    ..(  5 skipped).    ..  ******
  69    2020-10-01 02:42    25  ******
  70    2020-10-01 02:43    26  *******
 ...    ..( 62 skipped).    ..  *******
 133    2020-10-01 03:46    26  *******
 134    2020-10-01 03:47    25  ******
 135    2020-10-01 03:48    26  *******
 136    2020-10-01 03:49    25  ******
 ...    ..( 62 skipped).    ..  ******
 199    2020-10-01 04:52    25  ******
 200    2020-10-01 04:53    26  *******
 201    2020-10-01 04:54    25  ******
 ...    ..(  4 skipped).    ..  ******
 206    2020-10-01 04:59    25  ******
 207    2020-10-01 05:00    26  *******
 208    2020-10-01 05:01     ?  -
 209    2020-10-01 05:02    24  *****
 ...    ..(  5 skipped).    ..  *****
 215    2020-10-01 05:08    24  *****
 216    2020-10-01 05:09    25  ******
 ...    ..(  2 skipped).    ..  ******
 219    2020-10-01 05:12    25  ******
 220    2020-10-01 05:13    26  *******
 221    2020-10-01 05:14    26  *******
 222    2020-10-01 05:15    25  ******
 223    2020-10-01 05:16    25  ******
 224    2020-10-01 05:17    25  ******
 225    2020-10-01 05:18     ?  -
 226    2020-10-01 05:19    25  ******
 ...    ..(  7 skipped).    ..  ******
 234    2020-10-01 05:27    25  ******
 235    2020-10-01 05:28    26  *******
 ...    ..(  3 skipped).    ..  *******
 239    2020-10-01 05:32    26  *******
 240    2020-10-01 05:33    25  ******
 ...    ..(  3 skipped).    ..  ******
 244    2020-10-01 05:37    25  ******
 245    2020-10-01 05:38     ?  -
 246    2020-10-01 05:39    25  ******
 247    2020-10-01 05:40    24  *****
 ...    ..(  2 skipped).    ..  *****
 250    2020-10-01 05:43    24  *****
 251    2020-10-01 05:44    25  ******
 ...    ..(  3 skipped).    ..  ******
 255    2020-10-01 05:48    25  ******
 256    2020-10-01 05:49     ?  -
 257    2020-10-01 05:50    25  ******
 258    2020-10-01 05:51    24  *****
 259    2020-10-01 05:52    24  *****
 260    2020-10-01 05:53    24  *****
 261    2020-10-01 05:54    25  ******
 ...    ..(  4 skipped).    ..  ******
 266    2020-10-01 05:59    25  ******
 267    2020-10-01 06:00    24  *****
 268    2020-10-01 06:01    24  *****
 269    2020-10-01 06:02     ?  -
 270    2020-10-01 06:03    24  *****
 ...    ..(  5 skipped).    ..  *****
 276    2020-10-01 06:09    24  *****
 277    2020-10-01 06:10    25  ******
 ...    ..(  6 skipped).    ..  ******
 284    2020-10-01 06:17    25  ******
 285    2020-10-01 06:18    26  *******
 ...    ..( 11 skipped).    ..  *******
 297    2020-10-01 06:30    26  *******
 298    2020-10-01 06:31    27  ********
 ...    ..(  8 skipped).    ..  ********
 307    2020-10-01 06:40    27  ********
 308    2020-10-01 06:41    26  *******
 ...    ..(  2 skipped).    ..  *******
 311    2020-10-01 06:44    26  *******
 312    2020-10-01 06:45    25  ******
 ...    ..(  2 skipped).    ..  ******
 315    2020-10-01 06:48    25  ******
 316    2020-10-01 06:49    24  *****
 ...    ..(  4 skipped).    ..  *****
 321    2020-10-01 06:54    24  *****
 322    2020-10-01 06:55    23  ****
 ...    ..(  5 skipped).    ..  ****
 328    2020-10-01 07:01    23  ****
 329    2020-10-01 07:02    22  ***
 330    2020-10-01 07:03    23  ****
 331    2020-10-01 07:04    22  ***
 ...    ..(  5 skipped).    ..  ***
 337    2020-10-01 07:10    22  ***
 338    2020-10-01 07:11     ?  -
 339    2020-10-01 07:12    22  ***
 ...    ..(  3 skipped).    ..  ***
 343    2020-10-01 07:16    22  ***
 344    2020-10-01 07:17    23  ****
 ...    ..(  4 skipped).    ..  ****
 349    2020-10-01 07:22    23  ****
 350    2020-10-01 07:23    22  ***
 ...    ..( 11 skipped).    ..  ***
 362    2020-10-01 07:35    22  ***
 363    2020-10-01 07:36    21  **
 ...    ..( 50 skipped).    ..  **
 414    2020-10-01 08:27    21  **
 415    2020-10-01 08:28    20  *
 416    2020-10-01 08:29    21  **
 417    2020-10-01 08:30    21  **
 418    2020-10-01 08:31    21  **
 419    2020-10-01 08:32     ?  -
 420    2020-10-01 08:33    21  **
 421    2020-10-01 08:34    20  *
 422    2020-10-01 08:35    20  *
 423    2020-10-01 08:36    21  **
 424    2020-10-01 08:37    21  **
 425    2020-10-01 08:38    22  ***
 ...    ..(  2 skipped).    ..  ***
 428    2020-10-01 08:41    22  ***
 429    2020-10-01 08:42    23  ****
 ...    ..(  7 skipped).    ..  ****
 437    2020-10-01 08:50    23  ****
 438    2020-10-01 08:51    22  ***
 ...    ..(  9 skipped).    ..  ***
 448    2020-10-01 09:01    22  ***
 449    2020-10-01 09:02    21  **
 ...    ..( 32 skipped).    ..  **
   4    2020-10-01 09:35    21  **
   5    2020-10-01 09:36    20  *
   6    2020-10-01 09:37    20  *
   7    2020-10-01 09:38    20  *

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) supported [please try: '-l defects']

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2            0  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            1  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2            0  Non-CRC errors within host-to-device FIS
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC
0x8000  4     13892822  Vendor specific

root@NAS4-CASA-GP:~#
Message 5 of 13
StephenB
Guru

The WD40EFAX might be part of the equation.  It uses SMR technology (which WD finally admitted last April).  SMR drives slow down significantly during sustained writes, so they aren't great for RAID arrays.  https://www.servethehome.com/wd-red-smr-vs-cmr-tested-avoid-red-smr/2/  Personally I would replace it.  You could reach out to WD, though I'm not sure whether they would exchange it for a WD40EFRX.  Given the controversy, and the fact that you are having some problems with your array, they might be open to that.
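
To illustrate the model check, here is a minimal Python sketch that scans text in the Disk_info.log layout posted earlier in the thread and flags drives whose model matches a known SMR prefix. The `KNOWN_SMR_MODEL_PREFIXES` tuple is an assumption that contains only the WD40EFAX named above, so it is not a complete list, and the function name is made up for illustration.

```python
# Assumption: only the model named in this thread. Extend from the vendor's own list.
KNOWN_SMR_MODEL_PREFIXES = ("WD40EFAX",)

def find_smr_devices(disk_info_text):
    """Return (device, model) pairs whose model matches a known SMR prefix."""
    flagged = []
    device = None
    for line in disk_info_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Device":
            device = value
        elif key == "Model":
            # Model lines look like "WDC WD40EFAX-68JH4N0"; drop the vendor token.
            model = value.split()[-1] if value else ""
            if model.startswith(KNOWN_SMR_MODEL_PREFIXES):
                flagged.append((device, model))
    return flagged

# Two entries from the Disk_info.log excerpt in the first post.
sample = """\
Device:             sdf
Model:              WDC WD40EFAX-68JH4N0
Device:             sda
Model:              WDC WD40EFRX-68N32N0
"""
print(find_smr_devices(sample))
```

With the two drives from this thread, only `sdf` (the EFAX) is flagged; the EFRX is not.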

 


@gpaolo wrote:
Oct 01 01:09:06 NAS4-CASA-GP kernel: BTRFS warning (device md126): Skipping commit of aborted transaction.
Oct 01 01:09:06 NAS4-CASA-GP kernel: BTRFS: error (device md126) in cleanup_transaction:1864: errno=-5 IO failure

This of course is the error.  There is a disk test function on the volume tab that you might want to run.  But it is possible that the IO failure was a timeout of some kind (because of the major write slowdown SMRs can suffer from).
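
For reference, `errno=-5` is the kernel's EIO (generic input/output error), which says that an I/O request failed but not why. A small sketch, assuming the log lines keep the format quoted above, that pulls such entries out of a kernel log and decodes the number (the function name is hypothetical):

```python
import errno
import re

# Matches kernel lines such as:
# "BTRFS: error (device md126) in cleanup_transaction:1864: errno=-5 IO failure"
BTRFS_ERROR = re.compile(
    r"BTRFS: error \(device (?P<device>\w+)\) in (?P<func>\w+):\d+: errno=-(?P<errno>\d+)"
)

def btrfs_errors(log_text):
    """Yield (device, function, errno symbol) for each BTRFS error line."""
    for line in log_text.splitlines():
        match = BTRFS_ERROR.search(line)
        if match:
            code = int(match.group("errno"))
            # errno.errorcode maps numbers to symbolic names such as 'EIO'.
            symbol = errno.errorcode.get(code, str(code))
            yield match.group("device"), match.group("func"), symbol

# The two lines quoted above.
log = """\
Oct 01 01:09:06 NAS4-CASA-GP kernel: BTRFS warning (device md126): Skipping commit of aborted transaction.
Oct 01 01:09:06 NAS4-CASA-GP kernel: BTRFS: error (device md126) in cleanup_transaction:1864: errno=-5 IO failure
"""
print(list(btrfs_errors(log)))
```

The warning line is skipped on purpose; only the `BTRFS: error` entry carries an errno.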

 

While it is likely that you could recover the volume with ssh, the best approach is to rebuild the array with a factory reset.  If you are skilled with ssh, we can provide you with some steps to recover the array.

 

BTW, it isn't a great idea to post the log zip file publicly, as there is some leakage of private information.  I've taken the liberty of redacting your download link.

 

 

Message 6 of 13
gpaolo
Luminary

Thank you Stephen, you are very helpful as usual.

I became aware of the EFAX story too late to send back the drive, but I might try to contact WD as you suggest.

Actually, since the drives are new, I was thinking of moving them to a USB unit and using them for backup of the NAS, and getting two new drives... 

I have the disk test in progress. Do you know whether, if it is successful, it will unlock the volume? 

Thanks also for the link of the log packet, you are right.

Message 7 of 13
gpaolo
Luminary


@StephenB wrote:

 

While it is likely that you could recover the volume with ssh, the best approach is to rebuild the array with a factory reset.  If you are skilled with ssh, we can provide you with some steps to recover the array.

 

 


Honestly, I would hope to avoid another factory reset. I had to do one a few months ago when the previous volume failed, and this time the other volume is OK. And then I would need to do yet another one if I change the drives, I guess. 

If there is a way to try to recover the volume I would like to try it, please. Of course, once the backup is finished.

Message 8 of 13
StephenB
Guru


@gpaolo wrote:

 

I have the disk test in progress. Do you know whether, if it is successful, it will unlock the volume? 

 


It won't.

 

As far as volumes go, there are three different categories of problems:

  • disk failures
  • RAID synchronization failures
  • file system corruption.

These are often linked (disk failures can lead to RAID synchronization failures, or to file system corruption). 

 

In your case, the file system corruption is what made the volume go read-only.  That corruption remains, and a passed disk test won't undo it.  As I mentioned earlier, it is possible that you could manually recover the volume with ssh. But there could be some lost data if you do that.  The best approach is to rebuild the array and restore from the backup.

Message 9 of 13
gpaolo
Luminary


@StephenB wrote:

In your case, the file system corruption is what made the volume go read-only.  That corruption remains, and a passed disk test won't undo it.  As I mentioned earlier, it is possible that you could manually recover the volume with ssh. But there could be some lost data if you do that.  The best approach is to rebuild the array and restore from the backup.

 

 


Ok, I'll do as you suggest. I have already ordered a new set of disks (Seagate this time...) and I will use these two WD as external backup, they should be fine working by themselves. 

Two questions: the volume with the problems includes the first and the second disk of the NAS. If I remove those, am I also going to lose the OS and have to restart from scratch? Is the second volume also going to be affected?

All applications have disappeared now: is there any way to re-enable them so I can try to save the settings and avoid the complete reconfiguration of all services?

 

Thanks

Message 10 of 13
StephenB
Guru


@gpaolo wrote:

Two questions: the volume with the problems includes the first and the second disk of the NAS. If I remove those, am I also going to lose the OS and have to restart from scratch? Is the second volume also going to be affected?

The OS partition is present on all your disks, and is mirrored (so it should be up to date on all disks, and the NAS should boot without the first two).
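
One way to see this for yourself is to look at which partitions back each md array in `/proc/mdstat`. Below is a rough Python sketch that parses that format; the sample text is hypothetical (not taken from this NAS), and treating `md0` as the OS array is an assumption here, not something stated in this thread. It only handles lines for active arrays.

```python
import re

def md_members(mdstat_text):
    """Map each active md array name to (raid level, sorted member partitions)."""
    arrays = {}
    for line in mdstat_text.splitlines():
        match = re.match(r"^(md\d+) : (\w+) (\w+) (.+)$", line)
        if match:
            name, _state, level, members = match.groups()
            # Members look like "sda1[0]"; keep only the partition name.
            arrays[name] = (level, sorted(m.split("[")[0] for m in members.split()))
    return arrays

# Hypothetical sample for a four-disk unit, NOT real output from this thread.
sample = """\
Personalities : [raid1]
md127 : active raid1 sdc3[0] sdd3[1]
      3902168832 blocks super 1.2 [2/2] [UU]
md0 : active raid1 sda1[0] sdb1[1] sdc1[2] sdd1[3]
      4190208 blocks super 1.2 [4/4] [UUUU]
"""
for name, (level, parts) in md_members(sample).items():
    print(name, level, parts)
```

If the array holding the OS lists a partition from every disk, pulling the first two disks still leaves mirrored copies on the others.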

 


@gpaolo wrote:

All applications have disappeared now: is there any way to re-enable them so I can try to save the settings and avoid the complete reconfiguration of all services?

 


Apps on the other hand are actually installed on one of the data volumes (generally the first one).  So are the home folders (if you use them).  Unfortunately Netgear doesn't provide a way to back up the apps.  But it is possible to copy them to a folder on the other volume.
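
A minimal sketch of such a copy in Python, assuming you already know where the app folder lives and where the other volume is mounted. Both paths in the commented usage line are hypothetical placeholders, and the function name is made up. Note that `shutil.copytree` keeps timestamps and permissions but not file ownership, so a tool run as root over ssh may be the better choice if ownership matters.

```python
import shutil
from pathlib import Path

def copy_app_folder(src, dst):
    """Copy src into a new folder dst, keeping symlinks as links; return file count."""
    src, dst = Path(src), Path(dst)
    if dst.exists():
        raise FileExistsError(f"{dst} already exists; pick a fresh target folder")
    # symlinks=True copies links as links instead of following them.
    shutil.copytree(src, dst, symlinks=True)
    return sum(1 for p in dst.rglob("*") if p.is_file())

# Hypothetical paths: adjust to your own app folder and second volume.
# copy_app_folder("/apps", "/Volume2/apps-backup")
```

Refusing to overwrite an existing target is deliberate, so an older copy is never silently mixed with a newer one.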

 
Message 11 of 13
gpaolo
Luminary

Ok, understood.

Well, I'm glad I didn't even try something as harmless as a reboot before making a second backup. Now the volume is completely gone and marked as inactive. All applications are lost (sigh, reconfiguring everything again...).

Well, I admit it's rather scary that something that is not even a hardware failure causes the loss of the entire RAID1 volume. One broken drive caused less harm than whatever happened here.

Message 12 of 13
gpaolo
Luminary

Well, end of the story. For posterity:

 

When it happens, do not attempt anything except a backup copy. 

Files should be OK (they were all fine in my case, no data loss). As someone else said in another similar thread, Office files won't open, but I think that is because Office tries to lock them and cannot write to the share; if you copy them somewhere else they will be OK.

But don't do anything else: after I tried a reboot, the volume was completely gone, together with the data. 

If this failure happens in the volume that includes the first disk, applications are lost too. Lesson learned here (it took a while): back up all configurations to a share, which is itself backed up.

My two six-month-old WD HDDs have now been moved to external USB units for automatic backup, and two new Seagate IronWolf disks are in.

Rebuilding is in progress (my personal discovery: copying files from the backup USB unit to the volumes is a lot faster if done over SSH). Let's hope it will last for a while...

Thanks again to @StephenB for the support!

Message 13 of 13