
RN312 random volume degraded with no indication why

Regis-IT
Aspirant

My RN312 (2x 3TB, RAID 1) is now reporting the volume as degraded, but the interface gives no clue as to why. I can see a picture of two drives, one grey and one blue, both with green "LEDs". Hovering over each shows it online with no apparent errors.

RegisIT_0-1720435512074.png

I'm guessing the grey drive is the issue, but there is no indication of why it is grey. Settings > RAID shows a graphic with no explanation, so it tells me nothing:

RegisIT_2-1720435957637.png

 

From what I've read in other posts, the mdstat.log file is important, but with no explanation of what it contains it is difficult to determine what it is telling us. Can anyone help decipher this log?

 

Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md127 : active raid1 sdb3[1]
2925414784 blocks super 1.2 [2/1] [_U]
bitmap: 3/22 pages [12KB], 65536KB chunk

md1 : active raid1 sda2[0] sdb2[1]
523712 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sda1[0] sdb1[1]
4190208 blocks super 1.2 [2/2] [UU]

unused devices: <none>
/dev/md/0:
Version : 1.2
Creation Time : Wed May 27 09:52:05 2015
Raid Level : raid1
Array Size : 4190208 (4.00 GiB 4.29 GB)
Used Dev Size : 4190208 (4.00 GiB 4.29 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent

Update Time : Mon Jul 8 11:07:33 2024
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

Consistency Policy : unknown

Name : 43f63a50:0 (local to host 43f63a50)
UUID : 25dfb646:f48bce33:f99a4586:d7fd9509
Events : 70

Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
/dev/md/1:
Version : 1.2
Creation Time : Wed May 27 09:52:05 2015
Raid Level : raid1
Array Size : 523712 (511.44 MiB 536.28 MB)
Used Dev Size : 523712 (511.44 MiB 536.28 MB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent

Update Time : Mon Jul 8 09:53:20 2024
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

Consistency Policy : unknown

Name : 43f63a50:1 (local to host 43f63a50)
UUID : 07b53467:882f59f7:3260327d:9e432f63
Events : 22

Number Major Minor RaidDevice State
0 8 2 0 active sync /dev/sda2
1 8 18 1 active sync /dev/sdb2
/dev/md/data-0:
Version : 1.2
Creation Time : Wed May 27 09:52:05 2015
Raid Level : raid1
Array Size : 2925414784 (2789.89 GiB 2995.62 GB)
Used Dev Size : 2925414784 (2789.89 GiB 2995.62 GB)
Raid Devices : 2
Total Devices : 1
Persistence : Superblock is persistent

Intent Bitmap : Internal

Update Time : Mon Jul 8 11:02:02 2024
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0

Consistency Policy : unknown

Name : 43f63a50:data-0 (local to host 43f63a50)
UUID : 64d9cf7c:b71d6e4c:d9055d9c:442c31dd
Events : 6716

Number Major Minor RaidDevice State
- 0 0 0 removed
1 8 19 1 active sync /dev/sdb3

 

I'm guessing the last three lines are important. They suggest to me that one drive has simply dropped out of the array, despite there being nothing apparently wrong with the drive. Here's disk_info in case it helps:

 

Device: sda
Controller: 0
Channel: 0
Model: WDC WD30EFRX-68EUZN0
Serial: WD-WCC4N7VXZF16
Firmware: 82.00A82W
Class: SATA
RPM: 5400
Sectors: 5860533168
Health data
ATA Error Count: 0
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 31
Start/Stop Count: 15910
Power-On Hours: 58939
Power Cycle Count: 23
Load Cycle Count: 15957

Device: sdb
Controller: 0
Channel: 1
Model: WDC WD30EFRX-68EUZN0
Serial: WD-WCC4N5VPU47D
Firmware: 82.00A82W
Class: SATA
RPM: 5400
Sectors: 5860533168
Pool: data
PoolType: RAID 1
PoolState: 3
PoolHostId: 43f63a50
Health data
ATA Error Count: 0
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 31
Start/Stop Count: 16592
Power-On Hours: 33884
Power Cycle Count: 23
Load Cycle Count: 16634

 

 

Any assistance would be greatly appreciated. I don't currently have physical access to this device.

 

Thanks.

Message 1 of 6

All Replies
StephenB
Guru

Re: RN312 random volume degraded with no indication why


@Regis-IT wrote:

 

Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md127 : active raid1 sdb3[1]
2925414784 blocks super 1.2 [2/1] [_U]


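This is telling you that the first disk (sda, serial WD-WCC4N7VXZF16) has dropped out of the array: [2/1] means the array expects two members but only one is active, and the underscore in [_U] marks the missing first slot.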

 

It's not enough to tell you why that happened (or exactly when).  There could be more information on that in

  • dmesg.log
  • kernel.log
  • readynasd.log
  • status.log
  • system.log
  • systemd-journal.log

but as you say, it can be hard to interpret.  If you want me to take a look, you could put the entire zip into cloud storage and send me the link in a PM (private message) using the envelope icon in the upper right of the forum page.  Make sure the permission is set so anyone with the link can download.
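For anyone who wants to check the [2/1] [_U] reading on their own mdstat.log, here is a minimal Python sketch. It is not a Netgear tool; the function name degraded_arrays is made up for illustration, and it assumes the log has the same layout as the excerpt quoted above.

```python
import re

def degraded_arrays(mdstat_text):
    """Return {array_name: [indices of missing slots]} for degraded md arrays."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        # Array header, e.g. "md127 : active raid1 sdb3[1]"
        header = re.match(r"^(md\S+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status line, e.g. "... [2/1] [_U]": expected/active counts,
        # then one flag per slot ("U" = up, "_" = missing)
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            flags = status.group(3)
            missing = [i for i, flag in enumerate(flags) if flag == "_"]
            if missing:
                result[current] = missing
    return result

# Excerpt taken from the mdstat.log posted above
sample = """\
md127 : active raid1 sdb3[1]
2925414784 blocks super 1.2 [2/1] [_U]
md1 : active raid1 sda2[0] sdb2[1]
523712 blocks super 1.2 [2/2] [UU]
"""

print(degraded_arrays(sample))  # {'md127': [0]}
```

Slot 0 missing on md127 matches the "removed" entry for RaidDevice 0 at the end of the mdadm detail for /dev/md/data-0.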

Message 2 of 6
Regis-IT
Aspirant

Re: RN312 random volume degraded with no indication why

Many thanks @StephenB, PM sent.

Message 3 of 6
StephenB
Guru

Re: RN312 random volume degraded with no indication why


@Regis-IT wrote:

Many thanks @StephenB, PM sent.


dmesg.log (among others) is flooded with unrecoverable read errors on disk 1 (sda, serial number WD-WCC4N7VXZF16):

 

[Mon Jul  8 11:07:43 2024] do_marvell_9170_recover: ignoring PCI device (8086:3a22) at PCI#0
[Mon Jul  8 11:07:43 2024] ata1.00: exception Emask 0x0 SAct 0x4000 SErr 0x0 action 0x0
[Mon Jul  8 11:07:43 2024] ata1.00: irq_stat 0x40000008
[Mon Jul  8 11:07:43 2024] ata1.00: failed command: READ FPDMA QUEUED
[Mon Jul  8 11:07:43 2024] ata1.00: cmd 60/01:70:4f:00:90/00:00:00:00:00/40 tag 14 ncq 512 in
         res 41/40:00:4f:00:90/00:00:00:00:00/40 Emask 0x409 (media error)
[Mon Jul  8 11:07:43 2024] ata1.00: status: { DRDY ERR }
[Mon Jul  8 11:07:43 2024] ata1.00: error: { UNC }
[Mon Jul  8 11:07:43 2024] ata1.00: configured for UDMA/133
[Mon Jul  8 11:07:43 2024] sd 0:0:0:0: [sda] tag#14 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Mon Jul  8 11:07:43 2024] sd 0:0:0:0: [sda] tag#14 Sense Key : Medium Error [current] [descriptor] 
[Mon Jul  8 11:07:43 2024] sd 0:0:0:0: [sda] tag#14 Add. Sense: Unrecovered read error - auto reallocate failed
[Mon Jul  8 11:07:43 2024] sd 0:0:0:0: [sda] tag#14 CDB: Read(16) 88 00 00 00 00 00 00 90 00 4f 00 00 00 01 00 00
[Mon Jul  8 11:07:43 2024] blk_update_request: I/O error, dev sda, sector 9437263

 

So you need to replace this disk.
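If you want to scan your own dmesg.log for the same pattern, here is a minimal Python sketch. It is not part of the ReadyNAS firmware, and the function names count_io_errors and read16_lba are made up for illustration. It counts "I/O error" lines per device and decodes the logical block address from the Read(16) CDB, which should agree with the sector number on the blk_update_request line.

```python
import re
from collections import Counter

def count_io_errors(dmesg_text):
    """Count kernel 'I/O error, dev X, sector N' lines per device."""
    pattern = re.compile(r"I/O error, dev (\w+), sector (\d+)")
    return Counter(m.group(1) for m in pattern.finditer(dmesg_text))

def read16_lba(cdb_hex):
    """Decode the LBA from a Read(16) CDB given as space-separated hex bytes.

    Byte 0 is the opcode (0x88); bytes 2-9 hold the big-endian LBA.
    """
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a Read(16) CDB")
    return int.from_bytes(cdb[2:10], "big")

# Two lines taken from the dmesg.log excerpt above
sample = (
    "[Mon Jul  8 11:07:43 2024] sd 0:0:0:0: [sda] tag#14 CDB: Read(16) "
    "88 00 00 00 00 00 00 90 00 4f 00 00 00 01 00 00\n"
    "[Mon Jul  8 11:07:43 2024] blk_update_request: I/O error, dev sda, sector 9437263\n"
)

print(dict(count_io_errors(sample)))  # {'sda': 1}
print(read16_lba("88 00 00 00 00 00 00 90 00 4f 00 00 00 01 00 00"))  # 9437263
```

A count that keeps climbing for a single device across the full log is the sort of evidence that points at the disk rather than at the array.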

 

WD no longer makes the WD30EFRX.  If you want to stick with WD 3 TB, then good options are the WD30EFPX or WD30EFZX (WD Red Plus drives that replace the WD30EFRX).  Avoid the WD Red version (WD30EFAX), as it is SMR and not a good option for RAID.

 

There's no problem with mixing WD and Seagate in the same array so you can also get the Seagate Ironwolf (ST3000VN006). 

 

If you get a larger drive, you won't get additional space until you also replace sdb with a drive of that larger size.
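The reason is that a two-disk mirror can only hold as much as its smaller member. A tiny Python sketch of that rule follows; the 6 TB figure is a hypothetical example and the function name is made up for illustration.

```python
def raid1_usable_tb(member_sizes_tb):
    """A RAID 1 mirror is limited by its smallest member."""
    return min(member_sizes_tb)

print(raid1_usable_tb([3, 3]))  # current pair of 3 TB drives: 3
print(raid1_usable_tb([6, 3]))  # only sda replaced with a hypothetical 6 TB drive: 3
print(raid1_usable_tb([6, 6]))  # both drives replaced: 6
```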

 

Netgear recommends making sure you have an up-to-date backup before manipulating disks, and I agree.  At the moment your array is unprotected, and that will continue until the problem drive is replaced and synced to drive 2. 

Message 4 of 6
Regis-IT
Aspirant

Re: RN312 random volume degraded with no indication why

Many thanks - much appreciated. Would be more helpful if this was exposed on the interface rather than buried in log files! Along with logs marrying up so one doesn't say it's clean while another says it's failing! 🙂

Message 5 of 6
StephenB
Guru

Re: RN312 random volume degraded with no indication why


@Regis-IT wrote:

Would be more helpful if this was exposed on the interface rather than buried in log files! 


Yes.  Netgear didn't do a very good job on reporting disk status.

Message 6 of 6