
RN424 , 4x 8TB WD Red Plus failing after 120 days?

berillio
Aspirant

RN424 , 4x 8TB WD Red Plus failing after 120 days?

This problem is with an RN424 with 4x 8TB WD80EFAX, all identical HDs purchased at the same time as the 424. The system runs 6.10.4 Hotfix1 and has been up for 120 days; I had not even transferred all the files onto it yet.

There are also two other RNs, RN214a and RN214b, both with 4x 4TB (OS is 6.10.3), but I was running out of space hence the 424 earlier this year: I bit the bullet and got bigger disks in what is (supposedly) a better chassis.

 

This morning I thought that I could do some maintenance and started transferring files across. I did some folders first (~65 GB), then selected another batch (~435 GB).

 

Some time later I saw that I received two mails from RN424 – volume degraded.

“Disk Model:WDC WD80EFAX-68KNBN0 Serial:VGJL3SDG was removed from Channel 2 of the head unit.”

 

I cannot say for certain, but my guess is that the failure happened after the first file operation finished.

 

I interrupted the copying and downloaded the logs. And this is where I am now.

NO BACKUP in existence – some doubling of data here and there, but not for the RN424 data

Opening the front of the 424, I see the red light on what I call disk 2 (disks 1, 2, 3, 4 counting from the left). I am not familiar with the 424 front panel at all.

 

The RN424 has ~14.03TB used and 7.74TB free.

The RN214a has ~7.71TB used and 3.19TB free

The RN214b has ~4.22TB used, 4.97TB of snapshots and 1.77TB free
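The used and free figures above can be sanity-checked against the raw disk sizes. The sketch below assumes each NAS runs a single-redundancy volume (one disk's worth of parity, as with RAID 5) and that the admin page reports binary TiB while labelling them "TB"; neither assumption is confirmed in this thread, and `usable_tib` is a hypothetical helper name.

```python
TIB = 2**40  # bytes in one tebibyte

def usable_tib(disk_count, disk_tb, parity_disks=1):
    """Usable space in TiB for equal-size disks with `parity_disks` of redundancy.

    Assumes a RAID 5 style layout; real volumes lose a little more to the
    OS and swap partitions.
    """
    data_disks = disk_count - parity_disks
    return data_disks * disk_tb * 10**12 / TIB

print(f"RN424, 4 x 8 TB: {usable_tib(4, 8):.2f} TiB")
print(f"RN214, 4 x 4 TB: {usable_tib(4, 4):.2f} TiB")
print(f"RN424 used + free as reported: {14.03 + 7.74:.2f}")
print(f"RN214a used + free as reported: {7.71 + 3.19:.2f}")
```

Under those assumptions the computed sizes land within a fraction of a percent of the reported totals, which suggests the volumes were the expected size before the disk dropped out.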

(I must say that I really understand NOTHING about snapshots – my most precious data is on RN214a, and this was something I wanted to read about and sort out, but I haven’t had the time to do so yet – in fact I had forgotten about it)

 

Is it normal for a disk to fail after 120 days of usage? The last message I received from RN424 was the balancing operation, last Monday, no warnings.

Status.log simply says

[21/07/30 12:20:38 GMT] warning:disk:LOGMSG_DELETE_DISK Disk Model:WDC WD80EFAX-68KNBN0 Serial:VGJL3SDG was removed from Channel 2 of the head unit.

[21/07/30 12:20:41 GMT] warning:volume:LOGMSG_HEALTH_VOLUME Volume data health changed from Redundant to Degraded.
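For anyone wanting to scan a long status.log for similar events, here is a minimal Python sketch that pulls the timestamp, serial and channel out of lines shaped like the two above. The pattern is inferred only from this excerpt (including the assumption that the date is YY/MM/DD), so other log lines may not match, and the function names are hypothetical.

```python
import re
from datetime import datetime

# Layout inferred from the two quoted lines only (assumption).
LINE_RE = re.compile(
    r"\[(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) GMT\] "
    r"(?P<level>\w+):(?P<source>\w+):(?P<code>\w+) (?P<message>.*)"
)
DETAIL_RE = re.compile(r"Serial:(\S+) was removed from Channel (\d+)")

def parse_status_line(line):
    """Return a dict for one status.log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    # 21/07/30 is read as 30 July 2021 (YY/MM/DD).
    event["ts"] = datetime.strptime(event["ts"], "%y/%m/%d %H:%M:%S")
    return event

def disk_removals(lines):
    """Yield (timestamp, serial, channel) for every LOGMSG_DELETE_DISK event."""
    for line in lines:
        event = parse_status_line(line)
        if event and event["code"] == "LOGMSG_DELETE_DISK":
            found = DETAIL_RE.search(event["message"])
            if found:
                yield event["ts"], found.group(1), int(found.group(2))

sample = [
    "[21/07/30 12:20:38 GMT] warning:disk:LOGMSG_DELETE_DISK Disk Model:WDC WD80EFAX-68KNBN0 Serial:VGJL3SDG was removed from Channel 2 of the head unit.",
    "[21/07/30 12:20:41 GMT] warning:volume:LOGMSG_HEALTH_VOLUME Volume data health changed from Redundant to Degraded.",
]
for ts, serial, channel in disk_removals(sample):
    print(ts.isoformat(), serial, "channel", channel)
```

This only finds the removal events themselves; it says nothing about whether the disk reported errors beforehand.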

 

Diskinfo.log reports on Channel 0,2,3 but misses the one which is “removed”.

Which log has the past reports from the disks (to see if they were warning of the failure)?

How can I be POSITIVE from the logs that the disk is faulty? (I remember reading about errors on other occasions, but not on this one.)

(Yes, I can PHYSICALLY remove the disk and test it.)

 

System-journal.log reports the event (~50 lines), but I cannot interpret it.

 

Is it possible for the RN424 to “misdiagnose” an otherwise healthy disk, or to suffer a hardware fault itself?

 

 

But what should I do now?

  1. Should I switch the RN424 off and take it offline? I can copy the few folders I am working on to RN214b and continue there for some time.
  2. Should I MOVE data back to the other NASes to free space? (That would not fix anything: the volume is degraded because one disk is missing, and it will remain degraded. The only advantage would be that, if there is another HD fault, I would lose less data. A moot point, maybe.)

Many thanks in advance,

Berillio

Model: RN424|ReadyNAS 424 – High-performance Business Data Storage - 4-Bay
Message 1 of 26
StephenB
Guru

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?


@berillio wrote:

This problem is with an RN424 with 4x 8TB WD80EFAX, all identical HDs purchased at the same time as the 424. The system runs 6.10.4 Hotfix1 and has been up for 120 days; I had not even transferred all the files onto it yet.

 

Status.log simply says

[21/07/30 12:20:38 GMT] warning:disk:LOGMSG_DELETE_DISK Disk Model:WDC WD80EFAX-68KNBN0 Serial:VGJL3SDG was removed from Channel 2 of the head unit.

[21/07/30 12:20:41 GMT] warning:volume:LOGMSG_HEALTH_VOLUME Volume data health changed from Redundant to Degraded.

 

(Yes, I can PHYSICALLY remove the disk and test it.)

 

Is it possible for the RN424 to “misdiagnose” an otherwise healthy disk, or to suffer a hardware fault itself?

 


Disks can fail at any time, so it would be useful to test it in the PC.    

 

The bay in the NAS might also have failed - if you have a spare disk, you might try doing a factory install (with only the spare disk inserted, in bay 1).  Then power down and move the disk to bay 2.  Power up, and make sure it works. If you try this test, label the disks by slot as you remove them.

 


@berillio wrote:

 

But what should I do now?

  1. Should I switch the RN424 off and take it offline? I can copy the few folders I am working on to RN214b and continue there for some time.
  2. Should I MOVE data back to the other NASes to free space? (That would not fix anything: the volume is degraded because one disk is missing, and it will remain degraded. The only advantage would be that, if there is another HD fault, I would lose less data. A moot point, maybe.)

I suggest running the disk test on the RN424 before you copy new data onto it (look on the volume settings wheel for the disk test).  That will take a while, and you can test disk 2 in a Windows PC in parallel.  Once you know the disk health, you can sort out the path forward.

 

Since the volume is degraded, the data is at more risk than your other NAS.  If a second disk fails in the RN424, you will lose the data on it.  So finding a way to back up the critical data on the RN424 would be prudent. 

 

Do you have a backup plan in mind for the RN424?  Ideally you'd move everything onto it, and use the RN214s as backups. 

Message 2 of 26
berillio
Aspirant

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?


I meant"Disks can fail at any time, so it would be useful to test it in the PC. "

OK - "hot removal" from the 424?

 


Message 3 of 26
berillio
Aspirant

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?

Sorry,  ignore that "I meant" before the quote from Stephen B..

Message 4 of 26
StephenB
Guru

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?


@berillio wrote:

I meant"Disks can fail at any time, so it would be useful to test it in the PC. "

OK - "hot removal" from the 424?

 



Even if it's healthy, it's out of sync with the array.  So hot removal is fine.

Message 5 of 26
berillio
Aspirant

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?

The disk is SO dead that neither of the WD utilities (Lifeguard Diagnostic DLGDIAG 1.37 and the “Dashboard”) could even see it.

Seagate SeaTools for Windows saw it, tested it and failed it:

--------------- SeaTools for Windows v1.4.0.7 ---------------

Short DST - Started 30/07/2021 21:04:19

Short DST - Pass 30/07/2021 21:06:06

Identify - Started 30/07/2021 21:21:31

Model: EFAX-68KNBN0

Serial: 152D20337A0C

Firmware: Unknown

Model Number: WDC WD80EFAX-68KNBN0

Serial Number: VGJL3SDG

Firmware Revision: 81.00A81

Drive Capacity: 8.00 TB / 7.28 TiB

Max LBA: 15628053167

Cache Size: 256 MB

Power-On Hours: 2930

Drive Temperature (C/F): 46 / 115

WWN: 5000CCA0BEE46BC8

Sector size (Logical/Physical/Allignment): 512 / 4096 / 0

Rotation rate: 5400 RPM

Form factor: 3.5 inch

Specification Supported: ACS-2

Encryption Support: Not Supported

Security Mode: Supported

SMART: Enabled

Host Protected Area features: Enabled

Advanced Power Management: Enabled

Download Microcode: Segmented, Deferred

Short Generic - Started 30/07/2021 21:24:55

Short Generic - FAIL 30/07/2021 21:26:46
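A couple of the Identify values above can be cross-checked with simple arithmetic. The sketch below recomputes the capacity from Max LBA and the logical sector size, and converts the power-on hours into days; the variable and function names are only illustrative.

```python
# Values copied from the SeaTools "Identify" output above.
MAX_LBA = 15628053167
LOGICAL_SECTOR_BYTES = 512
POWER_ON_HOURS = 2930

def capacity_bytes(max_lba, sector_bytes):
    """LBAs are numbered from 0, so the sector count is max_lba + 1."""
    return (max_lba + 1) * sector_bytes

size = capacity_bytes(MAX_LBA, LOGICAL_SECTOR_BYTES)
print(f"{size / 10**12:.2f} TB")    # decimal terabytes
print(f"{size / 2**40:.2f} TiB")    # binary tebibytes
print(f"{POWER_ON_HOURS / 24:.0f} days powered on")
```

The result agrees with the reported 8.00 TB / 7.28 TiB, and 2930 hours is about 122 days, consistent with the roughly 120 days of uptime mentioned in the first post.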

 

Victoria also saw it, and failed it immediately too.

 

On the 424 I ran the extended disk test as advised. The test completed on Saturday afternoon. I received an email stating that it had completed, with no mention of any warnings or errors, so I presumed it was OK. I have just downloaded the logs, and in volume.log the last line reads:

“data        disk test  2021-07-30 20:54:30  2021-07-31 13:49:00  pass”.

 

The other advice was:

 

“The bay in the NAS might also have failed - if you have a spare disk, you might try doing a factory install (with only the spare disk inserted, in bay 1).  Then power down and move the disk to bay 2.  Power up, and make sure it works. If you try this test, label the disks by slot as you remove them.”

 

Do you mean a factory default as in

https://kb.netgear.com/23114/How-do-I-reset-the-firmware-on-my-ReadyNAS-OS-6-storage-system-to-facto...

 

or  via Option 2 in the boot menu (P97 on the HM manual) – (I presume it is the same operation)

 

I guess that it would format the “spare disk” (I don’t have a free one, but I suppose I can re-clone a 50 GB HD that holds a spare W7 copy)

Model: RN424|ReadyNAS 424 – High-performance Business Data Storage - 4-Bay
Message 6 of 26
berillio
Aspirant

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?

Apologies, I thought I had posted my previous message last week, but I must have prepared the reply and NOT pressed the “Post” button.

Then, of course, I was checking my mail for replies which never came. Sorry!

Update – I eventually managed to find a free-ish SATA disk in one of my PCs, backed it up, and low-level formatted it. I am just about to put it in the RN424 in Slot 1. I imagine it should do a factory default automatically; after that I will switch it off and try the other slots.

Message 7 of 26
berillio
Aspirant

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?

I am having problems posting my replies. I posted my reply 3 hours ago and it never seemed to appear, but I only noticed 1 hour ago. Let's try again.

Model: RN424|ReadyNAS 424 – High-performance Business Data Storage - 4-Bay
Message 8 of 26
StephenB
Guru

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?


@berillio wrote:

 

The other advice was:

 

“The bay in the NAS might also have failed - if you have a spare disk, you might try doing a factory install (with only the spare disk inserted, in bay 1).  Then power down and move the disk to bay 2.  Power up, and make sure it works. If you try this test, label the disks by slot as you remove them.”

 

Do you mean a factory default as in

https://kb.netgear.com/23114/How-do-I-reset-the-firmware-on-my-ReadyNAS-OS-6-storage-system-to-facto...

 

or  via Option 2 in the boot menu (P97 on the HM manual) – (I presume it is the same operation)

 

I guess that it would format the ”spare disk” (I don’t have a free one, but I suppose I can re-clone one 50Gb hd with a spare W7 copy I have)


A factory default is one way.  If the spare disk is unformatted, you can just power up the NAS with it installed, and it will automatically do a factory default (no boot menu action required).

 

It would be destructive. 

 

Not sure if you need to do this, since you found a failed disk.

 

WDC will likely give you a recertified drive, though given the quick failure you might be able to talk them into a new replacement.  You could also see if the seller will exchange it (unlikely given the 120 days, but maybe worth a shot).

Message 9 of 26
elaplace
Aspirant

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?

This often happens, and it is normal. I mean that WD RED disks from the same batch can fail with high probability. The NAS players around me accept it very early, and are not friendly to the WD RED.

Message 10 of 26
StephenB
Guru

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?


@elaplace wrote:

This often happens, and it is normal. I mean that WD RED disks from the same batch can fail with high probability. The NAS players around me accept it very early, and are not friendly to the WD RED.


AFAICT we are seeing just one drive failure here, not multiple.  Usually when I see multiple early failures I am thinking there might have been damage in storage or shipment.  But failures at ~120 days are somewhat unusual - in my experience they either fail out-of-the-box or run quite a bit longer.  In any event, they can and do fail at any time (which is why you need backups).

 

FWIW, the WD80EFAX is a WD Red Plus drive, not a WD Red.  WD rebranded about a year ago.  The current WD Red models are all SMR, the Red Plus and Red Pro models are CMR.  I don't recommend SMR for ReadyNAS.  BTW, Seagate Barracudas are also largely SMR.  

 

I have mostly WD Red Plus drives (mixed with a few Seagates), including three of this particular model - and haven't seen any unusual failure rates with either.  As far as I can tell, the Seagate Ironwolf and WD Red Plus are both reliable, and users here seem satisfied with both lines. In general, NAS-purposed or Enterprise disks are good options for ReadyNAS (other than the SMR WD Reds).

 

Message 11 of 26
elaplace
Aspirant

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?

Yes!

 

In my machine I use Red, Purple and Black drives, and Seagates, checking that they are CMR rather than SMR and consulting the NETGEAR recommendation page:

 

https://kb.netgear.com/20641/ReadyNAS-Hard-Disk-Compatibility-List

 

In the end, within my budget, I prefer the Purple drives and Seagates.

Message 12 of 26
StephenB
Guru

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?


@elaplace wrote:

 

In my machine I use Red, Purple and Black drives, and Seagates, checking that they are CMR rather than SMR and consulting the NETGEAR recommendation page:

 

https://kb.netgear.com/20641/ReadyNAS-Hard-Disk-Compatibility-List

 


Although Netgear will qualify desktop drives, I don't recommend them.  One reason is that over time the manufacturers will change them.  Barracudas for example are now SMR, but they weren't a few years ago. NAS-purposed or Enterprise (excluding those pesky SMR Reds) are safe choices, even if they aren't on the HCL (which tends to lag new drive introductions by many months).

 


@elaplace wrote:

 

In my machine I use the red drive, purple drive, black drive

 

In the end, within my budget, I prefer the Purple drives and Seagates.


FWIW, I used a Black drive (quite a few years ago now), and found it was quite noisy.  But I'm sure the drive tech has changed since then.  In any event, I do check the acoustic and power specs when I am considering a new disk model.

 

Purple (surveillance) drives are tuned for write performance when streaming, so read performance might be somewhat lower than NAS-purposed.  They should work reliably though, and are unlikely to be SMR.

 

Budget is of course a consideration. I do look at prices when I purchase new drives - my most recent purchase was an on-sale Seagate Exos which was actually cheaper than Ironwolf or WD Red Plus.  Sometimes there can be a big delta between Seagate and WD, though usually the gap is pretty narrow.

 

 

Message 13 of 26
berillio
Aspirant

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?

Hello Stephen B, thank you for coming back to us; @ elaplace, thank you for contributing.

Yesterday (Thursday) I carried out a low-level format on a 75 GB HD I took out of another PC. This morning (Friday) I inserted that disk in Slot 1, powered the RN424 up, and it created a volume as RAID1 - JBOD (see fig 1). I then changed the password and NAS name, powered it down, placed the disk in Slot 2, powered it up and it tested fine (see fig 2). Ditto when inserted in Slot 3 and Slot 4. I then returned it to Slot 1… and the RN424 went incommunicado: the page timed out on three different PCs for 4 hours (but this, although strange, may not be relevant – see below).

As the test was successful, I have since removed the test disk and returned the original 8TB disks to their locations (Slots 1, 3 & 4). The RN424 has NOT been powered up again yet because a) I wished to be cast-iron positive that it was the right thing to do without endangering the array, and b) I do NOT wish to use the RN424 in a degraded state; as long as I have space on the other NAS, I will try to avoid using it.

Tomorrow I will try to contact the supplier of both the RN424 and the disks and see what they can do about the failed disk. [[ Update – DONE – but they are closed for the weekend ]]

If I am correct, once I have a replacement 8TB, I should

  1. Power-up the RN424, and I should see the full array in a degraded state
  2. “hot insert” the replacement disk in Slot 2, where it belonged
  3. Wait ~24/48h until the 14TB array is fully rebuilt and the Volume returns to a “redundant” state

With regards to the RN424 going incommunicado, I just realised that it had been using both NICs on a bonded connection (CAT6 to a GS324T); after the factory default the NICs were no longer bonded, and maybe after one reboot I did not check the IP address (until then it had never changed, it was always 192.168.2.63; I was filming every reboot during the Slot 1,2,3,4 tests).

 

@StephenB on the reply to elaplace’s comment

“FWIW, the WD80EFAX is a WD Red Plus drive, not a WD Red”

Apologies, well spotted, TY for the correction.

I am well aware of the difference between CMR and SMR; I returned and swapped two HDs on another NAS.

 

“AFAICT we are seeing just one drive failure here, not multiple.  Usually when I see multiple early failures I am thinking there might have been damage in storage or shipment.  But failures at ~120 days are somewhat unusual - in my experience they either fail out-of-the-box or run quite a bit longer.  In any event, they can and do fail at any time (which is why you need backups).”

 

Three points: a) all 4 disks have the same date of manufacture (22 Dec 2020) and very close SNs (VGJKAN6G, VGJLL44XG and VGJJLL4G are the three good HDs, and VGJL3SDG is the failed one);

b) the “mode” of failure caught me by surprise: in all the failures I had in the past, the disk started showing errors which ramped up – sometimes very quickly. Not in this case; it seems to have been a CATASTROPHIC and sudden failure. I also find it odd that the WD diagnostics did not even SEE the disk;

c) backups – YES, I need a backup strategy. But with 25TB (and growing) of original data, even a new RN214 with 4x 8TB would not be enough for the first backup (and then there would be the incrementals), so I really do not know which strategies are open to me. Any suggestions welcome. Also, I don’t think I have really understood “snapshots” and where and when to use them.

 

@elaplace

“The NAS players around me accept it very early, and are not friendly to the WD RED”

…”not friendly”? I am not sure what you mean…..

Model: RN424|ReadyNAS 424 – High-performance Business Data Storage - 4-Bay
Message 14 of 26
elaplace
Aspirant

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?

I mean that more RED drives are in use, and there are more complaints about them on the forum than about other drives.

Message 15 of 26
StephenB
Guru

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?


@elaplace wrote:

I mean that more RED drives are in use, and there are more complaints about them on the forum than about other drives.


FWIW, I haven't seen more complaints about WD Red Plus drives than other models here.  There have certainly been complaints about SMR drives (both Seagate and WD) over the years.

 

 

 

 

Message 16 of 26
StephenB
Guru

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?


@berillio wrote:

 

c) backups – YES I need a backup strategy. But with 25TB (and growing) original data, even a new RN214 with 4 x8TB would not be enough for the first backup (and then there would be incremental), I really do not know which startegy is open to me. Any suggestions welcomed. Also, I don’t think I really understood “snapshots” and where and when use them.

Larger drives are of course available, and cost-per-TB is approximately level up to 14 TB models at the moment.  Using JBOD on the backup NAS is another option, though you do need to shift shares around occasionally to make sure you have enough free space on all volumes.  And expansion is a pain, since you need to destroy a volume, and then re-do the backup.

 

My volume isn't as large as yours (I have about 14 TiB at the moment).  I do use other ReadyNAS for backup (I happen to use more than one).  One is an RN202 that uses JBOD (a 14 TB drive and a 6 TB drive at the moment).
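The share-shuffling that JBOD requires can be planned ahead of time. Below is a rough sketch using a first-fit-decreasing heuristic: the largest shares are placed first, each on the first volume with enough room. The share names and sizes are hypothetical, the two volume sizes are the nominal 14 TB and 6 TB drives mentioned above (usable space will be somewhat lower), and `place_shares` is an illustrative helper, not anything built into the NAS.

```python
def place_shares(shares_tb, volumes_tb):
    """First-fit-decreasing placement of shares onto independent volumes.

    Returns (layout, free): layout maps volume name to a list of share names,
    free maps volume name to remaining TB. Raises ValueError if a share
    does not fit on any volume.
    """
    free = dict(volumes_tb)
    layout = {name: [] for name in volumes_tb}
    biggest_first = sorted(shares_tb.items(), key=lambda kv: kv[1], reverse=True)
    for share, size in biggest_first:
        for volume in free:
            if size <= free[volume]:
                free[volume] -= size
                layout[volume].append(share)
                break
        else:
            raise ValueError(f"no volume has {size} TB free for {share!r}")
    return layout, free

# Hypothetical share sizes in TB.
shares = {"video": 7.5, "photos": 4.0, "documents": 1.5, "archive": 5.0}
volumes = {"vol14": 14.0, "vol6": 6.0}
layout, free = place_shares(shares, volumes)
print(layout)
print(free)
```

A plan like this leaves no headroom on the larger volume, which illustrates why shares need to be moved again as they grow.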

 


@berillio wrote:

 

 I then returned it to Slot 1… and the RN424 went incommunicado, the page timed out on three different PCs for 4 hours (but this, although strange, may not be relevant – see below).

...

With regards to the RN424 going incommunicado, I just realised that it was using both NIC on a bonded connection (CAT6 to a GS324T); but after the  factory default, the NIC were not bonded anymore and maybe after a reboot I did not check the IP address (which until then never changed, it was always 192.168.2.63, I was filming every reboot during the Slot 1,2,3,4 tests)

Loss of bonding could have been a factor, especially since the switch was still assuming a LAG. That's not a scenario I've tested.

 

It might be good to grab the log zip file from that setup, if you still have that disk available.

 


@berillio wrote:

 

Tomorrow I will try to contact the supplier of both RN424 and see what they can do about the failed disk. [[ Update – DONE – but they are closed for the weekend ]]

 It sounds like you purchased the NAS with disks installed - is that the case?   Were the disks provided by Netgear or the reseller?  Either way, you should contact the seller before trying to exchange the disk with WD.

 


@berillio wrote:

 

If I am correct, once I have a replacement 8TB, I should

  1. Power-up the RN424, and I should see the full array in a degraded state
  2. “hot insert” the replacement disk in Slot 2, where it belonged
  3. Wait ~24/48h until the 14TB array is fully rebuilt and the Volume return to a “redundant” state

 


 Correct.

 


@berillio wrote:

the “mode” of failure caught me by surprise, in all the failures I had in the past, the disk started showing errors which ramped up – sometimes very quickly. Not in this case, it seemed to have been a CATASTROPHIC and sudden failure. I also find odd that the WD diagnostics did not even SEE the disk

I have seen this sometimes.  If the controller circuitry on the disk fails, then it can be catastrophic - a different failure mode than the more common one of running into increasing bad sector counts or timeouts.  A significant percentage of drive failures happen with no warning whatsoever (36% in a rather old Google study, the numbers might be somewhat better now).

Message 17 of 26
berillio
Aspirant

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?

@stephen,

“It might be good to grab the log zip file from that setup, if you still have that disk available.”

 

I do, just DL the zip with the logs

When I tested the 424 with the spare disk, I changed the NAS name, which I can see on the log page. You can see the consecutive shutdowns from when I was swapping slots (in the pic).

 

But I did not change the time to UK, all the entries started @3:46 am, that was probably 9:46am

 

And I also have the log of the 424 after it failed from which you could see the bonding I had ( cannot remember now ).

 

Incidentally, I spent quite a bit of time trying to get the double bonding sorted out, without any success. I had the bond (partially) working on the 424, meaning that I had no increase in performance over the single NIC; the 214s, instead, would simply drop off the network within half a minute of my setting it, so I am using a single NIC.

And that was while following all the advice I could find, from  three or four posts (mostly yours) – but it never worked.

 

It is all documented somewhere ( right now I don’t remember any of that ).

 

I planned to start a new discussion trying to get to the bottom it, but I never had the time.

Model: RN424|ReadyNAS 424 – High-performance Business Data Storage - 4-Bay
Message 18 of 26
StephenB
Guru

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?


@berillio wrote:

I do, just DL the zip with the logs

 


You can send me a PM (private message) using the envelope icon at the top right of the forum page, with a download link to the logs (google drive, dropbox, etc), and I could take a look at it.  Don't post the download link publicly.

 


@berillio wrote:

I had the bond (partially) working on the 424, meaning that I had no increase in performance from the single NIC


What bonding mode are you using?

 

It's hard to see any consistent gain in performance unless you have a lot of clients accessing the NAS simultaneously.  Two often isn't enough.
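One way to picture why a couple of clients rarely benefit: with hash-based bonding modes, every frame between a given pair of MAC addresses is sent down the same physical link. The sketch below uses a deliberately simplified layer-2 style hash (XOR of the last byte of each MAC, modulo the number of links). The real policy depends on the bonding mode and on the switch, and the MAC addresses here are made up.

```python
def link_for(src_mac, dst_mac, link_count=2):
    """Simplified layer-2 hash: XOR the last byte of each MAC, modulo link count.

    Illustrative only; actual bonding drivers and switches may hash differently.
    """
    src_last = int(src_mac.split(":")[-1], 16)
    dst_last = int(dst_mac.split(":")[-1], 16)
    return (src_last ^ dst_last) % link_count

NAS = "aa:bb:cc:00:00:10"  # hypothetical MAC addresses
clients = ["aa:bb:cc:00:00:21", "aa:bb:cc:00:00:22", "aa:bb:cc:00:00:24"]
for mac in clients:
    print(mac, "-> link", link_for(mac, NAS))
```

In this toy example a single client never uses more than one link, and two of the three clients happen to land on the same link, so aggregate throughput only improves once enough clients are spread across both.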

 

 

Message 19 of 26
berillio
Aspirant

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?

Just a quick update – I got in touch with the vendor (from whom I bought the RN424 and the 4 HDs). They recommended doing the RMA directly with WD, which I did. I got the reply with the RMA number (RMA Date: 08/11/2021 03:51 PM) and sent them my failed disk that afternoon; WD should have received it the next day.

 

Update #2  WD dispatched a replacement disk on 21 August (Saturday morning @5:02). Unfortunately UPS failed to recognise the tracking number during the weekend, and also on Monday morning, but on Monday afternoon I received more mails following the dispatch, with delivery expected by the end of 24 August.

 

Update #3 Received the replacement from WD on Wednesday 25 August 2021~9:00 am

I think I read somewhere that StephenB tests his replacement HDs, so I put it on test using DLGDIAG – estimated time required ~66h

 

Update #4 Friday 27 ~19:30. DLGDIAG fails the replacement HD  “Too many bad sectors detected”. I immediately filed a “Question” on the WD Support (could not find any specific way of “connecting” to the RMA).

 

StephenB, Thank You !!!

 

Update #5 Wednesday 1 September ~19:05 “Incident Update” reply from WD

 

“……..I am sorry to hear that your replacement device was faulty well; relevant teams are informed about this issue and we will continue assisting you until we provide a solution.

 

It may take time to hear back from relevant teams about how to proceed but we will do our best to conclude this issue as fast as possible. ………”

 

Update #6  Thursday 2 September @18:05 (UK time)

 

On 30 July , when the incident occurred, the situation was:

RN424 has ~14.03TB used and 7.74TB free.

RN214a has ~7.71TB used and 3.19TB free

RN214b has ~4.22TB used, 4.97TB of snapshots and 1.77TB free

 

On 2 September, the situation is

RN424 – unprotected volume, NAS is offline. I did not copy/move any conspicuous amount of data, just  the end of the folders in use, so the usage can be considered to be the same as above

RN214a – 8.35TB data, 138GB snapshots, 2.42 TB free of 10.90TB

RN214b – 5.73TB data, 955.54GB snapshots, 4.24 TB free of 10.90TB

 

Update #7 Monday 4 October (yes, it has been a month)

WD is still trying to sort out a replacement disk for me – there have been some wishful emails on their part, especially because my spare NAS has gone to 30% “space left”, then to 20% “space left”.

But I cannot wait any longer, so I bought another 8TB WD Red Plus, CMR 7200rpm (as well as a UPS, which was needed)

 

Update #8 Tuesday 5 October – Parcel delivered Hurray…..

Unfortunately I just checked and the WD80EFBX (supposedly the replacement for the WD80EFAX) is NOT in the compatibility list…….

Message 20 of 26
Sandshark
Sensei

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?

The compatibility list is a guide, and has not been updated in a while.  The WD80EFBX is completely compatible -- one of my NAS has 4 of them.

Message 21 of 26
berillio
Aspirant

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?

Brilliant, thanks, pfeeww

I'll power the 424, which has been offline since the event, and "hot insert" the new disk.

Cheers !

Message 22 of 26
berillio
Aspirant

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?

Hi all, sorry for the delays

 

Tue 5 Oct at 21:55

Disk Model:WDC WD80EFBX-68AZZN0 Serial:VRGRW69K was added to Channel 2 of the head unit.

 

Tue 5 Oct at 21:55

Resyncing started for Volume data.

 

Wed 6 Oct at 15:30

Volume data is resynced.

 

Wed 6 Oct at 15:30

Volume data health changed from Degraded to Redundant.
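For reference, the resync time can be worked out from the four log entries above. The sketch assumes the year is 2021 (taken from the other dates in this thread) and that the whole nominal 8 TB of the new disk was written, which gives only a rough average rate.

```python
from datetime import datetime

start = datetime(2021, 10, 5, 21, 55)  # "Resyncing started for Volume data."
end = datetime(2021, 10, 6, 15, 30)    # "Volume data is resynced."
elapsed = end - start

# Assumption: the full nominal capacity of the 8 TB disk was rewritten.
DISK_BYTES = 8 * 10**12
rate_mb_s = DISK_BYTES / elapsed.total_seconds() / 10**6

print("elapsed:", elapsed)
print(f"average rate: ~{rate_mb_s:.0f} MB/s")
```

That is 17 hours 35 minutes, comfortably inside the 24 to 48 hours estimated earlier in the thread.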

 

Pfeww Hurray

Thank you to all for the support

 

 

Added note – WD has not replaced the failed disk yet.

When I receive it (if I do), I could get another NAS, maybe a 2-bay one, and pair it with another 8TB disk in RAID 1, purely for backup duties.

Unfortunately NETGEAR seems to have disappeared from the UK market, for reasons I cannot really fathom.

One simple question – in the event of a NAS failure – can a RAID1 disk be read by any LINUX machine or would I need recovery software (which is kind of pricey)?

Message 23 of 26
StephenB
Guru

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?


@berillio wrote:

in the event of a NAS failure – can a RAID1 disk be read by any LINUX machine or would I need recovery software (which is kind of pricey)?


Yes.  You would need to install mdadm and btrfs on the linux system.  But no paid software packages are needed. 
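As a rough illustration of what that involves, the sketch below only builds and prints the commands one might run as root on the Linux machine; it does not execute anything. The array name `/dev/md127` and the mount point are assumptions, so check `/proc/mdstat` for the real array name after assembly. A degraded array may also need mdadm's `--run` option before it will start.

```python
import shlex

def recovery_commands(md_device="/dev/md127", mount_point="/mnt/readynas"):
    """Build (but do not run) commands for a read-only mount of the data volume.

    md_device and mount_point are hypothetical defaults, not values taken
    from this thread.
    """
    return [
        ["mdadm", "--assemble", "--scan"],  # assemble arrays found on attached disks
        ["cat", "/proc/mdstat"],            # list the arrays that were assembled
        ["mkdir", "-p", mount_point],
        # Mount read-only so nothing on the recovered disk is modified.
        ["mount", "-t", "btrfs", "-o", "ro", md_device, mount_point],
    ]

for argv in recovery_commands():
    print(shlex.join(argv))
```

Mounting read-only is a deliberate choice here: the aim is to copy the data off, not to repair or change the volume.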

Message 24 of 26
berillio
Aspirant

Re: RN424 , 4x 8TB WD Red Plus failing after 120 days?

@StephenB wrote:

Yes.  You would need to install mdadm and btrfs……….

Thank you 🙂

 

Updates

Wednesday 13 October 2021 ~ 15:30

UPS parcel, WD80EFAX  S.N. VGGW8JKK ; 256 MB cache 5400 rpm (? I thought it was 7200rpm..?)

Disk goes on test with DLGDIAG, Quick test  Passed, starting Extended test (on my USB system ~60-70 h)

 

Thursday 14 October 2021 @14:02 – Too Many Bad Sectors (after 22h10m, 44h49m remaining)

 

Mailed WD with the screencap of DLGDIAG

 

Did somebody put a Jinx on me ???

   😞  😞  😞

Message 25 of 26