NetGear ReadyNAS RN3138 Disk Failure
Hi All,
We have a NetGear ReadyNAS 3138 fitted with 4 x 6TB Western Digital NASware disks.
This is now the second time we have had disk failure on disk 2 and disk 4 causing us to lose the entire volume configured in RAID 5.
Previously, support told us that the problem was caused by the iSCSI volume reaching 80% capacity and the condition not being resolved immediately. Since then we have monitored the device closely. On the 2 occasions when we were notified that the iSCSI volume had reached 80%, the volume was expanded immediately to clear the error. Despite this the same two disks failed and we lost the data on the volume for a second time.
We suspect that the device itself may be faulty due to the exact same thing happening to the same disks on the device.
Does anyone have any additional information that might point to a reason for this happening?
Thanks
Mark Harris
Re: NetGear ReadyNAS RN3138 Disk Failure
@harris008 wrote:
This is now the second time we have had disk failure on disk 2 and disk 4 causing us to lose the entire volume configured in RAID 5.
...
Despite this the same two disks failed and we lost the data on the volume for a second time.
So you never replaced the disks? Have you looked in system.log and kernel.log for disk-related errors?
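As a rough illustration of that log check, here is a minimal Python sketch that filters a log file for lines that look disk-related. The keyword pattern is an assumption based on typical Linux kernel wording, not something taken from ReadyNAS documentation, so adjust it to whatever actually appears in your system.log and kernel.log.

```python
import re
import sys
from typing import Iterable

# Assumed pattern: common Linux wording for disk trouble (I/O errors,
# medium errors, UNC flags, and libata port messages such as "ata2.00: ...").
DISK_ERROR_PATTERN = re.compile(
    r"I/O error|medium error|\bUNC\b|\bata\d+", re.IGNORECASE
)

def disk_error_lines(lines: Iterable[str]) -> list[str]:
    """Return every line that matches the assumed disk-error pattern."""
    return [line.rstrip("\n") for line in lines if DISK_ERROR_PATTERN.search(line)]

if __name__ == "__main__":
    # Usage: python scan_logs.py system.log kernel.log
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            for hit in disk_error_lines(handle):
                print(f"{path}: {hit}")
```

Matches that keep pointing at the same disk or port across both failures would be worth including in any support case.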
FWIW, I had one 6 TB WD60EFRX fail without any errors showing up in the usual SMART stats. There were disk-related errors in system.log and kernel.log.
Digging deeper, I also saw some UNC errors using smartctl -x via ssh on the NAS.
Error 8 [7] occurred at disk power-on lifetime: 30420 hours (1267 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 21 68 85 60 40 00  Error: UNC at LBA = 0x21688560 = 560498016
You could also look for this signature, and replace any disks that are showing UNC errors (perhaps with a WD80EFAX).
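If you have several disks to check, searching for that signature can be scripted. The following is a minimal Python sketch, assuming you have saved the text output of smartctl -x for each disk; the function name find_unc_lbas is made up for illustration, and the pattern only covers the "Error: UNC at LBA" wording shown in the excerpt above.

```python
import re

# Matches the summary smartctl prints for an uncorrectable read error,
# e.g. "Error: UNC at LBA = 0x21688560 = 560498016".
UNC_PATTERN = re.compile(r"Error: UNC at LBA = 0x([0-9a-fA-F]+) = (\d+)")

def find_unc_lbas(smartctl_output: str) -> list[int]:
    """Return the decimal LBA of every UNC error found in the text."""
    return [int(decimal) for _hex, decimal in UNC_PATTERN.findall(smartctl_output)]

# The register line from the error log entry quoted in this post.
sample = (
    "40 -- 51 00 00 00 00 21 68 85 60 40 00  "
    "Error: UNC at LBA = 0x21688560 = 560498016"
)
print(find_unc_lbas(sample))  # prints [560498016]
```

An empty list only means this particular wording was not found; it does not prove the disk is healthy.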
When I tested that disk with Lifeguard, it failed that diagnostic as well. So I'd suggest testing the two disks in a Windows PC using WDC's Lifeguard software - running both the long non-destructive test and the full erase test.
I also saw this signature on a couple of my other WD60EFRX drives (though they didn't fail Lifeguard) - it might be prudent to replace any that show UNCs.
@harris008 wrote:
We suspect that the device itself may be faulty due to the exact same thing happening to the same disks on the device.
SATA bays can fail, though usually they either work or they completely fail. So this is possible, but it is more likely something going on with the disks.
Re: NetGear ReadyNAS RN3138 Disk Failure
Hi,
Yes the two drives in Bay 2 & 4 were replaced. The original drives which failed were replaced by WD.
I will have to test the two disks that I removed from the storage device using Lifeguard to determine whether there are errors.
1. Same two bays.
2. New drives.
3. Same failure as 11 months ago.
We cannot keep replacing disks on a device where the disks in the same bays are failing.
Re: NetGear ReadyNAS RN3138 Disk Failure
@harris008 wrote:
We cannot keep replacing disks on a device where the disks in the same bays are failing.
Understood. Still, it is important to rule out the drives somehow.
The warranty on the NAS is 5 years, so all RN3138s should still be covered. You could also try to do an RMA if you are the original purchaser. But I think they will want stronger evidence that it's the NAS.