Readynas Ultra 4 RAID 5 - 2 of 4 disks fail at the same time
Before I was able to replace the disks (for which SMART disk errors had been reported previously), the next reboot gave me an unrecoverable disk problem (End of Life error). Checking the fs_check log revealed a set of block read errors resulting in short reads. It suggests running fsck manually (see the end of the log file below).
.....
Error reading block 242810730 (Attempt to read block from filesystem resulted in short read). Ignore error? yes
Force rewrite? yes
fsck.ext4: Attempt to read block from filesystem resulted in short read while trying to re-open /dev/c/c
/dev/c/c: ********** WARNING: Filesystem still has errors **********
***** File system check performed at Sat Aug 25 19:18:09 CEST 2018 *****
fsck 1.42.12 (29-Aug-2014)
/dev/c/c: recovering journal
Error reading block 242799888 (Attempt to read block from filesystem resulted in short read).
/dev/c/c: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)
However, when I run fsck with -v -n to get more information, it does not get past the short read error (see below).
/root$ fsck.ext4 -v -n /dev/c/c
e2fsck 1.42.12 (29-Aug-2014)
fsck.ext4: Attempt to read block from filesystem resulted in short read while trying to open /dev/c/c
Could this be a zero-length partition?
How can I run fsck manually in the correct way, to attempt to identify and possibly resolve the disk errors?
Any suggestions are appreciated.
Regards,
Mark
Re: Readynas Ultra 4 RAID 5 - 2 of 4 disks fail at the same time
Hi @Tomahna
Happy New Year!
We wouldn't recommend taking any further action, as this might cause more damage and make data recovery harder.
I would suggest trying to get the data off the disks through data recovery software or a recovery service, and trying to clone the disks. Contacting Support for assistance is also advisable, though Data Recovery and Support might need to be purchased. You can log in to my.netgear.com and create a case for your product so the experts can assist you in escalating the case.
Have you checked which of the 2 disks failed last? This might be needed for re-assembling the RAID.
Hope this helps!
Regards
Re: Readynas Ultra 4 RAID 5 - 2 of 4 disks fail at the same time
Hello Marc,
Thank you for your reply. I will check out my.netgear.com.
PS: both disks failed at exactly the same time, which I found rather suspicious, but nevertheless the problem remains the same, unless another cause can be identified that would make both disks unusable (power supply or driver) and that can be recovered from in another way.
Best regards,
Mark
Re: Readynas Ultra 4 RAID 5 - 2 of 4 disks fail at the same time
@Tomahna wrote:
... gave me an unrecoverable disk problem (End of Life error).
...PS disks failed at exactly the same time, which I found rather suspicious,
"End of Life" isn't a SMART statistic, and it's not an error I've seen before from a NAS.
Are you sure this was the error?
Were these SSD disks?
Perhaps download the log zip file and/or query the disk status with smartctl.
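If it helps, here is a minimal sketch of pulling the counters that usually matter for failing sectors out of smartctl output. It assumes smartctl is available on the NAS shell (as root), that the disks show up as /dev/sda to /dev/sdd, and that `smartctl -A` prints its usual ten-column attribute table. The helper name smart_key_attrs is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: reads `smartctl -A` output on stdin and prints
# NAME=RAW_VALUE for the attributes most closely tied to bad sectors:
#   5   Reallocated_Sector_Ct
#   197 Current_Pending_Sector
#   198 Offline_Uncorrectable
# Assumes the standard attribute table, where column 1 is the ID,
# column 2 the attribute name and column 10 the raw value.
smart_key_attrs() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 { print $2 "=" $10 }'
}

# Example use on the NAS (device names are an assumption; run as root):
#   for d in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do
#       echo "== $d"
#       smartctl -H "$d"                    # overall health self-assessment
#       smartctl -A "$d" | smart_key_attrs  # sector-related counters
#   done
```

Non-zero raw values for any of these three attributes would point at a disk with real media problems rather than a controller or power issue.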
Re: Readynas Ultra 4 RAID 5 - 2 of 4 disks fail at the same time
The “End of life error” was displayed on the Ultra4 unit itself. I cannot find it in the log files.
They are not SSD disks:
- Disk 1 WD2003FYYS-23W0B0 42D0788 42D0791IBM 1863 GB
- Disk 2 WD2003FYYS-23W0B0 42D0788 42D0791IBM 1863 GB
- Disk 3 Seagate ST3000DM001-1ER166 2794 GB
- Disk 4 Seagate ST3000DM001-1ER166 2794 GB
Please find below some interesting snippets from the system.log file.
(I have also attached a screenshot of the log web page, positioned from when the problem started.)
Aug 29 10:49:49 Serenia kernel: All bugs added by David S. Miller davem@redhat.com
Probably some kind of joke...
Jan 2 17:36:52 Serenia kernel: ata2.00: cmd 60/00:10:d8:b4:26/01:00:27:00:00/40 tag 2 ncq 131072 in
Jan 2 17:36:52 Serenia kernel: res 41/40:00:e0:b4:26/00:00:27:00:00/40 Emask 0x409 (media error) <F>
A whole set of:
Jan 2 17:36:52 Serenia kernel: ata2.00: status: { DRDY ERR }
Jan 2 17:36:52 Serenia kernel: ata2.00: error: { UNC }
Jan 2 17:36:52 Serenia kernel: ata2.00: configured for UDMA/133
Jan 2 17:36:52 Serenia kernel: ata2: EH complete
Jan 2 17:36:52 Serenia kernel: ata2.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Jan 2 17:36:52 Serenia kernel: ata2.00: irq_stat 0x40000008
Jan 2 17:36:52 Serenia kernel: ata2.00: failed command: READ FPDMA QUEUED
Jan 2 17:36:52 Serenia kernel: bio: create slab <bio-1> at 1
Jan 2 17:36:52 Serenia kernel: md/raid1:md0: active with 4 out of 4 mirrors
Jan 2 17:36:52 Serenia kernel: md0: detected capacity change from 0 to 4294955008
Jan 2 17:36:52 Serenia kernel: md0: unknown partition table
Jan 2 17:36:52 Serenia kernel: md: bind<sdb2>
Jan 2 17:36:52 Serenia kernel: md: using 128k window, over a total of 1948793216 blocks.
Jan 2 17:36:52 Serenia kernel: md2: unknown partition table
Jan 2 17:36:52 Serenia kernel: md: bind<sdd6>
Unrecovered read error - auto reallocate failed
Jan 2 17:36:52 Serenia kernel: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 27 26 b4 e0 00 00 f8 00
Jan 2 17:36:52 Serenia kernel: end_request: I/O error, dev sda, sector 656848098
Jan 2 17:36:52 Serenia kernel: md/raid:md2: read error not correctable (sector 647410928 on sda5).
Jan 2 17:36:52 Serenia kernel: md/raid:md2: Disk failure on sdb5, disabling device.
Jan 2 17:36:52 Serenia kernel: <1>md/raid:md2: Operation continuing on 3 devices.
Jan 2 17:36:52 Serenia kernel: md/raid:md2: Disk failure on sda5, disabling device.
Jan 2 17:36:52 Serenia kernel: <1>md/raid:md2: Operation continuing on 2 devices.
Jan 2 17:36:52 Serenia kernel: Buffer I/O error on device dm-0, logical block 242778913
Jan 2 17:36:52 Serenia kernel: md/raid:md2: read error not correctable (sector 647410936 on sda5).
Jan 2 17:36:52 Serenia kernel: md/raid:md2: read error not correctable (sector 647410944 on sda5).
Jan 2 17:36:52 Serenia kernel: md/raid:md2: read error not correctable (sector 647410952 on sda5).
Jan 2 17:36:52 Serenia kernel: Buffer I/O error on device dm-0, logical block 242778914
Jan 2 17:36:52 Serenia kernel: md/raid:md2: read error not correctable (sector 647410960 on sda5).
Jan 2 17:36:52 Serenia kernel: Buffer I/O error on device dm-0, logical block 242778915
Jan 2 17:36:52 Serenia kernel: md/raid:md2: read error not correctable (sector 647410968 on sda5).
Jan 2 17:36:52 Serenia kernel: Buffer I/O error on device dm-0, logical block 242778916
Jan 2 17:36:52 Serenia kernel: ata1: EH complete
Jan 2 17:36:52 Serenia kernel: md: md2: resync done.
Jan 2 17:36:52 Serenia kernel: md: checkpointing resync of md2.
Jan 2 17:36:52 Serenia kernel: RAID conf printout:
Jan 2 17:36:52 Serenia kernel: --- level:5 rd:4 wd:2
Jan 2 17:36:52 Serenia kernel: disk 0, o:0, dev:sda5
Jan 2 17:36:52 Serenia kernel: disk 1, o:0, dev:sdb5
Jan 2 17:36:52 Serenia kernel: disk 2, o:1, dev:sdc5
Jan 2 17:36:52 Serenia kernel: disk 3, o:1, dev:sdd5
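For anyone wanting to condense a log like the one above, here is a small sketch that counts the uncorrectable read errors per member partition and lists the members md disabled, in the order the log reports them. It assumes the kernel messages are worded exactly as in the snippet above; the function names and the log path are made up for the example.

```shell
#!/bin/sh
# Hypothetical helpers for summarising md messages in a saved system.log.

# Print "<partition> <count>" for every member that logged
# "read error not correctable" lines.
count_md_read_errors() {
    sed -n 's/.*read error not correctable (sector [0-9]* on \([a-z0-9]*\)).*/\1/p' "$1" |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Print the member partitions md disabled, in log order, so the last
# line is the member that was dropped last.
list_md_failures() {
    sed -n 's/.*Disk failure on \([a-z0-9]*\),.*/\1/p' "$1"
}

# Example use (path is an assumption):
#   count_md_read_errors system.log
#   list_md_failures system.log
```

The order of the "Disk failure" lines is only as reliable as the log itself; when all entries carry the same timestamp, as they do here, it is a hint rather than proof of which disk dropped out last.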
Re: Readynas Ultra 4 RAID 5 - 2 of 4 disks fail at the same time
The logical next step is to power down the NAS and test disks 1 and 2 in a Windows PC using WDC's lifeguard software. You can connect the disks using either SATA or a USB adapter/dock.
You could also try powering down, removing disk 1, and then booting without disk 1. I'm not seeing the details of the disk 2 failure in the log snippet, but there is plenty of evidence that disk 1 is struggling. So it's conceivable that the system would boot w/o disk 1. If it does, the next step is to back up the data. Do that before you do anything else.
Re: Readynas Ultra 4 RAID 5 - 2 of 4 disks fail at the same time
Hello Stephan,
Thank you for the suggestions. I’ve tried to boot without disk 1. Initially it looked promising, because the disk lights 2, 3 and 4 on the unit came on, instead of lights 1 and 2 immediately starting to blink. However, shortly afterwards disk light 2 started blinking. Perhaps this could be resolved by rebooting the unit with the option to perform a file system check. However, I’m not sure if this is a wise action to take (with regard to data loss). I will have a look at the system.log to determine if I can find any new information.
Also, I’ve connected disk 1 using my USB docking station and run the WD LifeGuard diagnostics. The extensive test ran (for a long time) and now reports that it found bad sectors that may be repairable. I have the option to select repair. However, I don’t know if I should do this.
All suggestions are welcome.
Best Regards,
Mark
Re: Readynas Ultra 4 RAID 5 - 2 of 4 disks fail at the same time
I don't recommend doing a file system check on the NAS, as any repair attempt could do more damage. Though the NAS might already be doing that. Have you tried looking at the status with RAIDar?
I wouldn't try to repair the bad sectors either.
If there are failures on disk 2 also, then you probably will need data recovery. If you are willing to pay for that (either a service or software), then it is best to not attempt any more repairs - it often makes the recovery more difficult.
Re: Readynas Ultra 4 RAID 5 - 2 of 4 disks fail at the same time
I’ve checked with RAIDar. What is strange is that RAIDar reports all 4 disks with status green (see attached screen), even though disk 1 is not even present at the moment and disk 2 is currently unusable. The lights on the unit report it correctly though: the light for disk 1 is off and the light for disk 2 is flashing.
I will look into the system.log and run the extensive check on disk 2. I’ll report back when I have more info.
Re: Readynas Ultra 4 RAID 5 - 2 of 4 disks fail at the same time
Something must have gone wrong with adding the attachment. Let me try again
Re: Readynas Ultra 4 RAID 5 - 2 of 4 disks fail at the same time
Yes, I looked at Frontview (see attachment for the status screen). The strange thing is that it also reports disk 1 as present, while the unit knows it is not (the light for it is off). I will also attach the system log file. It shows that it tries to configure the RAID for 3 disks instead of 4. So I find it confusing that the status is not reported consistently. Could there perhaps be some data corruption in the setup or in the local data of the unit itself (i.e. not stored on the RAID disks)?
Re: Readynas Ultra 4 RAID 5 - 2 of 4 disks fail at the same time
And the system.log for today’s boot.
Re: Readynas Ultra 4 RAID 5 - 2 of 4 disks fail at the same time
The disk numbering starts from 0, so the boot process is detecting that the first disk isn't present. And you are getting read failures on the second disk, so the array can't be mounted.
Your Frontview screenshot shows the first two disks as not working (disk 1 of course is removed) - the SMART stats button for both is grayed out.
Re: Readynas Ultra 4 RAID 5 - 2 of 4 disks fail at the same time
Clear, thank you. I suspected as much when I looked at the system.log file.
What remains confusing is that Frontview displays the disk specifications for disk 1 (only without the temperature) while it is not present, but that is not a problem for me. When I checked the disk_smart log, it had reported earlier that all 4 disks passed the tests. That is probably why RAIDar is reporting 4 available disks (green lights).
I will continue doing some more tests this week and determine my options. I’ll report back when I’ve reached that point. Again thank you!
Re: Readynas Ultra 4 RAID 5 - 2 of 4 disks fail at the same time
It has been a while since I had time to continue working on this problem, but this week I made some progress. Unfortunately I ran into a new (seemingly unrelated) problem. I recovered the first disk by cloning it. The replacement disk resulted in the unit coming back online! Note that I have not cloned the second disk, since it has so many bad sectors; I intend to use a new disk for slot 2 and then let it sync.
Unfortunately, for no apparent reason, one of the healthy disks has now been marked as unusable?! As a result only some of my shares have come back online. The others most likely depend on the presence of the other disk. Checking the system log, it reports that disk 3 is non-fresh:
Apr 24 11:24:17 Serenia kernel: RAID conf printout:
Apr 24 11:24:17 Serenia kernel: --- level:5 rd:4 wd:3
Apr 24 11:24:17 Serenia kernel: disk 0, o:1, dev:sda5
Apr 24 11:24:17 Serenia kernel: disk 2, o:1, dev:sdb5
Apr 24 11:24:17 Serenia kernel: disk 3, o:1, dev:sdc5
Apr 24 11:24:17 Serenia kernel: md2: detected capacity change from 0 to 5986692759552
Apr 24 11:24:17 Serenia kernel: md2: unknown partition table
Apr 24 11:24:17 Serenia kernel: md: bind<sdc6>
Apr 24 11:24:17 Serenia kernel: md: bind<sdb6>
Apr 24 11:24:17 Serenia kernel: md: kicking non-fresh sdc6 from array!
Apr 24 11:24:17 Serenia kernel: md: unbind<sdc6>
Apr 24 11:24:17 Serenia kernel: md: export_rdev(sdc6)
Perhaps during the attempts to recover the array a shutdown was not completed properly (although I'm not aware of one), because I found the following information on a Linux forum:
This can happen after an unclean shutdown (like a power fail). Usually removing and re-adding the problem devices will correct the situation:
/sbin/mdadm /dev/md0 --fail /dev/sda5 --remove /dev/sda5
/sbin/mdadm /dev/md0 --add /dev/sda5
My question now is: what is the best way to re-activate disk 4? Should I attempt the above commands for sdc6, or should it be sdc5? Or should I go about it differently?
PS I'm confused about the reference to sdc6, because I only see sdc5 listed as a mounted disk in the system log...
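Before removing or re-adding anything, it may be worth checking which array sdc6 actually belongs to and how far behind it is. md kicks a member as "non-fresh" when its event counter is lower than that of the other members, so comparing the counters shows how stale it is. In the log above, sdb6 and sdc6 are bound after md2 has already come up from the sd?5 partitions, which suggests (but does not prove) that the sd?6 partitions form a separate array. Below is a minimal, read-only sketch, assuming mdadm is available on the NAS shell and that the partition names match the log; md_events is a made-up helper name. Note that /dev/md0 and sda5 in the quoted forum advice are only that poster's example names.

```shell
#!/bin/sh
# Hypothetical helper: reads `mdadm --examine` output on stdin and prints
# the value of its "Events" line (the member's event counter).
md_events() {
    awk '/^[[:space:]]*Events[[:space:]]*:/ { print $NF }'
}

# Read-only checks (device names taken from the log above; run as root):
#   cat /proc/mdstat      # which arrays exist and which partitions they hold
#   for p in /dev/sdb6 /dev/sdc6; do
#       printf '%s events: ' "$p"
#       mdadm --examine "$p" | md_events
#   done
```

None of these commands write to the disks, so they should be safe to run before deciding on any --remove or --add.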