Forum Discussion
GMulford (Aspirant)
Jun 09, 2015
Cloning failed drive ReadyNAS Pro OS6
I have a ReadyNAS Pro Business Edition with 6x 3TB drives in single redundancy. It's running the latest OS6 firmware.
A drive failed last week, and I'm still waiting for the replacement to arrive; now another drive has been marked as failed with 26 reallocated sectors. :( Currently it's sitting there showing the volume, but the files aren't valid.
Quick question: if I manage to clone the failed drive with something like Knoppix, will it try to remount the drives, or do I need a command to force the mount? If it does successfully mount, hopefully I can rebuild with another new drive.
It isn't the end of the world if the data is lost, as all critical data is offsite, but there is a lot of nice-to-have data :D.
P.S. I hate Seagate ST3000DM001 drives; 8/10 have failed over the last 3 years :S
Thanks
Greg
7 Replies
- mdgm-ntgr (NETGEAR Employee, Retired): You would need to:
1. Power down the NAS
2. Remove the disk and clone it:
readysecure1985 wrote:
Here is a simple guide to quickly recover a failed drive using dd_rescue.
I often have to deal with pesky failed drives, so here is a quick, simple guide on how to do this with a free Linux Live CD and a PC with two SATA connections.
I will be using a Knoppix 6.2 Live CD for this guide. It can be found at http://www.knoppix.net
Using the dd_rescue command allows you to copy data from one drive to another, block for block. This is especially useful for recovering a failed drive. Often when a drive fails, the drive is still accessible; it has just surpassed the S.M.A.R.T. error threshold. dd_rescue allows you to ignore the bad sectors and continue cloning the bad drive to a new healthy drive.
1) Connect your old drive and new drive to your PC
2) Boot up using your Linux live CD
3) Launch a terminal window.
4) Run fdisk -l to make sure the system sees both of the hard drives.
5) Run hdparm -i /dev/sdx on both of the drives to find which drive is your source drive and which drive is your destination drive
6) Once you know which drive is which you can start the clone process.
dd_rescue /dev/sdX (source disk) /dev/sdY (destination drive)
7) You will see the process start; just keep an eye on it. It might take a few hours for the clone job to finish, depending on the size of the drive.
Once the process is complete there will be no notification; the transfer will just stop and you will see the terminal prompt again.
If you see a lot of errors, or see that no more data is being shown as succ xfer, it means the drive got marked faulty by the kernel. At this point reboot the system and make sure you know which drive is which again, as it is possible the lettering might switch. Run the dd_rescue command again, but this time with the -r option. This will start the cloning again, but this time from the back of the drive, and will make sure to get the data that has not been cloned yet.
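For illustration only (this example is not part of the original guide), a reverse pass along the lines described above might look like this; the device names are placeholders for your actual source and destination disks, and option details can vary between dd_rescue versions, so check the man page on your live CD:
dd_rescue -r /dev/sdX (source disk) /dev/sdY (destination drive)
The idea is simply to approach the bad region from the other end of the disk, so the blocks skipped by the forward pass get another chance to be read.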
3. Put the clone in the NAS and power on
Hopefully it will come up fine, but it may/may not.
It is important that you put the clone in while the NAS is off.
- GMulford (Aspirant): Hi, thanks mdgm :)
Managed to clone the drive last night; there were 9 errors, so about 90k at the end of the drive.
I've put the drive back into the NAS.
It now starts and recognises the drives, but the volume isn't available. It says "please remove inactive volumes in order to use disks".
A volume "data-0" is shown, Is there a way to attempt to mount this volume? - mdgm-ntgrNETGEAR Employee RetiredHow did you try cloning the disk? Using dd_rescue or some other method?
dd_rescue is designed for cloning disks with problems. There are some other options that can be good too, but a simple dd wouldn't be good enough in most cases when a disk is in a bad way.
- GMulford (Aspirant): Yeah, I used DDrescue. It went through 99% of the drive fine and found 9 errors at the end, total size 90,000 bytes.
It seems like the drive only has a few problems, as 99.999999% of it cloned, but knowing my luck it's killed some important metadata.
I'll log in via scp when I get home and see what the mdadm status is. I haven't looked into the BTRFS side though, so I'm a little lost there.
- GMulford (Aspirant): It seems that it is not accepting sdc3.
RAID conf printout:
--- level:6 rd:5 wd:5
disk 0, o:1, dev:sda2
disk 1, o:1, dev:sdb2
disk 2, o:1, dev:sdc2
disk 3, o:1, dev:sdd2
disk 4, o:1, dev:sde2
md1: detected capacity change from 0 to 1608843264
md1: unknown partition table
btrfs: device label 33ea2359:root devid 1 transid 927169 /dev/md0
systemd[1]: systemd 44 running in system mode. (+PAM +LIBWRAP +AUDIT +SELINUX +IMA +SYSVINIT +LIBCRYPTSETUP; debian)
systemd[1]: Set hostname to <Media-nas>.
udevd[1190]: starting version 175
systemd-journald[1178]: Fixed max_use=32.0M max_size=8.0M min_size=64.0K keep_free=204.6M
systemd-journald[1178]: Vacuuming...
systemd-journald[1178]: Flushing /proc/kmsg...
md: md127 stopped.
md: bind<sdb3>
md: bind<sdc3>
md: bind<sdd3>
md: bind<sde3>
md: bind<sda3>
md: kicking non-fresh sdc3 from array!
md: unbind<sdc3>
md: export_rdev(sdc3)
md/raid:md127: device sda3 operational as raid disk 0
md/raid:md127: device sde3 operational as raid disk 5
md/raid:md127: device sdd3 operational as raid disk 4
md/raid:md127: device sdb3 operational as raid disk 1
md/raid:md127: allocated 6372kB
md/raid:md127: not enough operational devices (2/6 failed)
RAID conf printout:
--- level:5 rd:6 wd:4
disk 0, o:1, dev:sda3
disk 1, o:1, dev:sdb3
disk 4, o:1, dev:sdd3
disk 5, o:1, dev:sde3
md/raid:md127: failed to run raid set.
md: pers->run() failed ...
md: md127 stopped.
md: unbind<sda3>
md: export_rdev(sda3)
md: unbind<sde3>
md: export_rdev(sde3)
md: unbind<sdd3>
md: export_rdev(sdd3)
md: unbind<sdb3>
md: export_rdev(sdb3)
I had a look and the events aren't that far out.
mdadm --examine /dev/sd[a-z][1-6] | egrep 'Event|/dev/sd'
/dev/sda1:
Events : 281660
/dev/sda2:
Events : 21
/dev/sda3:
Events : 244566
/dev/sdb1:
Events : 281660
/dev/sdb2:
Events : 21
/dev/sdb3:
Events : 244566
/dev/sdc1:
Events : 281660
/dev/sdc2:
Events : 21
/dev/sdc3:
Events : 244481
/dev/sdd1:
Events : 281660
/dev/sdd2:
Events : 21
/dev/sdd3:
Events : 244566
/dev/sde1:
Events : 281660
/dev/sde2:
Events : 21
/dev/sde3:
Events : 244566
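For reference (these are standard mdadm/Linux commands, not something posted in the thread), the overall state of the data array can also be checked with:
cat /proc/mdstat
mdadm --detail /dev/md127
The --detail output lists which member partitions are active, failed or missing, which makes it easier to see why md127 refuses to start.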
Examining sdc3:
root@Media-nas:~# mdadm --examine /dev/sdc3
/dev/sdc3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 318fc18f:8d9a07c9:696c3a7f:195c8a72
Name : 33ea2359:data-0 (local to host 33ea2359)
Creation Time : Tue Apr 8 18:14:25 2014
Raid Level : raid5
Raid Devices : 6
Avail Dev Size : 5850829681 (2789.89 GiB 2995.62 GB)
Array Size : 14627073920 (13949.47 GiB 14978.12 GB)
Used Dev Size : 5850829568 (2789.89 GiB 2995.62 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262064 sectors, after=113 sectors
State : clean
Device UUID : 5e40c5f2:52184be2:b19f418c:20f3afae
Internal Bitmap : 8 sectors from superblock
Update Time : Mon Jun 8 23:27:14 2015
Checksum : 74ef4c53 - correct
Events : 244481
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 3
Array State : AA.AAA ('A' == active, '.' == missing, 'R' == replacing)
Is the next step to force assemble, i.e. mdadm --assemble --force?
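For reference, with stock mdadm a forced assembly of this array would typically look something like the sketch below, using the member partitions from the log above; ReadyNAS builds of mdadm can differ, so treat this as an illustration rather than the exact command:
mdadm --stop /dev/md127
mdadm --assemble --force /dev/md127 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3 /dev/sde3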
Thanks again for the help.
- GMulford (Aspirant): I ran
mdadm -v --assemble --really-force /dev/md127 /dev/sda3 /dev/sde3 /dev/sdd3 /dev/sdc3 /dev/sdb3
It marked the array as clean and it's all popped back up.
Time for another drive failure lol.
Will be getting HGST or WD Red next time, I think.
- mdgm-ntgr (NETGEAR Employee, Retired): Yes, you can use that if you want to. It is dangerous, and even though it has come online there will likely still be some data (hopefully just a small amount) that is corrupt.
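As a follow-up sketch (an assumption on my part, not advice given in the thread): since the OS6 data volume is BTRFS, one way to look for the corruption mentioned above is a scrub, which re-reads and checksums all data and metadata. The /data mount point is assumed here; adjust it to wherever the volume is mounted:
btrfs scrub start /data
btrfs scrub status /data
Checksum errors are counted in the scrub status output, and the affected file paths are normally recorded in the kernel log, so anything damaged can be restored from the offsite backup.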