Forum Discussion
i_g
Feb 17, 2012 (Aspirant)
ReadyNAS: lost data after removing a drive : case #17927967
Details are in the support case. A friend was interested in the build quality of the unit, removed one of the drives and put it back. I experienced errors (listed in the support case), which I managed to resolve, but I have lost all my data.
9 Replies
- mdgm-ntgr (NETGEAR Employee Retired)
Well, not working for NetGear, I don't know the details of your case. However, I can still make a few comments:
1. If you store important data primarily on your NAS, you should back up that data regularly, e.g. to a USB disk, another NAS or somewhere else. RAID is not a backup. See Preventing Catastrophic Data Loss.
2. I repeat: RAID is not a backup. RAID (except for RAID-0) provides redundancy or high availability. If a disk fails, data remains intact. Now, if you remove a disk while the NAS is on (i.e. hot-remove), you can't trust the data on the disk you removed. When you add the disk back it will be wiped and a resync will take place. This resync puts heavy stress on both disks. If, for example, the disk you didn't remove is beginning to fail, the heavy stress of the resync could finish it off, taking all your data with it. Experimenting with RAID when you don't have your data backed up is asking for trouble.
- i_g (Aspirant)
So basically the data is irrecoverable? To be honest, I wasn't around when this happened and didn't plan to experiment with it, although I wouldn't have thought an advertised feature (e.g. hot swapping) would cause this much trouble. When you say 'finish it off', do you mean that one of the disks might now be dead? Whether it is asking for trouble or not, given the sequence of events, the resulting loss of all data is pretty poor for a home device.
It was a new device. Some of the data is backed up offsite but I had transferred some other data that was not yet backed up offsite.
To be more specific, I continued to see error codes in the web management interface (2001010000 and 100101000). There was no detail with the errors and they occurred periodically (presumably when the drive/s were accessed). I read some forum posts which suggested upgrading the firmware (as it might perform some sort of software-level restore) and reinstalling the OS from the boot menu. I did both of these and the errors stopped occurring in the web interface. The data, however, is nowhere to be seen.
The drive has not been written to since this occurred last night. If the drive/s have been wiped, then I would imagine it would be the equivalent of a low-level format / marking the sectors as unwritten. Might it therefore be possible to recover the data using Linux / a Linux Live CD with recovery tools?
- i_g (Aspirant)
Further errors I am experiencing are:
fsck from util-linux-ng 2.17.2
e2fsck 1.41.14 (22-Dec-2010)
fsck.ext4: No such file or directory while trying to open /dev/c/c
Possibly non-existent device?
And I cannot see any volumes in the web management UI.
- mdgm-ntgr (NETGEAR Employee Retired)
Not knowing the nature of the problem, I can't say whether or not data will be recoverable in this case. I am saying that you took a big gamble. Playing poker with your data isn't a good idea.
Hot-swapping is designed for replacing failed disks or disks you no longer wish to use without downtime, not for experimenting to see how RAID works (though you can do this if you wish, but doing this without a backup is very risky).
Disks can and do fail at any time. Having said that, there is a range of other problems that could have occurred (some more serious than others), and support can examine the NAS to give a definitive diagnosis. The important thing is that making changes to your NAS will only make it less likely that data can be recovered. RAID is an enterprise feature that has come to home devices. It is a useful feature for home users, and NetGear has designed the ReadyNAS to make it as easy as possible for them to use. They do clearly warn in the manuals (http://www.readynas.com/docs) that data should be backed up.
The disk you removed and added again would have been wiped, not the other disk. Until the resync completes after you add a disk, the array is non-redundant. So a problem with a disk during this vulnerable period could result in the NAS going into life support mode.
Updating the firmware or doing an OS re-install when you have a problem like this is not recommended unless support specifically requests it. Basically, the fewer changes made to the system while you can't access your data, the better.
Edited: was thinking of another thread.
- i_g (Aspirant)
As I said before, I didn't 'experiment' with the device. My landlord (who is a friend of mine) paid a visit to my flat while I wasn't there. He explained the situation but should have known better, as we are both programmers. But that doesn't change the situation, does it? That being said, I really can't believe the result of this, to be honest. Basically the unit has corrupted itself. It is hardly as though the external situation / sequence of events is so complicated that it couldn't deal with it?! And to boot, it could have destroyed one of my drives.... Great.
In response to: "Well disk 3 appears to be dead (light not lit) and disk 4 has a huge ATA count which indicates the disk is pretty much dead", is that what the error message I posted would indicate? It is a ReadyNAS Duo v2 with 2 disks in it, so unless disk 3 and 4 refer to partitions I'm unsure how that would apply. The lights are lit for both drives and do not flash or indicate any issue. Yes, disks can and do fail at any time, but if one has failed and is now dead due to what happened, then that is completely unacceptable. A hot-swapping feature that trashes drives. In any case, that hasn't been confirmed to be the issue yet.
In regards to backup, I consider having a second hard drive that is being replicated onto to be a form of backup. In reference to the manual: "Data can be lost due to a number of events, including natural disaster (for example, fire or flood), theft, improper data deletion, and hard drive failure. By regularly backing up your data, you can recover your data if any of these happen to you". Improper data deletion is obviously a user error; fire or theft is obviously going to result in a loss of data, although I don't keep spares of everything I own and leave them at remote locations in case there is a fire or it gets stolen. That leaves the third, hard drive failure. I thought a solution to that was buying a second drive that was replicated to from the first one. Apparently not. Given my scenario, there was no hard drive failure. The system has failed and that has resulted in a loss of data.
So you can't say whether the data can be recovered or not.... Or add anything of value. You would prefer to post about the do's and don'ts of backup routines. Thanks for the input.
- mdgm-ntgr (NETGEAR Employee Retired)
Oops, I was thinking of another thread when mentioning a disk 3 and 4. Sorry for the confusion.
i_g wrote:
As I said before, I didn't 'experiment' with the device. My landlord (who is a friend of mine) paid a visit to my flat while I wasn't there. He explained the situation but should have known better, as we are both programmers. But that doesn't change the situation, does it?
So you shouldn't let other people do this kind of thing, especially when you're not around. No, it doesn't change the situation.
i_g wrote:
That being said, I really can't believe the result of this, to be honest. Basically the unit has corrupted itself. It is hardly as though the external situation / sequence of events is so complicated that it couldn't deal with it?!
Disks can and do fail at any time. There are rare problems that can happen even when you don't have a problem disk. RAID isn't perfect. Having said that, your data may be recoverable depending on what the issue is.
i_g wrote:
And to boot, it could have destroyed one of my drives....
It wouldn't have destroyed one of your drives. If one of your drives was already failing, though, it could've finished it off due to the heavy stress on the disks that was initiated.
i_g wrote:
In regards to backup, I consider having a second hard drive that is being replicated onto to be a form of backup.
Well, it's not. Read the article on preventing catastrophic data loss I linked to. High availability or redundancy, whilst it does provide some protection for your data, should never be considered backup, because there are a variety of possible problems it does not protect against, including things like accidental file deletions.
i_g wrote:
In reference to the manual: "Data can be lost due to a number of events, including natural disaster (for example, fire or flood), theft, improper data deletion, and hard drive failure. By regularly backing up your data, you can recover your data if any of these happen to you". Improper data deletion is obviously a user error; fire or theft is obviously going to result in a loss of data, although I don't keep spares of everything I own and leave them at remote locations in case there is a fire or it gets stolen. That leaves the third, hard drive failure. I thought a solution to that was buying a second drive that was replicated to from the first one. Apparently not. Given my scenario, there was no hard drive failure. The system has failed and that has resulted in a loss of data.
There could be a problem with the disk. It doesn't have to be dead. There could alternatively be a filesystem problem or some other issue. We can only speculate. I'll have to have another read of the manual sometime. Regardless, with the dashboard having backup options, I think it's fairly clear what is meant by backups.
i_g wrote:
So you can't say whether the data can be recovered or not.... Or add anything of value. You would prefer to post about the do's and don'ts of backup routines. Thanks for the input.
If one doesn't learn from mistakes, one is likely to repeat them. I was trying to point out the error of what was done. Removing a drive and putting it back is not something to do for fun, as the disk you remove and re-add will be wiped and have to be synced back in. You're assuming you won't have a problem whilst your array is unprotected.
- i_g (Aspirant)
First of all, I didn't mean to be aggressive, but I came on here to get advice on how to fix my problem, not to be criticised and told to read documentation on how to back up.
i_g wrote:
As I said before, I didn't 'experiment' with the device. My landlord (who is a friend of mine) paid a visit to my flat while I wasn't there. He explained the situation but should have known better, as we are both programmers. But that doesn't change the situation, does it?
By this I mean it was done without asking. Once I realised the data was gone and my housemate told me that my other friend was 'fiddling' with it, I called my friend and asked him what he had done.
The ReadyNAS and the drives are 3 weeks old. I would not expect any of them to fail; yes, they could, but it is unlikely (relatively speaking) given they are 3 weeks old. The Samsung drives that I ordered have the highest reliability rates for their capacity, and given it wasn't DOA I would consider it highly unlikely and coincidental that one died at the same time as this incident occurred.
mdgm-ntgr wrote:
It wouldn't have destroyed one of your drives. If one of your drives was already failing it could've finished it off though due to the heavy stress on the disks that was initiated.
Is your definition of failing being removed and then being reinserted? E.g., a failing disk with being rebuilt? That isn't sarcasm, by the way.
mdgm-ntgr wrote:
Well it's not. Read the article on preventing catastrophic data loss I linked to. High-availability or redundancy whilst it does provide some protection for your data should never be considered backup due to a variety of possible problems it does not protect against including things like accidental file deletions.
It is definitely a form of backup. It is a replication of data; whether that is defined as highly available is another matter. If it's not a form of backup, why don't I just have one hard drive that copies remotely on a schedule? If the hardware that facilitates / houses the redundant hard disks (ReadyNAS) is not stable and is able to lose all your data (other forum posts also mention this issue), then redundancy onto another drive counts for nothing. As I stated, the guide you mentioned says you back up in case of drive failure, mistake and natural disaster. A mistake wasn't made, the drives are 3 weeks old, and it was not an accidental error (see below). I shouldn't have to consider the possibility that the product that is meant to assist / play a part in the backup picture will actually lose all my data.
As I also said earlier: I have remote storage with SugarSync, which I was intending to write an add-on for my ReadyNAS to integrate with. The really important data (photos) is already backed up remotely. My music, which I had very recently aggregated from many drives onto the ReadyNAS (and was going to back up remotely), was not. There was no fire, I didn't accidentally click delete, and now it is seemingly gone from both drives. So much for redundancy.
What has happened, in my opinion, is a system fault. If you deliver something for the home market then it should be robust. This is not robust. The data has been lost (unless I can recover it) from a fairly trivial sequence of actions. This sequence is not accidental file deletion. That implies an accidental action of actually deleting the files. What actually happened is a 'side effect' of the system. A highly undesirable one. The ReadyNAS unit has a sophisticated processor in it and relatively significant hardware; it is effectively a mini server, capable in my opinion of detecting or preventing these types of issues. It isn't a dumb unit with a couple of hard drives in it.
mdgm-ntgr wrote:
I was trying to point out the error of what was done. Removing a drive and putting it back is not something to do for fun
http://southpark.wikia.com/wiki/Captain_Hindsight
- PapaBear1 (Apprentice)
Trusting all of your data to one device, be it a single hard drive or a single RAID device, is NOT a backup. If something happens to the device holding all of your data, that is when you need your backup.
As for hard drives not failing within 3 weeks of use, you as a professional should know better. Any electronic device is more likely to fail early than later, until it gets much older. You should have been trained in backup, backup, backup. That being said, organizations and individuals can make disastrous errors (see the Danger situation documented in the link referenced by mdgm). The customers of Carbonite who lost the data they thought was backed up also learned a lesson: never trust your data to a single device, especially if it is in the cloud.
Although not an IT professional, I worked with and beside many in my 35+ year career in accounting and they were almost paranoid about maintaining multiple backups. When I first started they liked to tell the story of one company location that performed a task that corrupted their A/R and billing database, so they removed the corrupted disk, loaded the backup and repeated the operation which corrupted the only copy left - the backup. They failed to copy over their original from the backup. The company then had a policy of never running from a backup. If the original is corrupted, copy over the original from the backup and then, only then proceed.
When I copy data onto my NAS, regardless of source, I never erase the original until after the first backup of my NAS. Please always think of RAID as a convenience so you don't have to start from scratch in case of a disk failure and restore everything from the backup as a best case scenario, not as an absolute.
Keep working with tech support and hopefully they can recover your data. If they do, consider it a near miss.
While I have never experienced a second drive failure during a resync, I know from the number of posts that it is real and can happen. I have had drives give me months of warning about increasing errors, allowing time to make plans to replace the failing drive, and I have had drives fail suddenly overnight with no warning whatsoever. I have had drives fail after years of service and I have had drives fail within the first month of service. In fact, the first drive I lost in a NAS failed suddenly overnight, with no warning, less than 30 days after the setup of my first NAS with only two drives.
Also tell everyone to keep their bleeping hands off your NAS.
- mdgm-ntgr (NETGEAR Employee Retired)
Disks can and do fail at any time. Now, the NAS will treat removing a disk as if it were a disk failure; reinserting the disk will then wipe that disk and a resync will take place. At this point you are vulnerable to problems as there is no RAID protection. You'd face the same issue if you were using another brand. Now, support should be able to determine what the problem actually is, but I reckon if your friend hadn't removed and reinserted the disk you wouldn't have come into a situation where this thread needed to be started.
Don't get me wrong, RAID does provide some protection for data, but due to its vulnerability to things like accidental file deletions, multiple disk failures, problems while the array is non-redundant, fire, flood, theft, etc. it should not be considered a backup.
High-availability or redundancy is designed to maximise uptime and minimise the likelihood of needing to restore from backup. RAID and backups are complementary, not substitutes. Any experienced IT admin should tell you that RAID isn't a backup.
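PapaBear1's rule earlier in the thread (never erase the original until the data has been backed up) can be checked mechanically rather than by eye. What follows is only a minimal sketch in Python, not a ReadyNAS feature: the function names `file_digest` and `unverified_files` are made up for illustration, and it assumes both the original folder and the backup copy are reachable as ordinary directories (for example a mounted share and a USB disk).

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unverified_files(source: Path, backup: Path) -> list[str]:
    """List files under `source` that are missing from `backup` or differ in content.

    Paths are returned relative to `source`. An empty list means every
    original file has a byte-identical copy in the backup directory.
    """
    problems = []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        relative = src_file.relative_to(source)
        dst_file = backup / relative
        # A file counts as backed up only if the copy exists and hashes the same.
        if not dst_file.is_file() or file_digest(src_file) != file_digest(dst_file):
            problems.append(relative.as_posix())
    return problems
```

Under these assumptions, the original would only be erased once `unverified_files(original_dir, backup_dir)` returns an empty list, where both directory names are placeholders for wherever the two copies actually live. Note that this only confirms a second copy exists; it says nothing about the health of the disks holding either copy.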