Forum Discussion
reklar (Aspirant)
Feb 09, 2013
HELP!: backup failing on NV+ v1
I am getting ready to replace several old disks in my NV+ v1 running X-RAID. I went to back up my data (roughly 350GB total on three 400GB drives) and the backup has failed several times: FrontView becomes unresponsive, and when I turned off FrontView and tried rsyncing via root SSH to an attached USB disk, I hit the same issue. I had to pull the plug each time, and now the unit is telling me that disk 1 has SMART errors. These don't look too bad at the moment, with only a couple of reallocated sectors (data below).
What I'm wondering now is:
a) Is it better to keep trying to back up my data and risk more strain on the drives (they are all the same age: old)?
OR
b) Should I pull the (hopefully only) failing drive? Is it possible to pull a drive and then back up to an external disk? I tried this after pulling power, but after reboot the filesystem was not mounted.
c) I don't care about downtime--I just want the data. What is my best bet here?
Here's the SMART data for the 3 disks:
Model: SAMSUNG HD400LJ
Serial: S0H2J1FL605125
Firmware: ZZ100-14
SMART Attribute
Raw Read Error Rate 0
Spin Up Time 8000
Start Stop Count 19553
Reallocated Sector Count 2
Seek Error Rate 0
Seek Time Performance 0
Power On Hours 9874
Spin Retry Count 0
Calibration Retry Count 0
Power Cycle Count 32
Airflow Temperature Cel 34
Temperature Celsius 34
Hardware ECC Recovered 344566107
Reallocated Event Count 2
Current Pending Sector 3
Offline Uncorrectable 0
UDMA CRC Error Count 0
Multi Zone Error Rate 0
Soft Read Error Rate 0
TA Increase Count 0
ATA Error Count 0
Extended Attribute
Hot-add events 0
Hot-remove events 0
Lp stat events 0
Power glitches 0
Hard disk resets 1
Retries 1
Repaired sectors 0
Model: SAMSUNG HD400LJ
Serial: S0H2J1FL605124
Firmware: ZZ100-14
SMART Attribute
Raw Read Error Rate 1
Spin Up Time 7808
Start Stop Count 19521
Reallocated Sector Count 0
Seek Error Rate 0
Seek Time Performance 0
Power On Hours 9708
Spin Retry Count 0
Calibration Retry Count 0
Power Cycle Count 33
Airflow Temperature Cel 39
Temperature Celsius 39
Hardware ECC Recovered 439555273
Reallocated Event Count 0
Current Pending Sector 0
Offline Uncorrectable 0
UDMA CRC Error Count 0
Multi Zone Error Rate 0
Soft Read Error Rate 0
TA Increase Count 0
ATA Error Count 0
Extended Attribute
Hot-add events 0
Hot-remove events 0
Lp stat events 0
Power glitches 0
Hard disk resets 1
Retries 0
Repaired sectors 0
Model: SAMSUNG HD400LJ
Serial: S0H2J1FL605126
Firmware: ZZ100-14
SMART Attribute
Raw Read Error Rate 0
Spin Up Time 7936
Start Stop Count 19522
Reallocated Sector Count 0
Seek Error Rate 0
Seek Time Performance 0
Power On Hours 9364
Spin Retry Count 0
Calibration Retry Count 0
Power Cycle Count 34
Airflow Temperature Cel 37
Temperature Celsius 37
Hardware ECC Recovered 1207
Reallocated Event Count 0
Current Pending Sector 0
Offline Uncorrectable 0
UDMA CRC Error Count 0
Multi Zone Error Rate 0
Soft Read Error Rate 0
TA Increase Count 0
ATA Error Count 0
Extended Attribute
Hot-add events 0
Hot-remove events 0
Lp stat events 0
Power glitches 0
Hard disk resets 1
Retries 0
Repaired sectors 0
3 Replies
- StephenB (Guru - Experienced User)
On "Hardware ECC Recovered", see http://kb.acronis.com/content/9131:
Although this parameter is not considered critical by the most hardware vendors, degradation of this parameter may indicate electromechanical problems of the disk. Regular backup is recommended. If no other (critical) parameters report a problem, hardware replacement is recommended on mission critical systems only.
Aside from that, the first drive is developing some issues (reallocated plus pending sectors total 5), and you have the one raw read error on drive 2. That is a sign that disk 1 will need replacement soon. I'd probably order two replacement disks.
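As a rough illustration of the tally above, here is a minimal Python sketch that adds up reallocated and pending sectors per disk. The values are copied from the SMART dumps in the original post; treating the first dump as disk 1, and flagging any nonzero total, are assumptions made for the example rather than an official threshold.

```python
# SMART values copied from the three dumps in the original post.
# Assumption: the dumps are listed in slot order, so the first one is disk 1.
disks = {
    "S0H2J1FL605125": {"Reallocated Sector Count": 2,
                       "Current Pending Sector": 3,
                       "Raw Read Error Rate": 0},
    "S0H2J1FL605124": {"Reallocated Sector Count": 0,
                       "Current Pending Sector": 0,
                       "Raw Read Error Rate": 1},
    "S0H2J1FL605126": {"Reallocated Sector Count": 0,
                       "Current Pending Sector": 0,
                       "Raw Read Error Rate": 0},
}


def bad_sector_total(attrs):
    """Sum of sectors already remapped and sectors waiting to be remapped."""
    return attrs["Reallocated Sector Count"] + attrs["Current Pending Sector"]


# Per-disk totals, keyed by serial number.
totals = {serial: bad_sector_total(attrs) for serial, attrs in disks.items()}

for serial, total in totals.items():
    # Illustrative rule only: any nonzero total means "keep an eye on it".
    status = "watch closely" if total > 0 else "ok so far"
    print(f"{serial}: reallocated + pending = {total} ({status})")
```

Only the first serial ends up with a nonzero total, which matches the reading that disk 1 is the one to replace first.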
a) You are better off backing up the system with all drives in place. The redundancy should deal with any new failures.
b) If you have a suspect disk, try to avoid writing new data to the array.
Once you have a fresh backup, then I would first hot-swap the replacement drive for drive 1. I'd set the second replacement aside (after testing with manufacturer diags) for a little while, and see if any more issues develop with the other two disks.
A second strategy is to just get 3 new drives, do a factory default, and rebuild the NAS from scratch. That might give you more peace of mind, but is a bit more expensive.
Search the forums on the drive model before you purchase. There are certainly some models (both WDC and Seagate) that are on the HCL, but which seem to be problematic and best avoided. Based on what I am seeing, I personally am avoiding Seagate drives >= 2 TB, and am buying Western Digital RED drives.
- reklar (Aspirant)
Thanks for your reply, very helpful!
A while back I purchased four Seagate 1TB ST1000DM003 drives, intending to replace 3 and keep one as a spare. I did a quick Google search and didn't find much. Are these drives also on your do-not-recommend list? I don't mind purchasing more, and I can find another use for the new ones if necessary.
The existing drives are fairly old, although the number of power-on hours is surprisingly low, at about a year of service if I'm reading that right. I had to move and the unit was powered off for a couple of lengthy stretches, so maybe it's correct, or maybe the hours got reset at some point. Not sure.
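For what it's worth, the "about a year" reading can be checked with a quick conversion. This is just a sketch of the arithmetic using the Power On Hours values from the SMART data above, assuming the raw value is a plain hour count and ignoring leap days.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap days

# Power On Hours raw values from the three SMART dumps, keyed by serial.
power_on_hours = {
    "S0H2J1FL605125": 9874,
    "S0H2J1FL605124": 9708,
    "S0H2J1FL605126": 9364,
}


def hours_to_years(hours):
    """Convert a power-on hour count to years of continuous running."""
    return hours / HOURS_PER_YEAR


years = {serial: round(hours_to_years(h), 2) for serial, h in power_on_hours.items()}

for serial, y in years.items():
    print(f"{serial}: {y} years powered on")
```

All three drives come out at a little over one year of powered-on time, so the reading is consistent across the set.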
I still don't understand why the filesystem didn't come up after (1) powering down, (2) pulling a disk, and (3) powering up. Is there something special that needs to be done to get the filesystem back up in that scenario? I mean, isn't that the point of this RAID system: when one disk goes down or is not present, the other two are still there?
Fortunately, after killing all extraneous processes (Samba, FTP, FrontView, etc.), the backup has now been running successfully for about 12 hours. I'm maybe 1/3 of the way through, so a good start.
- StephenB (Guru - Experienced User)
I'm not using 1 TB drives. The larger cousins (ST2000DM001 and ST3000DM001) are on my avoid list, though I am not seeing issues with the ST1000DM003 reported here.
I am buying WDC "Red" drives for replacements myself.
The NAS should have booted with a drive removed; I don't know why it didn't.