RTSwiss
Aspirant
Jan 30, 2013

loss of array after drive failure; problems with ST3500320AS

At the top, a modest complaint about the operation of the forum. I started a message several hours ago, and then got interrupted. To be on the safe side I copied the text out to Notepad but left the browser idling in the forum. When I returned and tried entering additional text the system allowed me to proceed without any problem, so I assumed my session was still active, and proceeded to complete a relatively long post. When I hit "Preview" the system required me to log in again, and when I did the entire post was lost. Surely the system can be set up so that if you have been logged out of a session you are prevented from continuing to write a post.

I am not sure exactly which forum this belongs in. Device is a 4+ year old ReadyNAS NV+, originally shipped with 2x ST3500630NS, both still going strong after 40,000+ hours. I added a third Seagate drive at the beginning, and the device has gone through 2-3 of them in four years. Mostly they have been Seagate ST3500320NS, which are on the HCL but which the machine seems to chew up. They also tend to fail without warning, which happened to me last fall and happened again this morning. So I do wonder just how NV+ (V1) compatible these really are.

First indication of failure was inability to access shares. Device responded to ping, but could not be seen by Raidar or accessed via Frontview. Last fall a power supply failed, and after I replaced it the device refused to boot (it started, then reported "Checking FS" and hung there). From that episode I learned that the device also contained a bad drive (which Netgear insisted was unrelated to the PS failure), and which, I recall, was identified by checking its behaviour drive by drive. I recalled, perhaps incorrectly, that one could remove the drives one by one to see if that cured the problem. So I removed drive 3, a 630NS, and the problem persisted; I replaced that and removed drive 1 (the 320AS), and the machine booted. It did not, however, show any drive LEDs; after booting the display showed "Drive C: 0/0MB free"; the device could be pinged; it could be seen on Raidar (showing drives 2 and 3 present, but no volume); but it could not be accessed by Frontview.

I then added a replacement 500 GB drive (another 320AS), and restarted again, with the same results -- no lit drive LEDs, visible on Raidar (showing 3 drives and no volume) -- but now it could be accessed by Frontview. But the only available shares are those on a pair of attached USB devices used to make partial backups of the NAS. The Volumes tab under Frontview shows 463 GB free on drives 1 (320AS) and 2 (630NS not diagnostically moved) and 0 GB free on drive 3, the 630NS that was removed while trying to identify the bad drive. I should add that before any of this happened, the device reported a capacity of 907 GB with around 500 GB used.
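As a rough sanity check on the figures above, here is a minimal sketch of the capacity arithmetic. It assumes (not confirmed in this thread) that a three-drive X-RAID volume gives up one drive's worth of space to parity, and that the device reports sizes in binary units; `usable_gib` is a hypothetical helper name.

```python
GB = 10**9   # decimal gigabyte, as printed on the drive label
GIB = 2**30  # binary gigabyte, assumed to be what the NAS reports

def usable_gib(drive_bytes: int, n_drives: int) -> float:
    """Usable space of a single-parity array: one drive's worth goes to parity."""
    if n_drives < 2:
        raise ValueError("single parity needs at least two drives")
    return (n_drives - 1) * drive_bytes / GIB

per_drive = 500 * GB / GIB
print(round(per_drive, 1))                 # 465.7
print(round(usable_gib(500 * GB, 3), 1))   # 931.3
```

Under those assumptions each 500 GB drive holds about 465.7 GiB, close to the 463 GB shown per drive, and the raw two-drive total of about 931 GiB sits a little above the 907 GB the device reported. The gap is presumably reserved space and filesystem overhead, though that is a guess.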

So my question is this. Did I, by removing one of the good drives and trying to restart the machine, irretrievably destroy the array, the volume, and the shares that it contained? I would have thought that, unless the machine tried to write to the drives while checking the file system, reinsertion of the non-failed drive would have allowed the array to be recovered. That's what happened following my PS and drive failure last autumn, but at least thus far it does not seem so in this case.
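To make the reasoning behind the question concrete, here is a minimal sketch of single-parity redundancy. It assumes a three-drive X-RAID volume behaves like a RAID 5-style stripe (two data blocks plus one XOR parity block); the block contents and the helper names `xor_blocks` and `can_assemble` are purely illustrative, and the real on-disk layout is not described in this thread.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def can_assemble(drives_present: int, drives_total: int = 3) -> bool:
    """A single-parity array can run with at most one member missing."""
    return drives_present >= drives_total - 1

# One stripe across three drives: two data blocks plus one parity block.
d1 = b"readynas"
d2 = b"nvplus01"
parity = xor_blocks(d1, d2)

# With one drive gone, the missing block is rebuilt from the other two.
assert xor_blocks(d2, parity) == d1
assert xor_blocks(d1, parity) == d2

# Pulling a second drive leaves too few members to assemble the volume.
print(can_assemble(2))  # True
print(can_assemble(1))  # False
```

On this model, pulling a healthy drive while another has already failed makes the volume unavailable rather than erasing it: if nothing was written in the meantime, the remaining good blocks are unchanged and the array can in principle be reassembled once the healthy drive is back.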

Any insight any of the experts out there might offer would be appreciated.

Thanks.

-- Ted

14 Replies

  • So I heard from tech support (L2) in Manila, and found myself dealing with someone who must have been recently trained, and who (a) claimed that I could not have been asked to open the unit and replace the PSU last August (not true), (b) then, after looking at the service records, claimed that it must have been the chassis, not the PSU, that I had replaced (not true), and finally (c) claimed that there should have been no redundancy after the loss of one drive out of a three-drive X-RAID array (also not true, and later corrected by her). When I made clear that the machine had interacted badly with now at least 3 (possibly 4) ST3500320AS's, she was on the verge of replacing the unit, which I suggested would not necessarily solve my principal problem (recovering the volume from the remaining two drives), and seemed not warranted, as the machine had started up fine with just a new 1TB WD.

    Eventually she got back to me, telling me that she could escalate to L3 support on payment of $150 for a 1-year service contract (or something over $300 for a three-year contract). When I asked whether, for the money, tech support would guarantee recovery, her answer was no, but after consulting with someone else she informed me that half the charge would be refunded if they couldn't. When I then asked what help this would buy me in figuring out why the device was chewing up 320AS's -- a defective unit, or a defective slot, or maybe just a not widely advertised incompatibility with a drive on the HCL -- she didn't bother answering, and that was a day and a half ago.

    So I tried following the directions for an OS reinstall: depress the reset switch on the upper right side of the back panel for around 5 seconds (I counted 6), then power up. The machine booted, at some point in the process reported "Checking root FS," but did not do anything that obviously seemed like a reinstall, and ended up in the same non-functional state: device appears on Raidar with lights for each drive but none for a volume; no LEDs lit on the device itself; accessible via Frontview, but no volumes. On the other hand, it appears to respond to my old password, so I assume I have not succeeded at the reinstall. Correct? What did I do wrong?

    I'd be happy if necessary to pay tech support, but only with the understanding that this is for recovery of the original volume, not just their best efforts; and also that they enlighten me about the cause of repeated crashes with these same model drives, and actually replace the unit if it turns out to be a defect in the machine itself. It was, after all, shipped to me five years ago containing a known latent defect -- the PSU -- and at this point I would not put it past Netgear to be indulging in a lack of candor regarding difficulties with this particular model drive.
  • mdgm -- I just noticed that you were online; it was you who helped me through the PS failure last August. Any chance you might have any insight to offer on what has happened this time, or on the repeated failure of this one model drive?

    Thanks.
  • StephenB
    Guru - Experienced User
    On the OS reinstall - is this what you tried???

    1. Power off your system.
    2. Using a straightened paper clip, press and hold the Reset button.
    3. Press and release the Power button to power on the system.
    4. To perform an OS reinstall, continue to hold the Reset button until all Disk LEDs flash once, after about 5 seconds, and then release the button.
  • Six weeks later . . . I was advised not to try an OS reinstall, and opted to contact Netgear support. After listening to the story, an L2 tech in Manila thought the issue should be escalated, and was of the opinion that it could be solved if escalated, but Netgear would not process the request without my purchasing a warranty upgrade ($150/1 year, $300/3 years) that covered all original hardware (including the two original drives) and software, with NBD replacement. The deal was, however, that if they were unable to recover the data the charge would be limited to a one-time charge of $75. Reckoning the $$ to be less important than the time, especially with the downside limited if they could not recover the data, I agreed. It took 3 passes at an L3 tech telnetting into the device, but they were in fact able to recover the (non-redundant) existing array on the remaining two drives, and after the machine restarted I hot-added a third drive and the device resynced and is now back to normal.

    Costly, but the device is back online and under warranty for three years, and I am left with a good impression of NG's support and not feeling at all badly treated. The L2 tech in Manila was quite assiduous in tracking this down and getting it done (at an otherwise very busy and distracted time for me). And that six-week delay was almost entirely me, not Netgear.

    Thanks to all who offered advice, especially mdgm.
