Re: Disk Failure Detected...

bokvast · ‎2011-12-05

Got an answer from an L3 that they are working on a fix, and it is expected to arrive in Q1 2012.

thestumper · ‎2011-12-05

That is reasonably good news. Q1 2012 is technically only a few weeks away. Of course, it could be the END of Q1, which would put us into March, but it's a good sign. Do you mind posting your case number so that I can reference it in mine?

My case #: 17077472

I worry that we could have various L3 techs running in different directions on this. Probably just paranoia, but if I get confirmation that there IS an acknowledged problem and it WILL be fixed, I will leverage another solution for Time Machine that will at least keep my NAS running. I don't want to let anyone off the hook until we know what the issue is and that it is solvable. My NAS is still sitting in Debug mode at the moment...

-Eric

bokvast · ‎2011-12-06

Yeah sure

17259922

opt2bout · ‎2011-12-06

thestumper wrote:
Does everyone here have a case open for this? I would be curious to see how many there are, plus it might be helpful to reference other cases in our own cases to keep support focussed. I would especially like to see what Pian finds out, as his unit failed when the tech was logged in!

Case open for 4 months now... since September. Still open. 4.2.20-T15 lasted 3 days... it seems 3 days and ... bam! Dead drive.

Pian · ‎2011-12-07

I am 16756789, case open since 23 Sep - but my L3 is pushing me to close the case even though I still have a problem.

He says that Engineering say that the problem is with Seagate, not Netgear. And Netgear may take the drive off the HCL if Seagate don't solve ...

opt2bout · ‎2011-12-07

He says that Engineering say that the problem is with Seagate, not Netgear. And Netgear may take the drive off the HCL if Seagate don't solve ...

Puts me in a hard place if this is true. Another comment earlier stated that there is a fix planned for Q1 2012. I have four of these drives. Only two installed--afraid to pursue replacing older/smaller drives with these.

However, a tech did refer to a link to a discussion on Seagate's user forum, but there is no official statement from Seagate. And to put things into perspective, if Netgear firmware 4.2.17 doesn't fail these drives (this all started with 4.2.19 and above), then how is it the drive's fault exactly? I put 4.2.17 back on and the drives lasted two months; only when I updated again to the release of 4.2.19 did the drives fail within a few days (4.2.18 actually worked but was retracted because of a security bug with AFP?).

I can't ship these drives back to anyone and ask for a replacement or refund, so I'm committed. I really hope Netgear's final position isn't to point fingers and walk away from this--but they should update their HCL so others won't follow us into this disaster.

thestumper · ‎2011-12-07

The problem lies with Netgear. Period. They put the drives on the HCL. They make these work, refund the drive purchase price, refund the NAS purchase price, or wind up in small claims court. I've already talked to an attorney on this. We didn't roll the dice with these drives; Netgear said they would work. I don't care if they take them off the HCL - I have copies of the original from when they were on there.

With that off my chest...

I had another drive failure last night, and this time, Time Machine was disabled. I was restoring (AGAIN) from another drive and it blew up under load. I guess I've been naive with my "Time Machine" connection - Time Machine creates load when there are a lot of changes, and this is when I would get the failures.

My L3 says that they have someone from Seagate coming to their facility to work with them on the issue. There's no exact time frame but I'm giving it two weeks. I can't afford to sit on a $1000 purchase (total) that is basically a paperweight now. With the price of drives at the moment, I'm not about to just pitch the old ones and buy new ones that are "on the HCL". How am I supposed to have any faith in the HCL at this point anyway?

Frustrating. The L3 techs have been helpful, but helpful doesn't solve the problem.

opt2bout · ‎2011-12-07

My L3 says that they have someone from Seagate coming to their facility to work with them on the issue.

That's good news. Hopefully that will give Netgear and/or Seagate enough info to either identify the combination as defective and provide us with an option to repair or replace the drives, or fix the problem with firmware (on either the drives or the NAS). Either way, it's a win and shows the commitment of both companies.

With the price of drives at the moment, I'm not about to just pitch the old ones and buy new ones that are "on the HCL". How am I supposed to have any faith in the HCL at this point anyway?

I hear ya. I had plans of making all drives the same (old school thoughts of RAID). But not going anywhere in that direction. And yes, what makes us think that a new FW release in the future won't invalidate the current approved drives!

But the fact that they appear to be working on the issue helps. If they could just go back to 4.2.17 with Lion support (for us Mac users) I would be happy. I don't need all of the video streaming stuff for a business storage unit anyway. I also wouldn't mind being reimbursed for my cost, hours and loss of use of the equipment. But not sure with all of the finger-pointing what it would take to firm up a class action to find remedies. We'd probably end up getting 5 cents on the dollar at best 😞

thestumper · ‎2011-12-07

Class action or small claims (individual) - it would cost more to fight it than to just make it go away. I'm still hoping for a fix; I've said all along I like the concept of the unit, and when it works it's great, but it hasn't worked properly from day one. I hate being a tool about this, but I've been left with no recourse and a $1000 paperweight.

bluewomble · ‎2011-12-08

My problems seem to have started again - I got a disk failure on channel 3 reported. Exactly the same story in the system log.

Interestingly, my NAS has been running with no problems for around 6 months, during which TimeMachine was broken on all my Macs (for unrelated reasons)... As soon as I fixed my sparse bundles to get TimeMachine working again, the NAS broke again. Circumstantially at least, it would appear that these problems are intimately related to TimeMachine.

In six months, nothing seems to have been fixed. I think I'm going to gradually replace these disks with different models... I've ordered 3 Samsung Spinpoint F4 HD204UI drives today to start that process.

bokvast · ‎2011-12-08

Doesn't seem like a bad idea... Gonna check with my local retailer if an exchange is possible.

raykj · ‎2011-12-15

Had a drive "failure" last night and contacted Netgear for RMA of hard drive (before noticing this thread). This is a RNDU2220 unit which came preloaded with two of the Seagate 2TB ST2000DL003 drives. I'm now guessing that replacing the drive will not really correct the problem.

Interestingly, the failure occurred when nothing was really happening (the unit should have been idling). It dropped the erroring drive out of the RAID.

The unit sent me an email at 6:25 in the morning. Interestingly, the kernel log shows some issue at 23:36 the night before:

The unit has been running fine for weeks with non-intensive use. Have a few shares, copied lots-o-data and an iSCSI connection (which, I'm guessing, had a little background activity from Win2008 R2 access).

ferg1 · ‎2011-12-17

I've just had another ST2000DL003 disc failure in my ReadyNAS Pro. This is the fifth drive that I'll be RMA'ing now. I've also had 4 further failures where I pulled the drive, rebooted, then added it again, which seems to bring the drives back to life.

roger_armstrong · ‎2011-12-22

We've got 6x Seagate ST2000DL003-9VT166 drives (firmware CC32) in an Ultra 6 Plus (on the HCL) and we've had two different drives reported dead within 1 month. We removed the drives and tested them and they were fine, so we put them back in and they rebuilt OK.

I hope there'll be a fix for this asap or at least some response from Netgear - enough other people are reporting problems with these drives to make the issue urgent.

roger_armstrong · ‎2011-12-22

Can anyone confirm the theory that reverting to 4.2.17 fixes the problem? Or that TimeMachine is to blame?

capaust · ‎2011-12-22

roger.armstrong wrote:
Can anyone confirm the theory that reverting to 4.2.17 fixes the problem? Or that TimeMachine is to blame?

I'm not sure about 4.2.17, but our problems with ST2000DL003 drives had nothing to do with Time Machine. We use our ReadyNAS as a file server and just having 25 employees accessing files at the same time caused drives to fail. As with many others here, a couple of resyncs and restarts would 'fix' the problem temporarily, but they would inevitably fail again. It appears that the problem has more to do with loads placed on the drives. Under regular daily use, they would fail intermittently, but I could consistently get them to fail if I ran a large file transfer using RichCopy.

We ended up having to buy different drives. We've had no trouble since.
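For anyone who wants to reproduce the load condition described above without RichCopy, here is a minimal sketch in Python. It is only an assumption about what "load" means here: several large files written concurrently to a share and then read back. The function names `write_and_verify` and `run_load`, the file names, and the default sizes are all made up for illustration; nothing here comes from Netgear or Seagate. Run it only against a volume you have backed up.

```python
import hashlib
import os
import tempfile
import threading


def write_and_verify(path, size_bytes, chunk_bytes=1024 * 1024):
    """Write size_bytes of random data to path, then read it back and
    compare SHA-256 digests. Returns True if the digests match."""
    written = hashlib.sha256()
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            chunk = os.urandom(min(chunk_bytes, remaining))
            f.write(chunk)
            written.update(chunk)
            remaining -= len(chunk)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out of its cache
    read_back = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_bytes), b""):
            read_back.update(chunk)
    return written.hexdigest() == read_back.hexdigest()


def run_load(target_dir, workers=4, size_bytes=64 * 1024 * 1024):
    """Run `workers` concurrent write-and-verify transfers into target_dir.

    Returns one boolean per worker (True = data read back intact).
    A worker that hits an I/O error leaves its entry as False.
    """
    results = [False] * workers

    def worker(i):
        # Hypothetical file name; each worker uses its own file.
        path = os.path.join(target_dir, f"loadtest_{i}.bin")
        try:
            results[i] = write_and_verify(path, size_bytes)
        finally:
            if os.path.exists(path):
                os.remove(path)  # clean up the test file

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Assumption: replace the temp dir with the path of a mounted NAS share.
    with tempfile.TemporaryDirectory() as d:
        print(run_load(d, workers=2, size_bytes=4 * 1024 * 1024))
```

Note that the read-back may be served from the client's cache, so a matching digest does not prove the data reached the disks intact. The point of the sketch is sustained concurrent I/O; whether a drive drops out still has to be checked in the NAS logs.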

roger_armstrong · ‎2011-12-23

Does anyone know from experience which other 2 or 3TB drives work reliably in an Ultra 6+? We're backing up over a TB every night to the ReadyNAS so we really need rock solid drives.

PapaBear1 · ‎2011-12-23

For 3TB disks I have had very good service from 4 Hitachi HDS5C3030ALA630 drives (2 each in 2 NVX units, plus 2x1TB Seagates in each as well). These are listed on most websites as PN 0S03230 as well as PN 0F12460 and PN 0S03228 (listed as the retail version). If you go to the Newegg website and click on the images of the drive and zoom in on the labels, you will see they are the same model. (Normal search only brings up 0S03230 and 0F12460, but if you enter "Hitachi 0S03228" in the search box it will show up and is in stock.) The four I got from Newegg are PN 0S03230 and I also got one from Amazon. (One of the 5 is a spare still in the sealed static bag.) Hitachi uses an opaque silver static bag so you cannot read the drive label until you open the sealed bag. The label on the bag shows only the PN; the drive label shows the model number but not the PN.

Mine have been in service for about 6 or 7 months without problems and the 1TB Seagates have been in service for 18 months, but one is now throwing ATA errors in the smart info, although I have had no data problems.

FWIW I am running 4.2.17 on both units and have since about a month after its release. NAS2 (NVX Pioneer) is the nightly backup of NAS1 (NVX Business Edition) which is my server. I have a home network consisting normally of 2 Win 7 custom-built desktops (me) and 1 Win 7 HP laptop. All three are running 64bit Home Premium. From time to time an old (2004) HP laptop with XP Home and an old (2003) HP D530 desktop with XP Pro are also on the network for special purposes only.

ferg1 · ‎2011-12-28

capaust wrote:
roger.armstrong wrote:
Can anyone confirm the theory that reverting to 4.2.17 fixes the problem? Or that TimeMachine is to blame?

I'm not sure about 4.2.17, but our problems with ST2000DL003 drives had nothing to do with Time Machine. We use our ReadyNAS as a file server and just having 25 employees accessing files at the same time caused drives to fail. As with many others here, a couple of resyncs and restarts would 'fix' the problem temporarily, but they would inevitably fail again. It appears that the problem has more to do with loads placed on the drives. Under regular daily use, they would fail intermittently, but I could consistently get them to fail if I ran a large file transfer using RichCopy.

I would concur with seeing this problem under high loads. I have seen the problem with TM and also with multiple concurrent large (1GB+) file transfers. TM does do a lot of disk access as there are a lot of very small (symlinked) files. I also saw this multiple times when initially going from a new 4 disc volume (raid5) to a 6 disc (raid6). Enough to cause me to factory reset the unit and start again with a new 6 disc volume.

With the high current price of hard drives I'm hoping for a fix in firmware in the new year as I cannot afford to replace the existing drives yet.

Cheers
Ferg

thestumper · ‎2012-01-04

It's load related. Time Machine generates load, but my last failure happened with Time Machine disabled. I was copying a large amount of data (music and video files) to the NAS when it happened. Just a drag-and-drop copy; nothing fancy. I just pinged support again. I took a break over the holidays because I had enough stress but I'm taking it up in earnest again. We're going on 8 pages of problems here, so they need to do something, but I'm not holding my breath because if Netgear can't fix it, they're liable due to the HCL posting. My guess is that if they can't find a fix, they'll procrastinate until someone takes action that's more serious than complaining on a forum or hassling the support techs 🙂 Honestly, I may end up selling the unit and just using the drives as stand-alone externals. I can stream video and music from my Mac on a couple, and use a couple as Time Machine targets. Not optimal, but if I can't use the unit, maybe someone can.

Or maybe disks will become cheap again some day 🙂

ferg1 · ‎2012-01-06

I've just had an additional drive fail when I had to power off due to a power cut. I have a raid6 Pro with 4 of these ST2000DL003 drives and 2 of a different type. The unit was previously in a single failed drive state which I've been talking to Support about. When booted back up the "failed disc" started resynching. It reached 65% and then froze. I cannot access the unit. RAIDar now indicates that two drives have failed. Frontview is unreachable. Note that the additional failed drive is also an ST2000DL003.

Of these four drives each one has been RMA'ed at least once (some twice). I must be into double figures of disc failures.

Pian · ‎2012-01-06

ferg wrote:
I've just had an additional drive fail when I had to power off due to a power cut. I have a raid6 Pro with 4 of these ST2000DL003 drives and 2 of a different type. The unit was previously in a single failed drive state which I've been talking to Support about. When booted back up the "failed disc" started resynching. It reached 65% and then froze. I cannot access the unit. RAIDar now indicates that two drives have failed. Frontview is unreachable. Note that the additional failed drive is also an ST2000DL003.

Of these four drives each one has been RMA'ed at least once (some twice). I must be into double figures of disc failures.

The scariest post I've read in quite a long time.

I'm in the ridiculous position of trying not to use data on my ReadyNAS in case I get a double failure too. And with disks not yet coming down in price it's as if I can hear a bomb ticking ...
😞

ferg1 · ‎2012-01-06

Pian wrote:
ferg wrote:
I've just had an additional drive fail when I had to power off due to a power cut. I have a raid6 Pro with 4 of these ST2000DL003 drives and 2 of a different type. The unit was previously in a single failed drive state which I've been talking to Support about. When booted back up the "failed disc" started resynching. It reached 65% and then froze. I cannot access the unit. RAIDar now indicates that two drives have failed. Frontview is unreachable. Note that the additional failed drive is also an ST2000DL003.

The scariest post I've read in quite a long time.

I'm in the ridiculous position of trying not to use data on my ReadyNAS in case I get a double failure too. And with disks not yet coming down in price it's as if I can hear a bomb ticking ...
😞

I hear you. Luckily for me I went with RAID6 when I realised just how unreliable the original five ST2000DL003 drives were, by purchasing a sixth different disc! Still it's highly probable that a third disc will fail and then I'm back into the task of putting 6GB's of backup back.

If discs were not now so expensive I would be at the shop now and putting these ST2000DL003 in the bonfire.

ferg1 · ‎2012-01-06

Unfortunately, Level2 have told me that the drive has known issues that they just cannot deal with. They recommend either exchanging the drive or waiting for a firmware fix from Seagate.

Not a happy customer.

bokvast · ‎2012-01-07

That just won't do! How are we supposed to just exchange the drives? What store in their right mind would take back the drives? Just leave them and buy new ones? Don't know about you guys but I'm definitely not made of money!

Do SOMETHING Netgear!!