

ensign_fodder
Aug 19, 2011

A sad tale of ProSupport woe with ReadyNAS Pro Pioneer!

So I bought and equipped a ReadyNAS Pro Pioneer (6 Bay BYOD) NAS on 7/2/2010 and added the ProSupport 24x7 Support and Maintenance upgrade.

Overall I have been pleased with the unit (started with 4 x 1TB 7.2K drives, added 2 more, then converted over to 1.5 TB drives over a few months) as a CIFS/UNC NAS device. I tried iSCSI for VMware but the performance was not there; I had teamed the NICs, though, so it may be an issue of overall congestion on the single interface.

What I have not been pleased about is the long wait times to reach a ProSupport tech, the times that the person answering was not in the storage group and had to put me back on hold till someone was available, that I cannot communicate with the phone support via the my.netgear.com section except to upload files, and that I cannot email the tech so I have to use the telephone which circles me back to the top of this set of complaints.

Original Unit - Hardware/backplane failure
It took three separate cases to get me back to an operational state.
  1. 16134961 RMA

  2. 16143235 Closed without my knowledge once phone call was complete

  3. 16144283 Additional help in getting resync to complete


    How it went down
    7/14/2011 - See dead drive on Web UI (no messages/alert had been sent) and swap in cold spare. I am running X2 with single redundancy.

    From Ticket/Tracking log for case # 16134961:
    Was doing maintenance on a user password and noticed on the volume tab (web interface) that there was a dead volume. Swapped cold spare for failed member but system did not detect and rebuild. ~10 minutes after the swap the system popped the dead drive warning, sent email, and such.

    Your RMA request for ReadyNASRNDP600E[XXXXXXXXXXXXX] has been approved by NETGEAR.
    Your RMA number is: XXXXXXX

    Completed RMA process 5:28 PM 7/14/2011


    When I asked when I would have the replacement unit, the support tech was a little rude and really could not have cared less. He said, "Since the RMA was not processed or approved until after 5 PM EST, the warehouse would not ship it until Friday for Monday delivery." When I explained that the only reason it was after 5 EST was the hold time, he said, "It's not like we can make the shipping department work overtime to get your unit out."

    Kimroth Y. at Tracelogix, who actually handles the RMA for NetGear, was absolutely outstanding! Kimroth was the person on the other end of the phone when I called the RMA status phone number, not realizing I was talking to someone at another company. When I explained my issue she explained how the process and procedures ensured I would not get my replacement before Monday. I begged and pleaded a little and she agreed to talk to her boss and see what could be done. From that point on she worked to make it happen, called to let me know it was going to ship, and sent me tracking information so I could follow it to my door.

    Once I had the replacement unit I had a great number of issues with the migration to the replacement unit due to incompatible firmware version and eventually had to put the "dead" drive back in the NAS and get it to build consistency across all drives before I could migrate.

    From Ticket/Tracking log for case # 16143235:
    Verify migration - just move drives in same slots, need to pull down configuration from website (currently does not complete), etc...?

    *NOTE: Need to bring replacement drive firmware up to 4.2.17 to match existing chassis.

    OS/Configs are RAID'ed across drives in hidden partition so migrating the drives migrates settings and such

    Currently 60 minutes into resync after power cycle of unit: 0% progress
    If no progress after 120 minutes, shut down unit and move "hot" drives only, then once running add "cold" spare
    If it progresses, wait until complete, then shut down unit and move all drives

    Waited past 120 minutes and, with it still at 0% complete, told the unit to shut down.

    Waited 30 minutes and now device is not responsive to web admin but front says "shutting down 0.2% Resync" so I assume it will have to sync to shut down.


    From Ticket/Tracking log for case # 16144283:
    Unable to boot replacement unit with a single drive inserted.

    Unable to boot replacement unit with five drives.

    Replaced drive and allowed to rebuild redundancy

    RAID sync started on volume C.

    [Fri Jul 15 17:51:52 EDT 2011]

    RAID scrubbing finished on volume C.

    [Sat Jul 16 17:48:04 EDT 2011]
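For a sense of scale, the two log timestamps above put roughly a day between the sync starting and the scrub finishing. A minimal sketch of that arithmetic, assuming both entries belong to the same rebuild and share the same time zone (the zone label is dropped before parsing, and the function and variable names are only illustrative):

```python
from datetime import datetime

LOG_FORMAT = "%a %b %d %H:%M:%S %Y"

def parse_log_time(stamp: str) -> datetime:
    """Parse a stamp like 'Fri Jul 15 17:51:52 EDT 2011', ignoring the zone label."""
    parts = stamp.split()
    # parts = [weekday, month, day, time, zone, year]; drop the zone at index 4
    return datetime.strptime(" ".join(parts[:4] + parts[5:]), LOG_FORMAT)

started = parse_log_time("Fri Jul 15 17:51:52 EDT 2011")
finished = parse_log_time("Sat Jul 16 17:48:04 EDT 2011")
elapsed = finished - started
print(elapsed)  # 23:56:12
```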


    So at this point I am back in business with a new chassis and life is good.

    Till today, when the same situation came up: NAS not available via CIFS/UNC.

    How it went down
    From Ticket/Tracking log for case # 16458558:
    Unit was not responding via UNC path
    Tried web interface and was able to log in but it never fully painted screen/UI
    Went and tried various short pushes of power button without any effect
    Did a long press of power button until unit shut down
    Powered unit back on and watched boot process
    When boot/sync completed displayed dead drive
    Logged into web interface and confirmed issue
    Still no email notice from the unit about issue
    23+ minutes of hold time on ProSupport line until being dumped into voice mail
    Called normal support line - had to work to get to case number and agent insisted they would get me to a person - several times on hold
    Call time +15 - Told by agent that she cannot keep me on hold any longer but she will pass me to the ProSupport line and write up an escalation ticket to make sure someone picks up the call or if no one does someone will call me back.
    Unit not responding on web interface but RAIDar shows unit - still able to access file shares
    Call time +27 minutes - Dumped into ProSupport voice mail
    Rebooting device and swapping disk 5 with prior disk 5 - which passes all Mfg Disk Tool test.
    Unit "passed" Disk 5 which had only had SMART error previously but is one that "failed" in prior unit
    Received emails from unit about removal/replacement
    Resynching on "new" drive but unable to access Web UI


    3 1/2 hours since first issue - no response from support.
    4 1/2 hours since first issue - no response from support.

    I decide to call the ProSupport line again and open web ticket #16460006
    Within a minute of submission of web ticket agent on ProSupport line answered.
    Tech was friendly and apologetic about prior issues and repeated failures to reach support and assured me he was going to work and resolve the case.

    After going over what had happened today (and previously on prior unit) and asking several questions the only guess at cause is a corrupted OS image that made it over from the prior system and I should reload from the flash OS on device which I cannot do until the resynch is complete.

    From Ticket/Tracking log for case # 16458558:
    Wait for full sync and then reload Firmware 4.2.17 then do OS reload on device (4.2.19 Beta T4 is on forums)


    So all told today I have spent hours listening to how much NetGear cares and how an agent will be with me in just a moment.

    I will update as the recovery/restoration completes.

25 Replies

  • mdgm wrote:
    Would still suggest keeping the Pro Pioneer for backups. Data stored on a single device is not backed up.


    True, but data you cannot get to after it is backed up is worse than having no backup at all!

    mdgm wrote:
    If you don't want to keep the Pro Pioneer you should be able to sell it either via the ReadyNAS Marketplace (http://www.readynas.com/forum/viewforum.php?f=33) or on a site such as eBay.


    I fully expect I could pass this turd on to another for a decent price but would that really be fair to someone else?

    If anything the possibility of using an ITX MB and the existing back-plane for a custom build is what I would want and if it dies in the process then I guess it was "the plan" from the storage gods!

    Thanks!
  • So here is the update:

    After dragging on and getting frustrated with a lack of progress and the latest suggestion being to try using it via iSCSI, I called a halt.

    I have spent too much time and trouble to try and identify and resolve an issue that I admit is a fringe case. To that end I did not want to continue the root cause research at my site and requested a "come to Jesus meeting" on how this problem was going to be resolved. Important people on the call and actions planned.

    0. Have Dev team review logs for anything missed - not expected but additional eyes don't hurt.
    1. Ship new shiny unit in case issue is related to unfixed/fixable OS image. Manually "copy" device/share/user configurations and then Rsync data from unit A to Unit B.
    2. Once comfortable that the data has been migrated, ship the bad unit off to be tortured in the lab until it is made to talk/reveal the issue.
  • ensign.fodder wrote:

    I have spent too much time and trouble to try and identify and resolve an issue that I admit is a fringe case. To that end I did not want to continue the root cause research at my site and requested a "come to Jesus meeting" on how this problem was going to be resolved. Important people on the call and actions planned.

    The major takeaway was that NetGear wanted the next "incident" to take place in their lab. Well, as of 20 minutes ago that goal has failed.
    ensign.fodder wrote:

    0. Have Dev team review logs for anything missed - not expected but additional eyes don't hurt.

    No idea if this happened or if anything was discovered.
    ensign.fodder wrote:

    1. Ship new shiny unit in case issue is related to unfixed/fixable OS image. Manually "copy" device/share/user configurations and then Rsync data from unit A to Unit B.

    So after six days of nothing I emailed to check and they had the chassis but were waiting on drives to fill it with. Sounds like my upgrade from Pro Pioneer to Pro Business might not be present on new chassis.
    ensign.fodder wrote:

    2. Once comfortable that the data has been migrated, ship the bad unit off to be tortured in the lab until it is made to talk/reveal the issue.

    Don't know if the unit will ever give up its secrets but at this stage of the game I just want to be f*cking done with the ReadyNAS and NetGear.
  • LapGear
    NETGEAR Employee Retired
    We are sorry the problem happened before we got everything in place. We knew it was a possibility but unfortunately couldn’t beat the clock.
    We will get the new unit to you as soon as possible, and thank you for sharing the log with us from this instance.
  • So to update and possibly close this thread.

    Finally received the new Pro 6 unit and drives, ran backup jobs to move shares and contents, confirmed file counts and sizes, shipped the old unit off, and registered the new unit 6/6/2012.
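For anyone repeating that count-and-size check after a migration, here is a minimal sketch, assuming both the old and new shares are mounted on the machine running it. The function names and mount paths are hypothetical. Matching counts and byte totals are only a sanity check and do not prove the contents are identical.

```python
import os
from typing import Tuple

def tree_summary(root: str) -> Tuple[int, int]:
    """Return (file_count, total_bytes) for regular files under root."""
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            count += 1
            total += os.path.getsize(path)
    return count, total

def shares_match(old_root: str, new_root: str) -> bool:
    """True when both trees have the same file count and total size."""
    return tree_summary(old_root) == tree_summary(new_root)

# Hypothetical usage, with both shares mounted locally:
# print(shares_match("/mnt/old_nas/share", "/mnt/new_nas/share"))
```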

    Results of torture testing in lab:
    The out of memory condition has been reproduced on the unit, although the unit itself has not yet completely locked up. Regardless, a developer is looking into the out of memory condition to see where it is coming from.


    It appears the method the software uses to read and write data via CIFS was causing a large number of files’ extended attributes to be read and re-read very quickly, with each read in turn attempting to allocate a large amount of physically contiguous memory for the operation. We’ve implemented a change so that, when this type of behavior occurs, the allocation falls back to a virtual memory allocation system that should be better suited to this type of operation.

    It should be noted that the Pro system I have in the lab has been throwing these allocation errors without hanging during my tests. While I do think that this is the source of the original issue, it is possible that the hangs are due to something else. I will be testing the code that has this change in it to verify that the allocation errors stop and that the unit remains stable.


    It looks like everything was OK after a little more than a week of heavy CIFS load, part of which came from backup software. I believe that the change made in the firmware to allow for backup software's specific type of access pattern will be implemented in our next firmware release.
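The allocation problem described in the lab notes can be pictured with a toy model. This is only an illustrative sketch, not NETGEAR's firmware code: memory is a list of pages, a "contiguous" request needs adjacent free pages, and the fallback accepts free pages wherever they are. All names here are made up for the example.

```python
from typing import List, Optional, Tuple

def alloc_contiguous(free: List[bool], n: int) -> Optional[List[int]]:
    """Return n adjacent free page indices, or None if no such run exists."""
    run = 0
    for i, is_free in enumerate(free):
        run = run + 1 if is_free else 0
        if run == n:
            return list(range(i - n + 1, i + 1))
    return None

def alloc_scattered(free: List[bool], n: int) -> Optional[List[int]]:
    """Return any n free page indices, adjacent or not, or None if too few."""
    pages = [i for i, is_free in enumerate(free) if is_free][:n]
    return pages if len(pages) == n else None

def alloc_with_fallback(free: List[bool], n: int) -> Tuple[Optional[List[int]], str]:
    """Try a contiguous allocation first, then fall back to scattered pages."""
    pages, kind = alloc_contiguous(free, n), "contiguous"
    if pages is None:
        pages, kind = alloc_scattered(free, n), "scattered"
    if pages is None:
        return None, "failed"
    for p in pages:
        free[p] = False  # mark the pages as in use
    return pages, kind

# Fragmented memory: every other page is already in use (True = free).
memory = [i % 2 == 0 for i in range(16)]
pages, kind = alloc_with_fallback(memory, 4)
print(kind, pages)  # scattered [0, 2, 4, 6]
```

In this model eight pages are free, yet a four-page contiguous request fails because no two free pages are adjacent. That is the kind of failure a fragmented system can hit under many large back-to-back requests, and the fallback succeeds where the contiguous attempt cannot.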


    So the new unit got the mentioned firmware a while back and I re-pointed the backup software at it and no issues on the ReadyNAS side since. For my existing advanced support contract they issued a new one, but that is another fight since:
    1. They originally said they could not transfer
    2. So I asked for new
    3. Then they could transfer but not extend the term
    4. So finally I made a big enough stink that they decided to issue a new contract

    But in the f*cking around trying to transfer the old one, my warranty now has components listed as expired on dates in the future.


    Standard Product Warranty:
    - Hardware Warranty: Available till 29/May/2017
    - Free Installation Support Warranty: Expired on 28/Aug/2012
    - Power Supply Warranty: Expired on 29/May/2017
    - Accessories Warranty: Expired on 29/May/2017

    Advanced Support / Hardware Contracts:
    OnCall 24x7 3 Yrs, Phone+NBD Replacement, Cat. 2 - Available till 30/May/2015


    So in summation, if you can reach the right people in the right places and you are not willing to let yourself be mistreated, mountains can be moved.

    Just ask yourself before you buy, should that be the way it is or should you be treated better?
