
Forum Discussion

berillio
Aspirant
May 24, 2022

RN214 goes Offline and NICs may be dead

Hello RN Forum,

As you may remember from other posts (still on hiatus, sic), my current setup in Birmingham, UK consists of:

RN214a (4x WD40EFAX), FW 6.10.3, RAID 5

RN214b (3x WD80EFBX + 1x WD40EFRX), FW 6.10.3, RAID 5

RN424 (3x WD80EFAX + 1x WD80EFBX), FW 6.10.4 Hotfix1, RAID 5

The RN214s were purchased in April / May 2020 and the RN424 in April 2021

I went back to Italy for Xmas, but family commitments kept me there until now, mid-May.

While abroad, I very occasionally logged on to the NASs, and in February I noticed that RN214b had gone offline. I asked a friend to go and check, and she rebooted it (I instructed her to pull the plug, as it was unresponsive). I think I checked it again in April and it appeared to be “Online”, but it was offline when I arrived back home.

 

This situation seems to be similar to the one described in

RN21400 Suddenly Goes Offline but is actually running

https://community.netgear.com/t5/Using-your-ReadyNAS-in-Business/RN21400-Suddenly-Goes-Offline-but-is-actually-running/m-p/2113700#M192729

as all my NASs have dual NICs to allow bonding. In reality, I tried to implement bonding but failed on the RN214s. I thought that my “basic” router would not support bonding, so I purchased a GS32T; although bonding was successful on the RN424, both RN214s would go offline as soon as the second “bonded” NIC was connected. I planned to look into this further with a post on the forum, but I already had two posts open, so I put it off until later.

All NASs were left configured with dual-NIC bonding in place, but (unfortunately) eth1 was left unconnected on the RN214s, so both NICs were “in use and powered”. This was the situation for the last 12 to 18 months. No apps are installed on RN214b (Plex is on RN214a; I used it once and it is pretty unnecessary for my use).

I followed the advice in the post above, but unfortunately I did not get to the “happy ending”: the RN214b, now connected using a single unbonded NIC (on eth1 now, but there seems to be little difference between the two NICs), boots up, stays online for a matter of minutes, then becomes unavailable.

The fan does not seem to respond to the software settings either: even if I set it to “Cool”, the speed stays below 800 rpm. I tried swapping it for a seemingly identical fan from a dead RN104; when I rebooted RN214b I saw 2785 rpm, but five minutes later the data was inaccessible, and on the next reboot the fan was running below 800 rpm again. The unit is currently off and “naked” (no side panels).

 

This is the situation thus far.

 

Is there any possibility that a FW update may fix any of the issues (which, to my eyes, seem to be hardware issues)?

Any further test I could do?

 

Further comments

The RN214a is currently using eth0. Is it advisable to switch to eth1, if that seems to be less temperature sensitive, or is that only the case when both NICs are in a bonded state and “in use”?

Is it advisable to keep the double bonding on the RN424, given that the speed advantage is minimal, or should I just use a single unbonded NIC, alternating eth0 and eth1 once a year or so?

 

If the unit has suffered a terminal fault on both NICs and has to be considered dead, then I really don’t know what to do, because NETGEAR ReadyNAS seems to have disappeared from the UK market. Six months ago I could still find RN424s on Amazon.de, but that no longer seems to be the case, so the option of plugging the entire array into a new unit (214 or 424) no longer seems to be available to me.

 

If I am correct, I can switch off the RN214a, remove all its disks (ordered and labelled), load all the disks from the faulty RN214b, and power it up. The RN214a should read the full array. That should allow me to transfer all the data to a currently empty WD10EFAX. The disks could then be formatted and used somewhere else.

 

Thanks in advance for any help and suggestions,

Berillio

13 Replies


  • berillio wrote:

     

    Is it advisable to keep the double bonding on the RN424, given that the speed advantage is minimal

     


    Why (given that the speed advantage is minimal)?

     


    berillio wrote:

    I can switch off the RN214a, remove all disks (ordered & labelled), load all the disks from the faulty RN214b and power it up. The RN214a should read that full array. That should allow me to transfer all the data on a WD10EFAX currently empty.

     


    Correct.  You can also migrate the disks to the RN424 (or in the other direction) - though the system will need to switch the OS from arm->x86 (or vice versa) when you do that.

     


    berillio wrote:

     

    The RN214a is currently using eth0. Is it advisable to switch to eth1

    I don't think it matters though it would do no harm.

     


    berillio wrote:

    The RN214s were purchased in April / May 2020 and the RN424 in April 2021

     


    The hardware warranty is 3 years, so you could request an RMA for RN214b.

    • berillio
      Aspirant

      Thank you Stephen B;

      “ Correct.  You can also migrate the disks to the RN424 (or in the other direction) - though the system will need to switch the OS from arm->x86 (or vice versa) when you do that.”

       

      I went for a simple array migration to the RN214a (which now calls itself RN214b, but using a different IP).

      Unfortunately it showed the same problem as the previous “Unit b”. The file system came up, but very slowly: RAIDar showed the unit online, but the Admin and Browse icons did not appear for at least five minutes.

      Then I started a full data copy (minus the snapshots) with TeraCopy to the WD10 target drive, but it did not start because the target drive was too small by 76 GB. I only realised that 2 hours later when I checked, and by then the unit was frozen. I unplugged it and restarted, only to see a CPU temperature of 71° and likewise extremely high temperatures for the drives. OUCH.
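A copy that fails to start because the target is too small can be caught up front by comparing the size of the source tree with the free space on the target. The following is only a minimal sketch, assuming the NAS share and the target drive are both mounted on the PC doing the copy; the function names `tree_size_bytes` and `shortfall_bytes` are hypothetical and have nothing to do with TeraCopy itself.

```python
import os
import shutil


def tree_size_bytes(root):
    """Sum the sizes of all regular files below `root` (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def shortfall_bytes(source_root, target_path):
    """Return how many bytes the target is short by, or 0 if the data fits."""
    free = shutil.disk_usage(target_path).free
    return max(0, tree_size_bytes(source_root) - free)
```

A non-zero result from `shortfall_bytes` means the copy cannot complete as planned; file system overhead on the target is not accounted for, so a small safety margin is advisable.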

      I let it cool down for 2 or 3 hours, then managed to transfer 78 GB of data before it hung. This morning I tried to copy one folder; data transfer speeds were ~104 MB/s, but it froze 10 seconds before the end. This evening, the file system was up for a matter of seconds before hourglassing; the unit hung, although the temperatures were below 30° all around. (Incidentally, I moved the unit to a more “exposed” position and removed the side cheeks and top panel to allow more air in; the drives were also removed, left on the desk to cool down, and reinserted just before rebooting.)

      Now I no longer know what to think.

      Should I return the RN214a array to the “Unit A” and check if that is still functional?

      Should I instead test the RN214a array in “Unit B” to see if that hardware is faulty as I assumed it was?

      Should I presume that the 6.10.3 FW on the RN214b array has somehow become corrupted, upgrade it to (say) 6.10.4, and see if an uncorrupted firmware can read the existing file system?

      Should I try the RN214b array in the RN424, in case more robust hardware (also with a much bigger fan) can read the file system? That would imply a firmware change anyway (arm to x86_64), so it would basically be the previous option plus a hardware advantage.

       

      Thanks to everybody in advance

       

      P.S. This is the content of diskinfo.log from the log download taken before switching the array to the RN214a unit:

       

      Device:             sda

      Controller:         0

      Channel:            0

      Model:              WDC WD80EFBX-68AZZN0

      Serial:             VRHBHJRK

      Firmware:           85.00A85W

      Class:              SATA

      RPM:                7200

      Sectors:            15628053168

      Pool:               data

      PoolType:           RAID 5

      PoolState:          1

      PoolHostId:         1132353a

      Health data

        ATA Error Count:                0

        Reallocated Sectors:            0

        Reallocation Events:            0

        Spin Retry Count:               0

        Current Pending Sector Count:   0

        Uncorrectable Sector Count:     0

        Temperature:                    41

        Start/Stop Count:               19

        Power-On Hours:                 4764

        Power Cycle Count:              19

        Load Cycle Count:               215

       

      Device:             sdb

      Controller:         0

      Channel:            1

      Model:              WDC WD80EFBX-68AZZN0

      Serial:             VRHBMEDK

      Firmware:           85.00A85W

      Class:              SATA

      RPM:                7200

      Sectors:            15628053168

      Pool:               data

      PoolType:           RAID 5

      PoolState:          1

      PoolHostId:         1132353a

      Health data

        ATA Error Count:                0

        Reallocated Sectors:            0

        Reallocation Events:            0

        Spin Retry Count:               0

        Current Pending Sector Count:   0

        Uncorrectable Sector Count:     0

        Temperature:                    44

        Start/Stop Count:               18

        Power-On Hours:                 4744

        Power Cycle Count:              18

        Load Cycle Count:               213

       

      Device:             sdc

      Controller:         0

      Channel:            2

      Model:              WDC WD80EFBX-68AZZN0

      Serial:             VRGR7MNK

      Firmware:           85.00A85W

      Class:              SATA

      RPM:                7200

      Sectors:            15628053168

      Pool:               data

      PoolType:           RAID 5

      PoolState:          1

      PoolHostId:         1132353a

      Health data

        ATA Error Count:                0

        Reallocated Sectors:            0

        Reallocation Events:            0

        Spin Retry Count:               0

        Current Pending Sector Count:   0

        Uncorrectable Sector Count:     0

        Temperature:                    43

        Start/Stop Count:               17

        Power-On Hours:                 4615

        Power Cycle Count:              17

        Load Cycle Count:               207

       

      Device:             sdd

      Controller:         0

      Channel:            3

      Model:              WDC WD40EFRX-68N32N0

      Serial:             WD-WCC7K6YX6PYY

      Firmware:           82.00A82W

      Class:              SATA

      RPM:                5400

      Sectors:            7814037168

      Pool:               data

      PoolType:           RAID 5

      PoolState:          1

      PoolHostId:         1132353a

      Health data

        ATA Error Count:                0

        Reallocated Sectors:            0

        Reallocation Events:            0

        Spin Retry Count:               0

        Current Pending Sector Count:   0

        Uncorrectable Sector Count:     0

        Temperature:                    33

        Start/Stop Count:               1158

        Power-On Hours:                 17642

        Power Cycle Count:              78

        Load Cycle Count:               1277
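For anyone wanting to scan a diskinfo.log like the one above without reading every field, here is a minimal Python sketch. It assumes the file keeps the `Key: value` layout shown in this post, with each disk starting at a `Device:` line; the function names and the choice of which counters to flag are my own assumptions, not anything NETGEAR provides.

```python
# Counters from the "Health data" block that would normally be expected to be 0.
ERROR_KEYS = (
    "ATA Error Count",
    "Reallocated Sectors",
    "Reallocation Events",
    "Spin Retry Count",
    "Current Pending Sector Count",
    "Uncorrectable Sector Count",
)


def parse_diskinfo(text):
    """Split diskinfo.log text into one dict of string values per disk."""
    disks = []
    current = None
    for line in text.splitlines():
        if ":" not in line:
            continue  # blank lines and the "Health data" heading
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Device":
            current = {}
            disks.append(current)
        if current is not None and value:
            current[key] = value
    return disks


def nonzero_counters(disk):
    """Return the error counters of one disk that are not zero."""
    return {k: int(disk[k]) for k in ERROR_KEYS if int(disk.get(k, "0")) != 0}


# Abbreviated sample taken from the log above (sda and sdd only).
sample = """
Device:             sda
Model:              WDC WD80EFBX-68AZZN0
Health data
  ATA Error Count:                0
  Reallocated Sectors:            0
  Temperature:                    41
  Power-On Hours:                 4764

Device:             sdd
Model:              WDC WD40EFRX-68N32N0
Health data
  ATA Error Count:                0
  Reallocated Sectors:            0
  Temperature:                    33
  Power-On Hours:                 17642
"""

disks = parse_diskinfo(sample)
for disk in disks:
    print(disk["Device"], disk["Temperature"], nonzero_counters(disk))
```

An empty dict for a disk only means none of the listed counters is above zero; it says nothing about problems that SMART does not record.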

      • StephenB
        Guru

        berillio wrote:

         

        I went for a simple array migration to the RN214a (which now calls itself RN214b, but using a different IP).

        Unfortunately it showed the same problem as the previous “Unit b”. The file system came up but very slowly. RAIDar showed the unit online, but not the Admin & Browse icons for at least five minutes.

         

        So the disks in RN214b cause the same problem when migrated to RN214a.

        I would next try the RN214a disks in RN214b, and confirm that the problem doesn't occur in RN214b with RN214a's disks.

         

        I'd also take a look at the OS partition fullness (not that likely to be the issue, but easy to check).  Look in volume.log, and scroll down to the df -h section. /dev/md0 is the OS partition.

        === df -h ===
        Filesystem      Size  Used Avail Use% Mounted on
        udev             10M  4.0K   10M   1% /dev
        /dev/md0        3.7G  633M  2.9G  18% /
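The same check can be scripted. This is only a rough sketch, assuming the `=== df -h ===` section heading and column layout shown above, and assuming other sections of volume.log start with a similar `===` line; the function name `os_partition_usage` is hypothetical.

```python
def os_partition_usage(volume_log_text, device="/dev/md0"):
    """Return (size, used, avail, use_percent) for `device` from the
    df -h section of volume.log, or None if it is not found."""
    in_df = False
    for line in volume_log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("==="):
            # Track whether we are inside the df -h section.
            in_df = stripped == "=== df -h ==="
            continue
        if in_df:
            fields = stripped.split()
            if len(fields) >= 6 and fields[0] == device:
                return fields[1], fields[2], fields[3], int(fields[4].rstrip("%"))
    return None


# Sample copied from the df -h excerpt above.
sample = """=== df -h ===
Filesystem      Size  Used Avail Use% Mounted on
udev             10M  4.0K   10M   1% /dev
/dev/md0        3.7G  633M  2.9G  18% /
"""

usage = os_partition_usage(sample)
print(usage)
```

To run it against a real log download, read volume.log into a string and pass it to the function; a use percentage close to 100 would be worth mentioning in the thread.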

         

        Did you have any apps running in RN214b?

         

        It might be worth asking a mod ( Marc_V or JeraldM ) to review the entire log zip of the problem system.
