Esoteric
Jul 26, 2021
Solved

Remove inactive volumes after hard drive upgrade RN104

I have an iMac with a RN104 to keep track of all my design/art files. It was starting to fill up so I upgraded and bought 2 x 4tb hard drives to replace the ones in my RN104.   I put in 1 x 4tb dri...
  • rn_enthusiast
    Jul 26, 2021

    Thanks for the logs, Esoteric.

    So, here are the events...

    You replaced disk 2, which was also a dying disk with tons of ATA errors. I can see that had been going on for a while, so I suspect you don't have email alerts set up? In any case, whether by luck or by intention, you replaced the bad disk and the RAID started to sync.

    [21/07/22 20:51:22 WEST] notice:disk:LOGMSG_SMART_ATA_ERR_30DAYS_WARN Detected increasing ATA error count: [13955] on disk 2 (Internal) [WDC WD20EFRX-68EUZN0, WD-WCC4M1808373] 11455 times in the past 30 days. This condition often indicates an impending failure. Be prepared to replace this disk to maintain data redundancy.
    [21/07/22 20:51:29 WEST] notice:disk:LOGMSG_SMART_ATA_ERR_30DAYS_WARN Detected increasing ATA error count: [13955] on disk 2 (Internal) [WDC WD20EFRX-68EUZN0, WD-WCC4M1808373] 11455 times in the past 30 days. This condition often indicates an impending failure. Be prepared to replace this disk to maintain data redundancy.
    [21/07/22 20:52:12 WEST] notice:disk:LOGMSG_SMART_ATA_ERR_30DAYS_WARN Detected increasing ATA error count: [13956] on disk 2 (Internal) [WDC WD20EFRX-68EUZN0, WD-WCC4M1808373] 11456 times in the past 30 days. This condition often indicates an impending failure. Be prepared to replace this disk to maintain data redundancy.
    [21/07/22 20:56:20 WEST] notice:disk:LOGMSG_SMART_ATA_ERR_30DAYS_WARN Detected increasing ATA error count: [13956] on disk 2 (Internal) [WDC WD20EFRX-68EUZN0, WD-WCC4M1808373] 11456 times in the past 30 days. This condition often indicates an impending failure. Be prepared to replace this disk to maintain data redundancy.
    [21/07/22 20:57:46 WEST] warning:disk:LOGMSG_DELETE_DISK Disk Model:WDC WD20EFRX-68EUZN0 Serial:WD-WCC4M1808373 was removed from Channel 2 of the head unit.
    [21/07/22 20:57:54 WEST] warning:volume:LOGMSG_HEALTH_VOLUME Volume data health changed from Redundant to Degraded.
    [21/07/22 20:58:45 WEST] notice:disk:LOGMSG_ADD_DISK Disk Model: ST4000VN008-2DR166 Serial:ZGY94XWV was added to Channel 2 of the head unit.
    [21/07/22 20:59:28 WEST] notice:volume:LOGMSG_RESILVERSTARTED_VOLUME Resyncing started for Volume data.

    The RAID synced successfully. BTW, StephenB, in my experience it is pretty normal for an RN104 to take this long for a RAID sync.

    [21/07/23 12:46:10 WEST] notice:volume:LOGMSG_RESILVERCOMPLETE_VOLUME Volume data is resynced.
    [21/07/23 12:46:11 WEST] notice:volume:LOGMSG_HEALTH_VOLUME Volume data health changed from Degraded to Redundant.
    [21/07/23 12:46:11 WEST] notice:disk:LOGMSG_ZFS_DISK_STATUS_CHANGED Disk in channel 2 (Internal) changed state from RESYNC to ONLINE.
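
    For reference, the length of that first sync can be worked out from the two log timestamps. This is only a small sketch: the function name is made up for illustration, and it assumes both timestamps are in the same timezone (WEST), so the zone is ignored.

```python
from datetime import datetime

# Timestamp layout used in the log excerpts: yy/mm/dd hh:mm:ss
LOG_TS_FORMAT = "%y/%m/%d %H:%M:%S"

def resync_duration(started, completed):
    """Return the elapsed time between two log timestamps as a timedelta."""
    start = datetime.strptime(started, LOG_TS_FORMAT)
    end = datetime.strptime(completed, LOG_TS_FORMAT)
    return end - start

# Timestamps copied from the RESILVERSTARTED / RESILVERCOMPLETE lines above.
elapsed = resync_duration("21/07/22 20:59:28", "21/07/23 12:46:10")
print(elapsed)  # 15:46:42
```

    So the sync onto the first 4 TB disk took just under 16 hours.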

    You correctly waited until the RAID had synced, and then replaced disk 1 with a new, larger disk.

    [21/07/23 13:54:23 WEST] warning:disk:LOGMSG_DELETE_DISK Disk Model:WDC WD10EFRX-68PJCN0 Serial:WD-WCC4J1972758 was removed from Channel 1 of the head unit.
    [21/07/23 13:54:25 WEST] warning:volume:LOGMSG_HEALTH_VOLUME Volume data health changed from Redundant to Degraded.
    [21/07/23 13:56:20 WEST] notice:disk:LOGMSG_ADD_DISK Disk Model: ST4000VN008-2DR166 Serial:ZGY94KA2 was added to Channel 1 of the head unit.
    [21/07/23 13:56:36 WEST] notice:volume:LOGMSG_RESILVERSTARTED_VOLUME Resyncing started for Volume data.

    At this point, you are still good.

    But then, 3 minutes later, we see multiple disks being pulled and added. At this point the RAID would have stopped, since that is essentially a multiple-disk failure during a RAID resync. Do you know why this happened? Were you pulling these disks in and out?

    [21/07/23 13:59:55 WEST] notice:disk:LOGMSG_ADD_DISK Disk Model:WDC WD10EFRX-68PJCN0 Serial:WD-WCC4J1977774 was added to Channel 3 of the head unit.
    [21/07/23 13:59:56 WEST] warning:disk:LOGMSG_DELETE_DISK Disk Model:WDC WD10EFRX-68PJCN0 Serial:WD-WCC4J1977774 was removed from Channel 3 of the head unit.
    [21/07/23 14:03:02 WEST] notice:disk:LOGMSG_ADD_DISK Disk Model:WDC WD10EFRX-68PJCN0 Serial:WD-WCC4J1972758 was added to Channel 4 of the head unit.
    [21/07/23 14:03:44 WEST] warning:disk:LOGMSG_DELETE_DISK Disk Model:WDC WD10EFRX-68PJCN0 Serial:WD-WCC4J1972758 was removed from Channel 4 of the head unit.
    [21/07/23 14:04:44 WEST] warning:disk:LOGMSG_DELETE_DISK Disk Model: ST4000VN008-2DR166 Serial:ZGY94KA2 was removed from Channel 1 of the head unit.
    [21/07/23 14:05:09 WEST] notice:disk:LOGMSG_ADD_DISK Disk Model:WDC WD10EFRX-68PJCN0 Serial:WD-WCC4J1972758 was added to Channel 1 of the head unit.
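
    If you want to spot this kind of thing yourself, the add/remove lines are easy to filter out of the log. Below is a rough sketch, not an official tool: the regular expression is written only against the lines quoted in this thread, the function names are made up for illustration, and the 10-minute window is an arbitrary choice.

```python
import re
from datetime import datetime, timedelta

# Matches the LOGMSG_ADD_DISK / LOGMSG_DELETE_DISK lines quoted in this thread.
DISK_EVENT = re.compile(
    r"\[(?P<ts>\d\d/\d\d/\d\d \d\d:\d\d:\d\d) \w+\] "
    r"\w+:disk:LOGMSG_(?P<kind>ADD|DELETE)_DISK "
    r"Disk Model:\s*(?P<model>.+?) Serial:(?P<serial>\S+) "
    r"was (?:added to|removed from) Channel (?P<channel>\d+)"
)

def parse_disk_events(lines):
    """Return (timestamp, kind, channel, serial) for each add/remove line."""
    events = []
    for line in lines:
        match = DISK_EVENT.search(line)
        if match:
            ts = datetime.strptime(match["ts"], "%y/%m/%d %H:%M:%S")
            events.append((ts, match["kind"], int(match["channel"]), match["serial"]))
    return events

def events_within(events, start, window):
    """Keep the events that fall inside [start, start + window]."""
    return [e for e in events if start <= e[0] <= start + window]

# The six lines above, copied verbatim.
log_lines = [
    "[21/07/23 13:59:55 WEST] notice:disk:LOGMSG_ADD_DISK Disk Model:WDC WD10EFRX-68PJCN0 Serial:WD-WCC4J1977774 was added to Channel 3 of the head unit.",
    "[21/07/23 13:59:56 WEST] warning:disk:LOGMSG_DELETE_DISK Disk Model:WDC WD10EFRX-68PJCN0 Serial:WD-WCC4J1977774 was removed from Channel 3 of the head unit.",
    "[21/07/23 14:03:02 WEST] notice:disk:LOGMSG_ADD_DISK Disk Model:WDC WD10EFRX-68PJCN0 Serial:WD-WCC4J1972758 was added to Channel 4 of the head unit.",
    "[21/07/23 14:03:44 WEST] warning:disk:LOGMSG_DELETE_DISK Disk Model:WDC WD10EFRX-68PJCN0 Serial:WD-WCC4J1972758 was removed from Channel 4 of the head unit.",
    "[21/07/23 14:04:44 WEST] warning:disk:LOGMSG_DELETE_DISK Disk Model: ST4000VN008-2DR166 Serial:ZGY94KA2 was removed from Channel 1 of the head unit.",
    "[21/07/23 14:05:09 WEST] notice:disk:LOGMSG_ADD_DISK Disk Model:WDC WD10EFRX-68PJCN0 Serial:WD-WCC4J1972758 was added to Channel 1 of the head unit.",
]

# The resync onto the new disk 1 started at 13:56:36.
resync_started = datetime(2021, 7, 23, 13, 56, 36)
suspect = events_within(parse_disk_events(log_lines), resync_started, timedelta(minutes=10))
for ts, kind, channel, serial in suspect:
    print(ts.time(), kind, "channel", channel, serial)
```

    All six of these events land inside the first 10 minutes of the resync, across channels 1, 3 and 4.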

    Following this, we see multiple drives again being pulled and re-added, plus several reboots and shutdowns, and even the old bad disk 2 being added back in. I assume this was part of the troubleshooting you described in your original post.

    [21/07/23 14:16:24 WEST] info:system:LOGMSG_READYNASD_ABORTED_NOINFO ReadyNASOS service or process was restarted.
    [21/07/23 14:16:57 WEST] info:system:LOGMSG_START_READYNASD ReadyNASOS background service started.
    [21/07/23 14:20:09 WEST] notice:disk:LOGMSG_ADD_DISK Disk Model:WDC WD10EFRX-68PJCN0 Serial:WD-WCC4J1977774 was added to Channel 1 of the head unit.
    [21/07/23 14:20:14 WEST] warning:disk:LOGMSG_DELETE_DISK Disk Model:WDC WD10EFRX-68PJCN0 Serial:WD-WCC4J1972758 was removed from Channel 1 of the head unit.
    [21/07/23 14:20:15 WEST] warning:disk:LOGMSG_DELETE_DISK Disk Model:WDC WD10EFRX-68PJCN0 Serial:WD-WCC4J1977774 was removed from Channel 3 of the head unit.
    [21/07/23 14:20:57 WEST] notice:system:LOGMSG_SYSTEM_HALT The system is shutting down.
    [21/07/23 14:24:36 WEST] info:system:LOGMSG_START_READYNASD ReadyNASOS background service started.
    [21/07/23 14:25:11 WEST] notice:disk:LOGMSG_ZFS_DISK_STATUS_CHANGED Disk in channel 3 (Internal) changed state from RESYNC to ONLINE.
    [21/07/23 14:26:28 WEST] notice:disk:LOGMSG_ADD_DISK Disk Model: ST4000VN008-2DR166 Serial:ZGY94KA2 was added to Channel 4 of the head unit.
    [21/07/23 14:30:43 WEST] notice:system:LOGMSG_SYSTEM_REBOOT The system is rebooting.
    [21/07/23 14:34:16 WEST] info:system:LOGMSG_START_READYNASD ReadyNASOS background service started.
    [21/07/23 14:35:25 WEST] warning:disk:LOGMSG_DELETE_DISK Disk Model: ST4000VN008-2DR166 Serial:ZGY94KA2 was removed from Channel 4 of the head unit.
    [21/07/23 14:35:31 WEST] notice:system:LOGMSG_SYSTEM_REBOOT The system is rebooting.
    [21/07/23 14:38:48 WEST] info:system:LOGMSG_START_READYNASD ReadyNASOS background service started.
    [21/07/23 14:41:24 WEST] notice:system:LOGMSG_SYSTEM_HALT The system is shutting down.
    [21/07/23 14:52:00 WEST] notice:disk:LOGMSG_SMART_ATA_ERR_30DAYS_WARN Detected increasing ATA error count: [13997] on disk 2 (Internal) [WDC WD20EFRX-68EUZN0, WD-WCC4M1808373] 11307 times in the past 30 days. This condition often indicates an impending failure. Be prepared to replace this disk to maintain data redundancy.
    [21/07/23 14:52:00 WEST] notice:disk:LOGMSG_SMART_ATA_ERR_30DAYS_WARN Detected increasing ATA error count: [13997] on disk 2 (Internal) [WDC WD20EFRX-68EUZN0, WD-WCC4M1808373] 11307 times in the past 30 days. This condition often indicates an impending failure. Be prepared to replace this disk to maintain data redundancy.
    [21/07/23 14:52:05 WEST] info:system:LOGMSG_START_READYNASD ReadyNASOS background service started.
    [21/07/23 14:53:49 WEST] warning:disk:LOGMSG_DELETE_DISK Disk Model:WDC WD20EFRX-68EUZN0 Serial:WD-WCC4M1808373 was removed from Channel 2 of the head unit.
    [21/07/23 14:53:55 WEST] notice:disk:LOGMSG_ADD_DISK Disk Model: ST4000VN008-2DR166 Serial:ZGY94XWV was added to Channel 2 of the head unit

    The RAID died when disks were being pulled during the RAID sync. So my question is: why were these disks pulled just 3 minutes after disk 1 was replaced with a new, larger disk? You added the new disk 1 at 21/07/23 13:56:20 WEST but then started to pull drives just 3 minutes later. Do you remember this, or what caused you to take that action?

    I also see that the NAS is on very old firmware. While that isn't the cause, it should be updated once you get the RAID back up and running.

    ReadyNASOS!!version=6.6.1,time=1482880160,arch=arm,descr=ReadyNASOS
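
    To give an idea of how old that is, the line can be decoded with a few lines of Python. Treat this as a sketch: the function name is made up, and it assumes the time field is a Unix timestamp in seconds for the firmware build, which I have not confirmed against any NETGEAR documentation.

```python
from datetime import datetime, timezone

def parse_firmware_line(line):
    """Split 'Name!!key=value,key=value' into a dict of fields."""
    _, _, fields = line.partition("!!")
    return dict(item.split("=", 1) for item in fields.split(","))

info = parse_firmware_line(
    "ReadyNASOS!!version=6.6.1,time=1482880160,arch=arm,descr=ReadyNASOS"
)

# Assumption: 'time' is a Unix timestamp in seconds.
built = datetime.fromtimestamp(int(info["time"]), tz=timezone.utc)
print(info["version"], built.date())  # 6.6.1 2016-12-27
```

    If that assumption holds, the firmware dates from late December 2016.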

    What is needed at this point is some delicate manual RAID assembly with all the disks that were in the NAS at 21/07/23 13:56:36 WEST. The new disk 1 added just prior isn't going to help, as the RAID sync on that disk never finished (other disks were pulled 3 minutes later and the RAID stopped working). However, the remaining disks that resided in the NAS at that time should be enough. The RAID can be assembled in degraded mode and likely saved without too much trouble, but ReclaiMe isn't going to help you here. It needs manual RAID assembly.
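
    To work out which disks those are, the add/remove events can be replayed up to that timestamp. The sketch below only reads log data and does not touch any disk. The function name and the event tuples are made up for illustration, with the values copied from the log lines earlier in this post. Bays that never show an add or remove event in the excerpt (channels 3 and 4 here) will be missing from the result and have to be filled in from what you know was installed.

```python
from datetime import datetime

def bay_state_at(events, when):
    """Replay add/remove events up to `when` and return {channel: serial}."""
    state = {}
    # Sort by timestamp only, so events with equal timestamps keep log order.
    for ts, kind, channel, serial in sorted(events, key=lambda e: e[0]):
        if ts > when:
            break
        if kind == "ADD":
            state[channel] = serial
        elif kind == "DELETE" and state.get(channel) == serial:
            del state[channel]
    return state

# (timestamp, kind, channel, serial) taken from the log lines quoted above.
events = [
    (datetime(2021, 7, 22, 20, 57, 46), "DELETE", 2, "WD-WCC4M1808373"),
    (datetime(2021, 7, 22, 20, 58, 45), "ADD", 2, "ZGY94XWV"),
    (datetime(2021, 7, 23, 13, 54, 23), "DELETE", 1, "WD-WCC4J1972758"),
    (datetime(2021, 7, 23, 13, 56, 20), "ADD", 1, "ZGY94KA2"),
]

state = bay_state_at(events, datetime(2021, 7, 23, 13, 56, 36))
for channel in sorted(state):
    print("channel", channel, "->", state[channel])
```

    In this replay channel 1 holds the new, unsynced disk and channel 2 holds the replacement that finished syncing earlier.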

    My advice to you would be to bite the bullet and pay NETGEAR Support for a data recovery contract. Let their Level 3 team try to save the RAID. As the lads have said already, please make sure to have backups of important data in the future.

    Cheers
