RT6507
Jan 27, 2023 · Tutor
ReadyNAS 316 (6x4TB Enterprise) - No volumes after graceful shutdown and restart
I shut down my RN316 while leaving for a few days. Upon restart, I was able to connect and access it normally. However, after about 15 minutes, when I tried to save a file to the NAS, I was not able to acce...
RT6507
Jan 29, 2023 · Tutor
Your help is much appreciated! I can move the drives (one at a time) to a Windows chassis I have standing by, but it only has three available SATA connections. Can I pull and examine the drives (WD Red Pro 4.0 TB) from the RN316 individually without overwriting any needed RAID tables?
Also, I don't see any explicit BTRFS drive-failure errors for Disk 1 (sdb3). This is from Systemd-Journal.log. It implies Disk 1 (sdb3) is offline.
Jan 26 08:46:46 NAS_IV kernel: BTRFS: device label 7c6e3b76:root devid 1 transid 4240694 /dev/md0
Jan 26 08:46:46 NAS_IV systemd[1]: systemd 44 running in system mode. (+PAM +LIBWRAP +AUDIT +SELINUX +IMA +SYSVINIT +LIBCRYPTSETUP; debian)
Jan 26 08:46:46 NAS_IV systemd[1]: Set hostname to <NAS_IV>.
Jan 26 08:46:46 NAS_IV udevd[1342]: starting version 175
Jan 26 08:46:46 NAS_IV systemd-journal[1333]: Journal started
Jan 26 08:46:46 NAS_IV kernel: md: md127 stopped.
Jan 26 08:46:46 NAS_IV kernel: md: bind<sdb3>
Jan 26 08:46:46 NAS_IV kernel: md: bind<sdc3>
Jan 26 08:46:46 NAS_IV kernel: md: bind<sdd3>
Jan 26 08:46:46 NAS_IV kernel: md: bind<sde3>
Jan 26 08:46:46 NAS_IV kernel: md: bind<sdf3>
Jan 26 08:46:46 NAS_IV kernel: md: bind<sda3>
Jan 26 08:46:46 NAS_IV kernel: md: kicking non-fresh sdb3 from array!
Jan 26 08:46:46 NAS_IV kernel: md: unbind<sdb3>
Jan 26 08:46:46 NAS_IV kernel: md: export_rdev(sdb3)
Jan 26 08:46:46 NAS_IV kernel: md/raid:md127: device sda3 operational as raid disk 0
Jan 26 08:46:46 NAS_IV kernel: md/raid:md127: device sdf3 operational as raid disk 5
Jan 26 08:46:46 NAS_IV kernel: md/raid:md127: device sde3 operational as raid disk 4
Jan 26 08:46:46 NAS_IV kernel: md/raid:md127: device sdd3 operational as raid disk 3
Jan 26 08:46:46 NAS_IV kernel: md/raid:md127: device sdc3 operational as raid disk 2
Jan 26 08:46:46 NAS_IV kernel: md/raid:md127: allocated 0kB
Jan 26 08:46:46 NAS_IV kernel: md/raid:md127: raid level 5 active with 5 out of 6 devices, algorithm 2
Jan 26 08:46:46 NAS_IV kernel: RAID conf printout:
Jan 26 08:46:46 NAS_IV kernel: --- level:5 rd:6 wd:5
Jan 26 08:46:46 NAS_IV kernel: disk 0, o:1, dev:sda3
Jan 26 08:46:46 NAS_IV kernel: disk 2, o:1, dev:sdc3
Jan 26 08:46:46 NAS_IV kernel: disk 3, o:1, dev:sdd3
Jan 26 08:46:46 NAS_IV kernel: disk 4, o:1, dev:sde3
Jan 26 08:46:46 NAS_IV kernel: disk 5, o:1, dev:sdf3
Jan 26 08:46:46 NAS_IV kernel: created bitmap (30 pages) for device md127
Jan 26 08:46:46 NAS_IV kernel: md127: bitmap initialized from disk: read 2 pages, set 568 of 59543 bits
Jan 26 08:46:46 NAS_IV kernel: md127: detected capacity change from 0 to 19979093934080
Jan 26 08:46:46 NAS_IV start_raids[1325]: mdadm: /dev/md/data-0 has been started with 5 drives (out of 6).
Jan 26 08:46:46 NAS_IV kernel: Adding 2094844k swap on /dev/md1. Priority:-1 extents:1 across:2094844k
Jan 26 08:46:47 NAS_IV kernel: BTRFS: device label 7c6e3b76:data devid 1 transid 285232 /dev/md127
StephenB
Jan 29, 2023 · Guru - Experienced User
RT6507 wrote:
This is from Systemd-Journal.log. It implies Disk 1 (sdb3) is offline.
Jan 26 08:46:46 NAS_IV kernel: md: kicking non-fresh sdb3 from array!
Jan 26 08:46:46 NAS_IV kernel: md: unbind<sdb3>
Jan 26 08:46:46 NAS_IV kernel: md: export_rdev(sdb3)
Jan 26 08:46:46 NAS_IV kernel: md/raid:md127: device sda3 operational as raid disk 0
Jan 26 08:46:46 NAS_IV kernel: md/raid:md127: device sdf3 operational as raid disk 5
Jan 26 08:46:46 NAS_IV kernel: md/raid:md127: device sde3 operational as raid disk 4
Jan 26 08:46:46 NAS_IV kernel: md/raid:md127: device sdd3 operational as raid disk 3
Jan 26 08:46:46 NAS_IV kernel: md/raid:md127: device sdc3 operational as raid disk 2
Jan 26 08:46:46 NAS_IV kernel: md/raid:md127: allocated 0kB
Jan 26 08:46:46 NAS_IV kernel: md/raid:md127: raid level 5 active with 5 out of 6 devices, algorithm 2
Actually it doesn't. It says that sdb3 is out of sync with the rest of the array, so it is being removed.
Also, at this point in time the array would have been degraded, but would still have been mounted. So something must have happened after that with disk 6.
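For readers following along: a degraded-but-still-mounted state can usually be confirmed over ssh with something like the following (an illustrative sketch; the device and array names are taken from the log above):
cat /proc/mdstat            # md127 active raid5 with one member missing, e.g. [6/5]
mdadm --detail /dev/md127   # the State line should read "clean, degraded"
btrfs filesystem show       # the data filesystem should still list /dev/md127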
If you send me the log zip, I could take a look. Do that via a private message (PM) using the envelope icon in the upper right hand of the forum page. You'll need to put the zip into cloud storage (dropbox, etc), and include a sharable link in the PM.
RT6507 wrote:
Your help is much appreciated! I can move the drives (one at a time) to a Windows chassis I have standing by, but it only has three available SATA connections. Can I pull and examine the drives (WD Red Pro 4.0 TB) from the RN316 individually without overwriting any needed RAID tables?
You'd need to power down the NAS, and then test the two disks. Leave the NAS powered down until you return them to their proper slots in the NAS.
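One note on testing the pulled drives safely: read-only diagnostics (SMART queries and self-tests) don't write to the disk, so the RAID superblocks stay intact as long as you don't initialize, partition, or format the drives in Windows. If the test machine can boot Linux instead, a sketch using smartmontools (assuming it is installed; replace sdX with the actual device letter):
smartctl -x /dev/sdX            # read-only dump of SMART health, attributes, and error log
smartctl -t long /dev/sdX       # start an extended self-test (a full surface read scan)
smartctl -l selftest /dev/sdX   # check the self-test result once it completes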
- RT6507 · Jan 30, 2023 · Tutor
I sent the link for the log files ZIP. You should also see the WD Dashboard screen shots for Disk 1 (sdb) and Disk 5 (sdf). Let me know if they aren't visible.
- StephenB · Jan 30, 2023 · Guru - Experienced User
RT6507 wrote:
I sent the link for the log files ZIP. You should also see the WD Dashboard screen shots for Disk 1 (sdb) and Disk 5 (sdf). Let me know if they aren't visible.
I am able to access the files.
You are running rather old firmware (6.5.1); it would be good to update that after the issue is resolved.
Piecing together the history from a couple of logs shows this:
Dec 22 09:55:20 NAS_IV readynasd[2914]: Volume data health changed from Redundant to Degraded.
Jan 23 12:18:10 NAS_IV readynasd[2914]: The system is shutting down.
Jan 26 08:46:46 NAS_IV kernel: md: kicking non-fresh sdb3 from array!
Jan 26 08:46:46 NAS_IV start_raids[1325]: mdadm: /dev/md/data-0 has been started with 5 drives (out of 6).
Jan 26 08:48:05 NAS_IV mdadm[2296]: DegradedArray event detected on md device /dev/md127
Jan 26 08:49:04 NAS_IV mdadm[2296]: RebuildStarted event detected on md device /dev/md127, component device recovery
Jan 26 08:55:22 NAS_IV mdadm[2296]: Rebuild95 event detected on md device /dev/md127, component device recovery
Jan 26 08:55:23 NAS_IV kernel: md: md127: recovery interrupted.
Jan 26 08:55:23 NAS_IV mdadm[2296]: Fail event detected on md device /dev/md127, component device /dev/sdf3
Jan 26 08:55:24 NAS_IV mdadm[2296]: RebuildFinished event detected on md device /dev/md127, component device recovery
Jan 26 08:55:34 NAS_IV readynasd[3000]: Disk in channel 6 (Internal) changed state from ONLINE to FAILED.
Jan 26 09:51:38 NAS_IV readynasd[3000]: The system is shutting down.
Jan 26 09:54:23 NAS_IV kernel: md: kicking non-fresh sdf3 from array!
Jan 26 09:54:23 NAS_IV kernel: md: md127 stopped.
Jan 26 09:54:23 NAS_IV start_raids[1323]: mdadm: NOT forcing event count in /dev/sdf3(5) from 16702 up to 16711
Jan 26 09:54:23 NAS_IV start_raids[1323]: mdadm: You can use --really-force to do that (DANGEROUS)
Jan 26 09:54:23 NAS_IV start_raids[1323]: mdadm: failed to RUN_ARRAY /dev/md/data-0: Input/output error
Jan 26 09:54:23 NAS_IV start_raids[1323]: mdadm: Not enough devices to start the array.
Jan 26 10:13:18 NAS_IV readynasd[2744]: The system is rebooting.
As you can see, the array became degraded in December. That message is repeated every day at 1 am until 23 January, when the system was shut down. I can't tell which disk caused the degradation event (the detailed logs don't go back that far), but my guess would be disk 2.
When you booted on 26 January, it looks like disk 2 came on-line. The system tried to resync disk 2, but then ran into an error in disk 6 - which caused the resync to fail.
The system was rebooted on 26 January, about 9:54. At that point, both disk 2 and disk 6 were out of sync, so the array failed.
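As an aside, for anyone reconstructing a similar timeline from their own log zip: the relevant events can be pulled out with an ordinary grep. A sketch, assuming the journal export is named Systemd-Journal.log as in this thread:
grep -E 'kicking non-fresh|Degraded|Rebuild|Fail event|shutting down|rebooting' Systemd-Journal.log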
The disk_info.log shows 1 current pending sector for sdb, and 2 for sdf.
Device: sdb
Controller: 0  Channel: 1
Model: WDC WD4001FFSX-68JNUN0  Firmware: 81.00A81
Class: SATA  RPM: 7200  Sectors: 7814037168
Pool: data-0  PoolType: RAID 5  PoolState: 5  PoolHostId: 7c6e3b76
Health Data:
  ATA Error Count: 0
  Reallocated Sectors: 0
  Reallocation Events: 0
  Spin Retry Count: 0
  Current Pending Sector Count: 1
  Uncorrectable Sector Count: 0
  Temperature: 45
  Start/Stop Count: 92
  Power-On Hours: 46365
  Power Cycle Count: 92
  Load Cycle Count: 39
Device: sdf
Controller: 0  Channel: 5
Model: WDC WD4001FFSX-68JNUN0  Firmware: 81.00A81
Class: SATA  RPM: 7200  Sectors: 7814037168
Pool: data-0  PoolType: RAID 5  PoolState: 5  PoolHostId: 7c6e3b76
Health Data:
  ATA Error Count: 0
  Reallocated Sectors: 0
  Reallocation Events: 0
  Spin Retry Count: 0
  Current Pending Sector Count: 2
  Uncorrectable Sector Count: 0
  Temperature: 42
  Start/Stop Count: 79
  Power-On Hours: 45281
  Power Cycle Count: 79
  Load Cycle Count: 37
As far as the error codes in Dashboard go, WDC says:
The previous self-test completed having the read element of the test failed. Retest after checking the connections. Replace the drive if the error repeats.
I think it does make sense to double-check the connections, and try the test again.
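Some background on that attribute: a current pending sector is one the drive failed to read and has flagged for re-evaluation; it is cleared (or remapped) the next time it is written. If you want to watch these counters while retesting, a sketch assuming smartmontools:
smartctl -A /dev/sdb | grep -E 'Current_Pending|Reallocated'   # pending and remapped sector counts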
As far as recovery goes, despite the "DANGEROUS" comment on the --really-force line above, that option is probably your best path to in-place recovery. Disk 2 appears to have been off-line for a very long time, so you'd want to use that option with disk 2 removed. Assuming the volume assembles and mounts, you'd then want to make a full backup before doing anything else.
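Before forcing anything, it is worth confirming which members agree with each other: the per-disk event counters show who missed writes. A one-line sketch (run over ssh or from tech support mode; device letters as on this NAS):
mdadm --examine /dev/sd[a-f]3 | grep -E '^/dev|Events'
Members whose Events values match are mutually consistent; a lower count means that disk dropped out earlier and missed later writes.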
After getting the array healthy, I'd recommend upgrading the firmware, and also scheduling the system maintenance tasks. Fixing the problem with email alerts would also be a good idea - if they were working, you'd have received the alert on 22 December when the volume first became degraded.
Let me know if you want to go down this path - we can help with the needed commands. Other options would be to use RAID recovery software in the Windows PC (ReclaiMe or other software that supports BTRFS), or to contract with a data recovery service. Those are more expensive, but would probably have somewhat less risk to your data.
- RT6507 · Jan 30, 2023 · Tutor
OK, I'd like to try to fix this based on your help with the commands. Will I still be able to resort to ReclaiMe or a data recovery service if I'm not able to remount the volume?
- StephenB · Jan 30, 2023 · Guru - Experienced User
RT6507 wrote:
Will I still be able to resort to ReclaiMe or a data recovery service if I'm not able to remount the volume?
The more you do on your own, the more difficult recovery can become. But you should still be able to do it.
What you'd need to do is power up the NAS with disk 2 removed. Enable ssh on the NAS, and log in with ssh from a PC. The username is "root", the password is the NAS admin password. (From Windows 10 or 11, you'd enter ssh root@nas-ip-address in the Windows search bar - using the real NAS IP address, of course.)
From there, you would enter
mdadm --assemble --really-force /dev/sdf3
If that works, you can then manually mount the array with
btrfs device scan
mount /dev/md127 /data
If you run into problems with these commands, just post back.
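If both steps succeed, a quick sanity check before trusting the volume (illustrative):
cat /proc/mdstat   # md127 should show as active raid5, degraded with 5 of 6 members
df -h /data        # confirms the data volume is mounted and shows its usage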
- RT6507 · Jan 30, 2023 · Tutor
While trying to enable SSH I get "Service Operation Failed - cannot start service without volume"
- StephenB · Jan 30, 2023 · Guru - Experienced User
RT6507 wrote:
While trying to enable SSH I get "Service Operation Failed - cannot start service without volume"
Annoying.
You can boot up the system in tech support mode, and access it with telnet. Instructions are on pages 81-82 here:
The username is root, the password is infr8ntdebug.
Once in, you start the RAID and chroot with this command:
rnutil chroot
Then you can use the instructions above.
Once the array mounts, you should be able to reboot and still see the volume.
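Putting the whole tech-support-mode session together (a sketch that just combines the commands from the posts above, nothing new):
rnutil chroot                               # start the RAID arrays and chroot into the installed OS
mdadm --assemble --really-force /dev/sdf3   # the forced assemble suggested earlier
btrfs device scan
mount /dev/md127 /data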
- RT6507 · Jan 31, 2023 · Tutor
I will attempt this later today when I have a clear slate.
- RT6507 · Jan 31, 2023 · Tutor
OK, I'm telnetted in but mdadm error says: "/dev/sdf3 not identified in config file."
I've uploaded a screen shot of the Telnet session to cloud.
- StephenB · Feb 01, 2023 · Guru - Experienced User
RT6507 wrote:
OK, I'm telnetted in but mdadm error says: "/dev/sdf3 not identified in config file."
I've uploaded a screen shot of the Telnet session to cloud.
Try adding --scan
mdadm --assemble --scan --really-force /dev/sdf3
If that doesn't help, try
mdadm --examine /dev/sdf*
- RT6507 · Feb 01, 2023 · Tutor
The first command (--assemble) did not succeed.
The second command (--examine) returned a list of partitions but only one that referenced /dev/sdf1. I tried "mdadm --assemble --really-force /dev/sdf1" but got the mdadm error: "device /dev/sdf1 exists but is not an md array."
- RT6507 · Feb 01, 2023 · Tutor
Screenshot on cloud.
- StephenB · Feb 01, 2023 · Guru - Experienced User
RT6507 wrote:
The first command (--assemble) did not succeed.
The second command (--examine) returned a list of partitions but only one that referenced /dev/sdf1. I tried "mdadm --assemble --really-force /dev/sdf1" but got the mdadm error: "device /dev/sdf1 exists but is not an md array."
Not a good sign. sdf3 was identified as part of the array in the logs, but that was 26 January. sdf1 would normally be part of the OS partition - so also part of an md array (just not the data volume).
You could try powering down, and putting sdb back into the NAS. Then go back into tech support mode, and try
mdadm --examine /dev/sdb3
Note the event counter (if it gives you one). Then also run
mdadm --examine /dev/sda3
and see how the event counter compares.
- RT6507 · Feb 01, 2023 · Tutor
Replaced Disk 1 (sdb3) and restarted in Tech Support mode. I see six blue LEDs alongside the drive caddies. Ran mdadm --examine on sdb3 and sda3 but did not see any event counters. I also ran mdadm --examine on sdf3 and saw the same results.
If I didn't know better I'd think the drives were fine.
Screen shots on cloud.
- StephenB · Feb 01, 2023 · Guru - Experienced User
RT6507 wrote:
Ran mdadm --examine on sdb3 and sda3 but did not see any event counters.
/dev/sda3: Events: 16711
/dev/sdb3: Events: 16711
/dev/sdf3: Events: 16702
This is a bit odd, since last time mdadm told you sdf3 wasn't in an array.
A bit more odd is that disk 2 appears to be in sync (its event counter matches sda3's). Note that sdf3 is 9 events behind, which matches the "NOT forcing event count in /dev/sdf3(5) from 16702 up to 16711" message from the 26 January boot log above.
I think the next step is to power down, remove sdf, and power up normally, and see what happens. If the volume does mount, then back up the data before doing anything else.
- RT6507 · Feb 01, 2023 · Tutor
No luck. Can't access old Windows Shares. WebUI shows all 5 drives healthy but "no volumes exist". I uploaded a screen shot of WebUI status.
- AnishaA · Feb 02, 2023 · NETGEAR Employee Retired
Hello RT6507
Please check the volume logs and see whether md127 is mounted.
If md127 is not mounted, please run the commands below to check and try to assemble the volume:
df -h
cat/ proc/partitions
mdadm -A --scan
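For readers following along, roughly what these commands check (annotated sketch; note that the second command as posted contains a stray slash - the intended form is cat /proc/partitions, which explains the "-sh: cat/: not found" error reported below):
df -h                  # is any volume currently mounted?
cat /proc/partitions   # do all member partitions (sda3..sdf3) show up?
mdadm -A --scan        # try to assemble every array found in the superblocks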
We offer a paid data recovery service.
Note: The data recovery service investigates whether the data can be recovered. We do not promise to recover the data; the fee covers the check and investigation.
Have a lovely day,
Anisha A
Netgear Team
- RT6507 · Feb 02, 2023 · Tutor
Uploaded System-log zip to cloud storage.
I don't see a mention of md127 in the Volume.log. Should I issue the mdadm and mount commands from Tech Support mode, since the Telnet session seems to be refused when NAS_IV is in a standard bootup?
- RT6507 · Feb 02, 2023 · Tutor
OK, I ran df -h and the other commands from the Telnet session in Tech Support mode.
The 2nd command returned the error: "-sh: cat/: not found"
mdadm: "/dev/md/data-0 assembled from 4 drives and 1 spare - not enough to start the array."
Screenshot uploaded.
What are the terms for data recovery from you?
- RT6507 · Feb 02, 2023 · Tutor
Also, I tried these commands with all six drives installed and got slightly different messages. See the uploaded screen shot.
- RT6507 · Feb 03, 2023 · Tutor
Hi StephenB.
I wonder if you saw my latest logs and screen shots uploaded to Box? Do you see any hope of continuing to restore my RN316 to normal operations, or should I shift to a data recovery focus? Either way, thank you for your insight and patience in trying to help me resolve this problem. Your instructions have been clear and concise. I'm grateful for your help. If it's time to admit defeat, I'll know we tried our best.
- RT6507 · Feb 04, 2023 · Tutor
Anisha A
Can you provide more information on Netgear data recovery service, such as costs and what you would need to access my NAS? Thanks.
- AnishaA · Feb 05, 2023 · NETGEAR Employee Retired
Hello RT6507,
The basic fee for the data recovery service would be around 200 USD, and we would need SDM to do the data recovery process.
Note: The data recovery service investigates whether the data can be recovered. We do not promise to recover the data; the fee covers the check and investigation.
Have a lovely day,
Anisha A
Netgear Team
- RT6507 · Feb 06, 2023 · Tutor
OK. What is "SDM"?