scarygary71
May 25, 2020 Aspirant
RN314 read only file system
I've been using a RN314 for backup purposes for a bunch of years now and all of a sudden the filesystem is read only. As this is a backup system and the primary storage is still intact as well as the...
StephenB
May 25, 2020 Guru - Experienced User
You shouldn't post log files publicly, so please unshare your log zip.
There are btrfs errors in the file system.
May 25 00:43:06 nasbackup kernel: BTRFS error (device md127): parent transid verify failed on 3710532419584 wanted 17240 found 17882
May 25 00:43:06 nasbackup kernel: BTRFS error (device md127): parent transid verify failed on 3710532419584 wanted 17240 found 17882
May 25 00:43:06 nasbackup kernel: BTRFS warning (device md127): Skipping commit of aborted transaction.
May 25 00:43:06 nasbackup kernel: BTRFS: error (device md127) in cleanup_transaction:1864: errno=-5 IO failure
May 25 00:43:06 nasbackup kernel: BTRFS info (device md127): forced readonly
May 25 00:43:06 nasbackup kernel: BTRFS: error (device md127) in btrfs_drop_snapshot:9412: errno=-5 IO failure
May 25 00:43:06 nasbackup kernel: BTRFS info (device md127): delayed_refs has NO entry
I don't know what is causing them though.
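For anyone who wants to pull these errors out of their own kernel.log, here is a minimal sketch in Python. It assumes the messages have exactly the "parent transid verify failed" wording quoted above; the function name `find_transid_errors` and the regular expression are illustrative only and are not part of any NETGEAR or btrfs tool.

```python
import re

# Assumed message format, taken from the log lines quoted in this thread.
TRANSID_RE = re.compile(
    r"BTRFS error \(device (?P<device>[^)]+)\): "
    r"parent transid verify failed on (?P<block>\d+) "
    r"wanted (?P<wanted>\d+) found (?P<found>\d+)"
)

def find_transid_errors(lines):
    """Return one dict per matching line with device, block, wanted, found."""
    results = []
    for line in lines:
        match = TRANSID_RE.search(line)
        if match:
            results.append({
                "device": match.group("device"),
                "block": int(match.group("block")),
                "wanted": int(match.group("wanted")),
                "found": int(match.group("found")),
            })
    return results

# Two lines copied from the log excerpt; only the first is a transid error.
sample = [
    "May 25 00:43:06 nasbackup kernel: BTRFS error (device md127): "
    "parent transid verify failed on 3710532419584 wanted 17240 found 17882",
    "May 25 00:43:06 nasbackup kernel: BTRFS info (device md127): forced readonly",
]
errors = find_transid_errors(sample)
for err in errors:
    print(err["device"], err["block"], err["wanted"], err["found"])
```

A "found" value that differs from "wanted" means the metadata block on disk belongs to a different transaction generation than the one the tree expected.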
I am a bit confused on the reformatting statement, as there are some much older dates in the log zip. For instance, smart_history.log begins with
2016-04-28 19:04:09 TOSHIBA DT01ACA200 ...
Did you do a factory default? Or just recreate the data volume?
scarygary71
May 25, 2020 Aspirant
I had no idea there was sensitive information in the log files. I unshared the file.
I only recreated the data volume. I might have factory reset the NAS a long time ago though. Probably years ago.
I've replaced a few drives over the years.
- Sandshark May 25, 2020 Sensei
Your NAS is having trouble communicating with one or more of the drives. Are you seeing ATA errors recorded on any of them? Note that the drive often will record an ATA error even if the problem is with the NAS, so don't jump to the conclusion a drive is bad. That takes some more testing.
Step one would be to (with power off) pull and re-seat all drives, checking the integrity and cleanliness of the connectors as you do so. If the log doesn't point to a specific drive, doing the built-in drive test might.
- scarygary71 May 25, 2020 Aspirant
No ATA errors since the last reboot. The SMART test passed on all the drives. I'll ask someone on site to reseat the drives. If that doesn't work I guess it might be time to replace the NAS altogether. It's done its job flawlessly since 2013.
- scarygary71 May 26, 2020 Aspirant
I got someone on site to shut the NAS down, pull out all the drives, clean out all the dust, and reinsert the drives. Unfortunately it only took a few minutes after powering on until the volume was back to read only. I'm waiting for a quote for a replacement NAS, but until then I'll try a factory reset as a last resort.
- StephenB May 26, 2020 Guru - Experienced User
scarygary71 wrote:
Unfortunately it only took a few minutes after powering on until the volume was back to read only.
What you did wouldn't solve the btrfs errors that are already on the data volume.
A factory reset will. Though I still recommend getting a UPS for the NAS (this one or the replacement).
- StephenB May 25, 2020 Guru - Experienced User
scarygary71 wrote:
I only recreated the data volume.
Ok, that makes sense then.
The puzzle is figuring out what might be corrupting the file system. Two disks are old (~7 years) but they all look healthy.
This bit from kernel.log doesn't look good:
Apr 06 00:59:36 nasbackup kernel: INFO: task btrfs:28575 blocked for more than 1000 seconds.
Apr 06 00:59:36 nasbackup kernel: Tainted: P O 4.4.190.x86_64.1 #1
But something else clearly happened on 17 May, starting here:
May 17 02:27:55 nasbackup systemd[1]: systemd-journald.service: Main process exited, code=killed, status=6/ABRT
May 17 02:27:55 nasbackup systemd[1]: systemd-journald.service: Unit entered failed state.
...
May 17 02:27:55 nasbackup systemd[1]: Starting Reboot...
Was this a manual reboot? The first btrfs error happened about 40 minutes later:
May 17 03:11:12 nasbackup kernel: BTRFS error (device md127): parent transid verify failed on 393904128 wanted 622897 found 625113
That is when the volume was set to read-only.
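The gap between the reboot and the first btrfs error can be checked with a short Python sketch. Syslog timestamps carry no year, so the year 2020 is an assumption taken from the date of this thread, and the helper name `parse_syslog_time` is illustrative only.

```python
from datetime import datetime

def parse_syslog_time(stamp, year=2020):
    """Parse a 'Mon DD HH:MM:SS' syslog stamp; the year is an assumption."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")

# Timestamps copied from the two log lines quoted above.
reboot = parse_syslog_time("May 17 02:27:55")
first_error = parse_syslog_time("May 17 03:11:12")

gap = first_error - reboot
print(gap)  # 0:43:17
```

That works out to a little over 43 minutes between "Starting Reboot..." and the first transid failure.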
It looks like the NAS is using volume encryption, and also that it is not protected by a UPS. Is that correct?
scarygary71 wrote:
I had no idea there was sensitive information in the log files. I unshared the file.
That is best. I can see the email address you are using for alerts. There is also some username leakage in some logs. I don't know what is in db.dump (never looked), but there could be more stuff in there.
I will delete the download when we're done with this discussion about the logs (and of course won't re-share anything).
- scarygary71 May 25, 2020 Aspirant
Not entirely sure what happened on the 17th. Maybe a patch was applied? I think the NAS hung after the upgrade and someone on site had to pull the plug.
If volume encryption is enabled, it's not by choice. I don't even know where to enable that. There's no UPS connected to the NAS, that's correct.
- StephenB May 25, 2020 Guru - Experienced User
scarygary71 wrote:
Not entirely sure what happened on the 17th. Maybe a patch was applied? I think the NAS hung after the upgrade and someone on site had to pull the plug.
The update to 6.10.3 happened on 17 March (not 17 May). There have been no hot-fixes for 6.10.3.
scarygary71 wrote:
There's no UPS connected to the NAS, that's correct.
I am wondering if there might have been a power glitch that caused the NAS to shut down and then reboot. Lost (cached) disk writes might then have resulted in a corrupt volume. The log would show the reboot, but not what caused it.
If a glitch is the result of a failing power supply in the NAS, then a UPS wouldn't help. But I think it's worth getting a UPS anyway (one that the NAS can monitor over USB, so it can do a clean shutdown if something happens with the mains power supply). A lot of lost volume stories here begin with a power issue.