ziemowit
Jul 26, 2024 Aspirant
BTRFS read-only state, OS 6.8.10, ReadyNAS Ultra 4
Hello, everybody! It seems like I have met the same problem as a few other people here, namely BTRFS becoming read-only. Background Machine is Ultra 4, with 2GB memory, converted to run OS6. Capac...
StephenB
Jul 27, 2024 Guru - Experienced User
I agree with Sandshark that you should back up the NAS before doing anything else.
Also, I agree that rebooting a NAS with a read-only volume is dangerous. You are lucky that it remounted.
ziemowit wrote:
I have 4 x 2TB disks in my ReadyNAS and all but one are of the same type (3 x ST32000542AS + 1 x ST2000DL003-9VT166). All have same rotational speed.
When the time comes to replace disks, I suggest you go with Seagate Ironwolf or WD Red Plus. Avoid the plain WD Red models (without "Plus"), as they are SMR. Most desktop drives in the 2-6 TB range are also SMR.
ziemowit wrote:
Questions
- I want to understand what is going on. Which boils down to:
- what logs should I read?
- what am I looking for in the logs?
Generally I start with dmesg. Look for disk and btrfs errors, also look at the mdadm commands.
Looking at the bottom of volume.log can help you sort out whether you have a full OS partition (md0).
readynasd.log and status.log are often useful in sorting out the history of degraded volumes.
systemd-journal.log, system.log, kernel.log and mdstat.log often have useful clues as well.
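As a rough aid for that first pass through the logs, here is a minimal Python sketch that scans every *.log file in an unpacked log zip for lines that mention a storage subsystem together with a problem word. The two keyword lists and the function names are my own assumptions, not an official set; expect both false positives and misses, and adjust them to what your logs actually contain.

```python
import re
import sys
from pathlib import Path

# Assumed keyword lists (not an official set): adjust to your own logs.
SUBSYSTEM = re.compile(r"\b(btrfs|ata\d+|md\d+|sd[a-z]+)\b", re.IGNORECASE)
PROBLEM = re.compile(r"error|fail|read-only|corrupt|timeout", re.IGNORECASE)


def suspect_lines(text):
    """Return (line_number, line) for lines naming a storage subsystem and a problem word."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if SUBSYSTEM.search(line) and PROBLEM.search(line):
            hits.append((number, line.strip()))
    return hits


def scan_directory(log_dir):
    """Scan every *.log file in an unpacked log zip; map file name -> list of hits."""
    results = {}
    for path in sorted(Path(log_dir).glob("*.log")):
        hits = suspect_lines(path.read_text(errors="replace"))
        if hits:
            results[path.name] = hits
    return results


def main(argv):
    """Print the first few suspect lines per log file for the directory in argv[0]."""
    if not argv:
        print("usage: scan_logs.py <unpacked-log-zip-directory>")
        return
    for name, hits in scan_directory(argv[0]).items():
        print(f"== {name}: {len(hits)} suspect line(s)")
        for number, line in hits[:20]:
            print(f"{number}: {line}")


if __name__ == "__main__":
    main(sys.argv[1:])
```

This only narrows down where to look; the surrounding lines in the log still need to be read in context.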
Of course you also need to figure out what to do about what you are seeing.
If you like, you can upload the log zip to cloud storage (Dropbox, Google Drive, etc.) and send me a PM (private message) with a link. Make sure the link permissions are set so anyone can download. You'd send a PM using the envelope icon in the upper right of the forum page.
It might take a while (over a week) for me to get back to you, as I'm going on vacation soon and won't have much time to spend on this.
ziemowit wrote:
I want to prevent future problems with similar issues.
If you have no backup plan in place for your NAS, then that is the place to begin. RAID isn't enough to keep your data safe. The best way to do that is to have at least one copy on another device (and ideally one copy off-site).
One thing to do is set up a maintenance schedule on the volume settings page. I run one of the four functions (scrub, balance, disk test, defrag) every month. The scrub and the disk test will access every sector of every disk, so both can give early indications of disk issues. I space those tests two months apart (filling the gaps with the balance and defrag).
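To make the rotation concrete, here is a small Python sketch that lays out one task per month so that the scrub and the disk test land two months apart. It is only a planning aid under my own assumptions (the function name and the starting month are made up); the actual schedule is still entered on the volume settings page.

```python
from itertools import cycle, islice

# Order chosen so that scrub and disk test are two months apart,
# with balance and defrag filling the gaps.
ROTATION = ("scrub", "balance", "disk test", "defrag")


def yearly_plan(start_month=1):
    """Map each month number (1-12) to the maintenance task to run that month."""
    tasks = islice(cycle(ROTATION), 12)
    months = [((start_month - 1 + i) % 12) + 1 for i in range(12)]
    return dict(zip(months, tasks))


if __name__ == "__main__":
    for month, task in sorted(yearly_plan().items()):
        print(f"month {month:2d}: {task}")
```

With this rotation each function runs three times a year, and some test that touches every sector runs every second month.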
Don't rely on the web UI to give accurate information on disk health. Often there are clear issues in the log zip that don't show up in the web UI. So it is wise to look at the log zip from time to time - particularly after each scrub and disk test completes.
Make sure you maintain at least 15% free space on the volume. If you use snapshots, I also suggest turning off the smart snapshots and switching to custom settings. Set an explicit retention period, and also configure them to only make snapshots when there are changes. This can help manage the amount of space the snapshots take up.
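If you have shell access, the 15% guideline can be checked with a few lines of Python. This is a minimal sketch: the mount point "/data" is an assumption about where the volume is mounted, and on BTRFS the free-space figure reported by the OS is only approximate, so treat the result as a rough warning rather than an exact number.

```python
import shutil

MIN_FREE_FRACTION = 0.15  # the 15% guideline from the post


def free_fraction(total_bytes, free_bytes):
    """Return free space as a fraction of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def volume_has_headroom(path, minimum=MIN_FREE_FRACTION):
    """Return True if the filesystem holding `path` has at least `minimum` free."""
    usage = shutil.disk_usage(path)
    return free_fraction(usage.total, usage.free) >= minimum


if __name__ == "__main__":
    volume = "/data"  # assumed mount point; change to match your NAS
    try:
        print(volume, "ok" if volume_has_headroom(volume) else "below 15% free")
    except OSError as exc:
        print(f"could not check {volume}: {exc}")
```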
I also suggest getting a UPS for the NAS. That will ensure it shuts down cleanly when there is a power outage. A lot of volume failures involve unclean shutdowns.
If you run the antivirus software, the file search feature, or have apps installed on your NAS, then you should also consider upgrading the RAM to 4 GB. Generally removing apps and disabling services you don't need is a good idea - improving both performance and stability. Antivirus and file search in particular use a lot of system resources.
ziemowit
Aug 27, 2024 Aspirant
Thank you, StephenB !
I hope you have had a nice trip... My reply is delayed for several reasons, one of them being reading logs and trying to understand stuff on my own. I have 3 NASes that have misbehaved from time to time, sometimes displaying filp_close+9 on the LCD just before hanging; sometimes it was some other text. A long time ago I put those NASes on a "nightly reboot", in the hope that the problems would diminish if the NAS did a fresh restart every night at 3 AM (via Power Management in the GUI). I also searched the logs for the error messages displayed, and I actually located one of them there. I also saw that I possibly had a disk problem not reflected in the error count on the disk: one of the logs reported repeated problems re-allocating a number of blocks. Finding disks on eBay in reasonable condition and at a reasonable price, as well as reinstalling the firmware (reverting to 6.10.8, as I want my shell-in-a-box installed), restoring backups and resyncing took some time. But then it was fixed.
So I thought.
Boy, I was wrong! This very evening I could not connect to one of the NASes and decided to do a port scan. It showed that two of the machines were having issues: ports 80, 22 and 443, as well as 21, were gone, although something was still listening on ports 25, 110, 119, 143, 465 and others. I then went to the machines and briefly pressed the backup button to see what they would say. The one called NAS5 displayed gibberish on the LCD. The one called NAS6 displayed understandable text about the backup button being pressed. Oh, well - time to press the PWR button twice to take them down in an orderly manner. NAS6 went down in a few minutes; NAS5 just did nothing, so I had to use a long press on PWR to cut power the brutal way. After the reboot NAS6 came up without problems, but NAS5 started resyncing.
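For anyone who wants to repeat that kind of check without a full port scanner, here is a minimal Python sketch that tests whether the management ports mentioned above accept TCP connections. The host name "nas5.local" and the function names are hypothetical; the one-second timeout is an arbitrary choice.

```python
import socket

# Ports named in the post: FTP (21), SSH (22), web UI (80 and 443).
MANAGEMENT_PORTS = (21, 22, 80, 443)


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or name lookup failed
        return False


def check_ports(host, ports=MANAGEMENT_PORTS, timeout=1.0):
    """Map each port to True (accepting connections) or False."""
    return {port: port_is_open(host, port, timeout) for port in ports}


if __name__ == "__main__":
    host = "nas5.local"  # hypothetical host name; use your NAS's address
    for port, is_open in check_ports(host).items():
        print(f"{host}:{port} {'open' if is_open else 'closed or unreachable'}")
```

Run from a scheduled job on another machine, something like this could give an early warning that the web UI and SSH have stopped answering while other services are still up.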
After the reboot I took a dive into the logs. Nothing unusual in the NAS5 logs, although it had been inaccessible and displayed gibberish on the LCD. In the logs of NAS6 (kernel.log and system.log) I could see that it reported error codes from curl all the time after 7 PM. msmtp.log reported that NAS6 was not able to locate smtp-mail.outlook.com. So somehow I wonder. Do I have faulty hardware? Do I have other problems that I have no clue about? StephenB will you be able and kind to look at my logs? I will send you a PM with a link.
- Sandshark Aug 28, 2024 Sensei
You may have a hardware issue, but it's a bit early to say you do. The Ethernet and power on/off circuit in your NAS are powered by a separate voltage from the PSU labeled "+5VSB", for +5 Volts Standby. That voltage stays on even when the NAS is "off". And because of that, it's often the first to go, and if the NAS is in a hot environment, that can be worse because the fan is off when the NAS is "off". This also most often happens on units that are powered on and off routinely, which sounds like what your NAS sees. Some of your issues do sound like they could be related to that. The "error message" you see at power-down may not be that -- it may be something you are never supposed to see because the unit shuts down. But it could also be a software error and the last thing executed before a crash prevented reaching power-off.
Are you also sometimes seeing issues where it won't power up, by schedule, power button, or WoL? If so, that points to a PSU issue even more.
- ziemowit Aug 28, 2024 Aspirant
Hello, thanks for your input. I'll try to answer as well as I can.
The thought it could be a hardware issue occurred to me as well - why else would one of the machines sometimes disappear from the network or hang, and the other one just hang? Are PSU or hardware issues a common problem with those machines? They are not standing in an especially warm environment; it's regular room temperature. The fans are in "quiet" mode, just because the machines are close to my desk. The problems are not related to time of year (i.e. ambient temperature) from what I can remember.
Maybe I had bad luck when buying my Ultra 4 / Pro 4 on eBay, but I actually cannot recall having those problems when they were running the old (RAIDiator?) 4.x firmware. Only when I changed to OS6 did I get those issues. On the other hand, I used the very same image file to install OS6 on an Ultra 6 and a Pro 6 and they were rock solid. So the OS6 image should be correct.
As far as power off/on goes: I started with the off/on routine because of the problems I experienced with my non-responsive NASes. I thought the random non-responsiveness was caused by some software problem (e.g. memory leak or similar) - and such problems are often fixed by a reboot, at least in the Windows world. Hence the power off/on routine was initiated because of the problems, not the other way round. And yes, when you mention something crashing that prevents the NAS from reaching power down, I have seen that manifested in a very special way:
- NAS starting to shut down before scheduled power off
- NAS trying to send email about shutdown
- Ethernet not there, so logging an error
- In the morning I see a NAS dead but with power on
- Hard power off via button and subsequent power on
- Only then I get the email about NAS shutting down.
As far as POWER UP is concerned, no issues whatsoever. Not a single time. When it powers off on schedule then it always powers up on schedule. The same with manual power off / on. Never tried WoL, as I do not have that need.
Any other ideas that I should dig into?
- Sandshark Aug 29, 2024 Sensei
The power supplies used in 4-bay legacy ReadyNAS have a modified connector pin-out, but are otherwise just standard supplies for a small-profile desktop computer. As such, they have a limited life -- and few keep a computer as long as they do a NAS. So I would not say they have "problems", but they are a common first casualty due to age.
Have you added RAM to the unit? While most don't have an issue with just the base 1GB until OS 6.10.x, it can be an issue, especially if you run apps.
Replacement supplies are available on eBay, or you could make an adapter cable and use an external ATX supply long enough to see if the supply is the issue. Given the age of both the hardware and firmware, you do need to consider whether it's worthwhile investing any more into a legacy ReadyNAS.
- StephenB Aug 29, 2024 Guru - Experienced User
As an aside, if you don't have IPv6 enabled in your router, then I recommend turning it off in your NAS as well.
ziemowit wrote:
StephenB will you be able and kind to look at my logs? I will send you a PM with a link.
Gibberish could be a corrupted VPD. But you are using SMR drives in NAS5, which is a very bad idea. On top of that, one of them is failing (181 reallocated sectors so far). It is still trying to sync, and does not appear to have gotten to disk 4 yet. If you have no backup, then leave it running until it resyncs. But (assuming ok NAS hardware), you should replace these disks with CMR models - either Seagate Ironwolf or WD Red Plus.
NAS6 is only using the two Hitachi drives. For some reason the two WD30EFRX drives are global spares. I suggest selecting one of those drives, formatting it in the NAS, and seeing if the system adds it to the array.
There is another unusual error on NAS6 -
Aug 27 20:15:16 NAS6 wsdd2[15607]: ======= Backtrace: =========
Aug 27 20:15:16 NAS6 wsdd2[15607]: /lib/x86_64-linux-gnu/libc.so.6(+0x731af)[0x7f07268cd1af]
Aug 27 20:15:16 NAS6 wsdd2[15607]: /lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7f0726952aa7]
Aug 27 20:15:16 NAS6 wsdd2[15607]: /lib/x86_64-linux-gnu/libc.so.6(+0xf6cc0)[0x7f0726950cc0]
Aug 27 20:15:16 NAS6 wsdd2[15607]: /usr/sbin/wsdd2[0x405c48]
Aug 27 20:15:16 NAS6 wsdd2[15607]: /usr/sbin/wsdd2[0x4023e6]
Aug 27 20:15:16 NAS6 wsdd2[15607]: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f072687bb45]
Aug 27 20:15:16 NAS6 wsdd2[15607]: /usr/sbin/wsdd2[0x4027de]
Aug 27 20:15:16 NAS6 wsdd2[15607]: ======= Memory map: ========
Aug 27 20:15:16 NAS6 wsdd2[15607]: 00400000-0040a000 r-xp 00000000 00:13 41395 /usr/sbin/wsdd2
Aug 27 20:15:16 NAS6 wsdd2[15607]: 00609000-0060a000 r--p 00009000 00:13 41395 /usr/sbin/wsdd2
Aug 27 20:15:16 NAS6 wsdd2[15607]: 0060a000-0060b000 rw-p 0000a000 00:13 41395 /usr/sbin/wsdd2
Aug 27 20:15:16 NAS6 wsdd2[15607]: 0199d000-019be000 rw-p 00000000 00:00 0 [heap]
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726438000-7f072644e000 r-xp 00000000 00:13 40386 /lib/x86_64-linux-gnu/libgcc_s.so.1
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f072644e000-7f072664d000 ---p 00016000 00:13 40386 /lib/x86_64-linux-gnu/libgcc_s.so.1
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f072664d000-7f072664e000 rw-p 00015000 00:13 40386 /lib/x86_64-linux-gnu/libgcc_s.so.1
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f072664e000-7f0726659000 r-xp 00000000 00:13 40366 /lib/x86_64-linux-gnu/libnss_files-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726659000-7f0726858000 ---p 0000b000 00:13 40366 /lib/x86_64-linux-gnu/libnss_files-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726858000-7f0726859000 r--p 0000a000 00:13 40366 /lib/x86_64-linux-gnu/libnss_files-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726859000-7f072685a000 rw-p 0000b000 00:13 40366 /lib/x86_64-linux-gnu/libnss_files-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f072685a000-7f07269fb000 r-xp 00000000 00:13 40357 /lib/x86_64-linux-gnu/libc-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f07269fb000-7f0726bfb000 ---p 001a1000 00:13 40357 /lib/x86_64-linux-gnu/libc-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726bfb000-7f0726bff000 r--p 001a1000 00:13 40357 /lib/x86_64-linux-gnu/libc-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726bff000-7f0726c01000 rw-p 001a5000 00:13 40357 /lib/x86_64-linux-gnu/libc-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726c01000-7f0726c05000 rw-p 00000000 00:00 0
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726c05000-7f0726c26000 r-xp 00000000 00:13 40354 /lib/x86_64-linux-gnu/ld-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726e19000-7f0726e1c000 rw-p 00000000 00:00 0
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726e22000-7f0726e25000 rw-p 00000000 00:00 0
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726e25000-7f0726e26000 r--p 00020000 00:13 40354 /lib/x86_64-linux-gnu/ld-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726e26000-7f0726e27000 rw-p 00021000 00:13 40354 /lib/x86_64-linux-gnu/ld-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726e27000-7f0726e28000 rw-p 00000000 00:00 0
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7ffe509ee000-7ffe50a0f000 rw-p 00000000 00:00 0 [stack]
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7ffe50afe000-7ffe50b00000 r--p 00000000 00:00 0 [vvar]
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7ffe50b00000-7ffe50b02000 r-xp 00000000 00:00 0 [vdso]
Aug 27 20:15:16 NAS6 wsdd2[15607]: ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aug 27 20:15:16 NAS6 systemd[1]: wsdd2.service: Main process exited, code=killed, status=6/ABRT
Aug 27 20:15:16 NAS6 systemd[1]: wsdd2.service: Unit entered failed state.
Aug 27 20:15:16 NAS6 systemd[1]: wsdd2.service: Failed with result 'signal'.
Aug 27 20:15:17 NAS6 systemd[1]: wsdd2.service: Service hold-off time over, scheduling restart.
If you have a backup, you might also try doing a factory default with all disks in place, and set up the NAS again.
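To see how often a service dies like wsdd2 does in the excerpt above, a short Python sketch can count the systemd "Main process exited" lines per unit and exit status. The regular expression is modelled only on the lines quoted in this post, so it is an assumption that other systemd versions word the message the same way; the function name and the file name in the usage part are hypothetical.

```python
import re
from collections import Counter

# Pattern modelled on the systemd lines quoted in this thread (an assumption
# for other systemd versions, which may word the message differently).
EXIT_LINE = re.compile(
    r"systemd\[\d+\]: (?P<unit>\S+\.service): Main process exited, "
    r"code=(?P<code>\w+), status=(?P<status>\S+)"
)


def count_service_crashes(log_text):
    """Count 'Main process exited' events per (unit, status) pair."""
    counts = Counter()
    for match in EXIT_LINE.finditer(log_text):
        counts[(match["unit"], match["status"])] += 1
    return counts


if __name__ == "__main__":
    log_path = "systemd-journal.log"  # hypothetical path inside the unpacked log zip
    try:
        with open(log_path, errors="replace") as handle:
            text = handle.read()
    except OSError as exc:
        print(f"could not read {log_path}: {exc}")
    else:
        for (unit, status), count in count_service_crashes(text).most_common():
            print(f"{unit} exited with status {status}: {count} time(s)")
```

A single abort is a curiosity; the same unit aborting again after every restart would suggest a persistent problem worth chasing.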