Hello, everybody!

It seems like I have met the same problem as a few other people here, namely BTRFS becoming read-only.

Background

Machine is Ultra 4, with 2GB memory, converted to run OS6. Capacity 4x2TB, RAID5.

I found my machine "misbehaving" this evening. I have had some work in progress during the day with reading and writing files to the network share. But when I restarted my work in the evening I still could read my files, but no longer write anything.

I went to the machine and pressed Backup button, nothing strange here. Backups started.
Tried to reboot machine, but unfortunately the "hard" way, i.e. Power button.
Machine came up with volumes mounted read-only.
Checked ATA errors on the disks, zero on all of them, supposedly healthy.
Tried to find a fix on the community pages, no luck.
Logs downloaded, backup done. (Backups running 2-7 times every week to another ReadyNAS or a USB HD.)

The problem occurred to me before, but on another machine (RN104) that had a failing disk. Because of the failing disk I did not think the problem was something to bother about, but this time it happened to me on a machine with disks that did not exhibit any problems.

Questions

I want to understand what is going on. Which boils down to:

what logs should I read?

what am I looking for in the logs?

I want to prevent future problems with similar issues. Which leads me to questions like:

are the Ultras sensitive to what disks I use? I have 4 x 2TB disks in my ReadyNAS and all but one are of the same type (3 x ST32000542AS + 1 x ST2000DL003-9VT166). All have same rotational speed.

I have had problems with my Ultra 4 / Pro 4 converted to OS 6.8.10; they hang sometimes with a message on the display, some of the messages have been re-occurring. Because of that I decided to reboot all of those every night or every couple of nights. I know it is not a "solution" at all, but otherwise I would have them hanging at random times 1-4 times every month.

Any suggestions welcome.

tijgert wrote:

I was trying to avoid asking a similar question in a new thread. Doing that twice actually created the clutter I was trying to avoid...

But when it turns out the symptoms are similar but the root cause is not, then the advice for each person diverges, and it can get messy and confusing. It's great that you first searched for a solution. Too many don't and post a question that's been answered a dozen or more times. But when you don't find a solution, it's usually best to make a new post, especially if the original post is still active.

tijgert, I answered your question in the other thread where you posted it. Note that it's best to open a new thread so that advice to two different users doesn't get confused, and posting the same query twice is certainly unnecessary.

The power supplies used in 4-bay legacy ReadyNAS have a modified connector pin-out, but are otherwise just standard supplies for a small-profile desktop computer. As such, they have a limited life -- and few keep a computer as long as they do a NAS. So I would not say they have "problems", but they are a common first casualty due to age. Have you added RAM to the unit? While most don't have an issue with just the base 1GB until OS 6.10.x, it can be an issue, especially if you run apps. Replacement supplies are available on eBay, or you could make an adapter cable and use an external ATX supply long enough to see if the supply is the issue. Given the age of both the hardware and firmware, you do need to consider whether it's worthwhile investing any more into a legacy ReadyNAS.

As an aside, if you don't have ipv6 enabled in your router, then I recommend turning it off in your NAS as well.

ziemowit wrote:

StephenB will you be able and kind to look at my logs? I will send you a PM with a link.

Gibberish could be a corrupted VPD. But you are using SMR drives in NAS5, which is a very bad idea. On top of that, one of them is failing (181 reallocated sectors so far). It is still trying to sync, and does not appear to have gotten to disk 4 yet. If you have no backup, then leave it running until it resyncs. But (assuming ok NAS hardware), you should replace these disks with CMR models - either Seagate Ironwolf or WD Red Plus.

NAS6 is only using the two Hitachi drives. For some reason the two WD30EFRX drives are global spares. I suggest selecting one of those drives, and then formatting it in the NAS, and see if the system adds it to the array.

There is another unusual error on NAS6 -

Aug 27 20:15:16 NAS6 wsdd2[15607]: ======= Backtrace: =========
Aug 27 20:15:16 NAS6 wsdd2[15607]: /lib/x86_64-linux-gnu/libc.so.6(+0x731af)[0x7f07268cd1af]
Aug 27 20:15:16 NAS6 wsdd2[15607]: /lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7f0726952aa7]
Aug 27 20:15:16 NAS6 wsdd2[15607]: /lib/x86_64-linux-gnu/libc.so.6(+0xf6cc0)[0x7f0726950cc0]
Aug 27 20:15:16 NAS6 wsdd2[15607]: /usr/sbin/wsdd2[0x405c48]
Aug 27 20:15:16 NAS6 wsdd2[15607]: /usr/sbin/wsdd2[0x4023e6]
Aug 27 20:15:16 NAS6 wsdd2[15607]: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f072687bb45]
Aug 27 20:15:16 NAS6 wsdd2[15607]: /usr/sbin/wsdd2[0x4027de]
Aug 27 20:15:16 NAS6 wsdd2[15607]: ======= Memory map: ========
Aug 27 20:15:16 NAS6 wsdd2[15607]: 00400000-0040a000 r-xp 00000000 00:13 41395 /usr/sbin/wsdd2
Aug 27 20:15:16 NAS6 wsdd2[15607]: 00609000-0060a000 r--p 00009000 00:13 41395 /usr/sbin/wsdd2
Aug 27 20:15:16 NAS6 wsdd2[15607]: 0060a000-0060b000 rw-p 0000a000 00:13 41395 /usr/sbin/wsdd2
Aug 27 20:15:16 NAS6 wsdd2[15607]: 0199d000-019be000 rw-p 00000000 00:00 0 [heap]
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726438000-7f072644e000 r-xp 00000000 00:13 40386 /lib/x86_64-linux-gnu/libgcc_s.so.1
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f072644e000-7f072664d000 ---p 00016000 00:13 40386 /lib/x86_64-linux-gnu/libgcc_s.so.1
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f072664d000-7f072664e000 rw-p 00015000 00:13 40386 /lib/x86_64-linux-gnu/libgcc_s.so.1
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f072664e000-7f0726659000 r-xp 00000000 00:13 40366 /lib/x86_64-linux-gnu/libnss_files-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726659000-7f0726858000 ---p 0000b000 00:13 40366 /lib/x86_64-linux-gnu/libnss_files-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726858000-7f0726859000 r--p 0000a000 00:13 40366 /lib/x86_64-linux-gnu/libnss_files-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726859000-7f072685a000 rw-p 0000b000 00:13 40366 /lib/x86_64-linux-gnu/libnss_files-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f072685a000-7f07269fb000 r-xp 00000000 00:13 40357 /lib/x86_64-linux-gnu/libc-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f07269fb000-7f0726bfb000 ---p 001a1000 00:13 40357 /lib/x86_64-linux-gnu/libc-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726bfb000-7f0726bff000 r--p 001a1000 00:13 40357 /lib/x86_64-linux-gnu/libc-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726bff000-7f0726c01000 rw-p 001a5000 00:13 40357 /lib/x86_64-linux-gnu/libc-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726c01000-7f0726c05000 rw-p 00000000 00:00 0
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726c05000-7f0726c26000 r-xp 00000000 00:13 40354 /lib/x86_64-linux-gnu/ld-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726e19000-7f0726e1c000 rw-p 00000000 00:00 0
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726e22000-7f0726e25000 rw-p 00000000 00:00 0
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726e25000-7f0726e26000 r--p 00020000 00:13 40354 /lib/x86_64-linux-gnu/ld-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726e26000-7f0726e27000 rw-p 00021000 00:13 40354 /lib/x86_64-linux-gnu/ld-2.19.so
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7f0726e27000-7f0726e28000 rw-p 00000000 00:00 0
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7ffe509ee000-7ffe50a0f000 rw-p 00000000 00:00 0 [stack]
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7ffe50afe000-7ffe50b00000 r--p 00000000 00:00 0 [vvar]
Aug 27 20:15:16 NAS6 wsdd2[15607]: 7ffe50b00000-7ffe50b02000 r-xp 00000000 00:00 0 [vdso]
Aug 27 20:15:16 NAS6 wsdd2[15607]: ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aug 27 20:15:16 NAS6 systemd[1]: wsdd2.service: Main process exited, code=killed, status=6/ABRT
Aug 27 20:15:16 NAS6 systemd[1]: wsdd2.service: Unit entered failed state.
Aug 27 20:15:16 NAS6 systemd[1]: wsdd2.service: Failed with result 'signal'.
Aug 27 20:15:17 NAS6 systemd[1]: wsdd2.service: Service hold-off time over, scheduling restart.

If you have a backup, you might also try doing a factory default with all disks in place, and set up the NAS again.


BTRFS read-only state, OS 6.8.10, ReadyNAS Ultra 4

15 Replies


Sandshark
Sensei
Jul 26, 2024
A hard re-boot while the NAS is actively writing files can damage the BTRFS volume, so you likely did this to yourself. There is nothing innately different with the Ultra that would cause this. When a volume is damaged, the NAS often puts it in read-only mode to prevent you from making matters worse.

Do not reboot again until your backup is up to date. Doing so will not fix your problem and may result in the volume failing to mount.

The only guaranteed fix is to destroy and re-create the volume or do a complete factory default. Note that if you have apps loaded, that can be problematic since the apps are on the data volume. So un-install them and re-install after you've re-created the volume, or see here: How-to-save-your-apps-when-destroying-your-main-volume-OS6.

As for drives, just avoid any that are SMR.
- ziemowit
  Aspirant
  Jul 27, 2024
  Hello, Sandshark! THANKS for the answer. 🙂
  
  I noticed that while I was writing Stephen B also answered with some more information. Thanks, Stephen, I will read and see what I can understand of it.
  
  OK, I get it. 🙂 I was probably a bit too impatient with the power off. Well, you never cease learning; I thought - correct or wrong - that a NAS would and should survive an unexpected power off. I was wrong and stand corrected. Guilty as charged.
  The curious and "funny" thing is that it seems like I cannot even do "ls" on my /data directory (I/O error), but I can access /data/Pictures without problems, while /data/Videos gives me I/O error. Nevertheless, I have full backups of stuff.
  
  As far as installed applications go, it is not a problem either, as I downloaded the binaries a long time ago and actually only use ssh and possibly MySQL with PHP.
  - tijgert
    Guide
    Aug 02, 2024
    I am adding to this discussion because I am looking for an answer to the exact same question.
    
    My ReadyNas 516 accidentally filled up to the max (due to an emergency backup of another drive) and threw an error because of No Space Left. The Log reflects this to be the only case and there is NO hardware issue, which has been confirmed. SO there is NO backup needed of the system, which I already have (mirrored NAS).
    I just need to be able to delete files again.
    
    The system however keeps switching to ReadOnly mode when I reboot due to lack of space, and due to ReadOnly mode I cannot create more space by deleting files...
    
    So, with all the hardware being just fine, how do I tell the system to let me erase files so I can create more space?
    I am SSH inept, but I can follow instructions if I have to.
    
    I can enter SSH via Putty and I find myself at the prompt:
    admin@NAS516:~$
    
    What can I do next?
StephenB
Guru - Experienced User
Jul 27, 2024
I agree with Sandshark that you should back up the NAS before doing anything else.

Also, I agree that rebooting a NAS with a read-only volume is dangerous. You are lucky that it remounted.

ziemowit wrote:

I have 4 x 2TB disks in my ReadyNAS and all but one are of the same type (3 x ST32000542AS + 1 x ST2000DL003-9VT166). All have same rotational speed.


When the time comes to replace disks, I suggest you go with Seagate Ironwolf or WD Red Plus. Avoid the WD Reds, as they are SMR. Most desktop drives between 2-6 TB are also SMR.

ziemowit wrote:

Questions

I want to understand what is going on. Which boils down to:

what logs should I read?

what am I looking for in the logs?

Generally I start with dmesg. Look for disk and btrfs errors, also look at the mdadm commands.

Looking at the bottom of volume.log can help you sort out if you have a full OS partition (md0)

Readynasd.log and status.log are often useful in sorting out the history of degraded volumes.

systemd-journal.log, system.log and kernel.log, mdstat.log often have useful clues as well.
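
As a rough illustration of that first pass, here is a minimal Python sketch that flags lines mentioning btrfs, ATA ports, md devices, I/O errors, or read-only remounts in an unpacked log zip. The keyword list, the function names and the `dmesg.log` file name are assumptions made for this example, not something taken from the ReadyNAS documentation, so treat the output only as a starting point for reading the logs yourself.

```python
import re
from pathlib import Path

# Assumed keyword list (not from the thread); extend it for your own logs.
SUSPECT = re.compile(
    r"btrfs|\bata\d+|\bmd\d+|i/o error|read-only|remount",
    re.IGNORECASE,
)

# Log names as mentioned in the post; "dmesg.log" is an assumed file name.
DEFAULT_LOGS = ("dmesg.log", "kernel.log", "system.log", "systemd-journal.log")


def find_suspect_lines(lines):
    """Return (line_number, text) for every line that matches SUSPECT."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if SUSPECT.search(line)
    ]


def scan_log_dir(log_dir, names=DEFAULT_LOGS):
    """Scan the named files in an unpacked log zip; skip files that are missing."""
    results = {}
    for name in names:
        path = Path(log_dir) / name
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a log contains odd bytes.
        with path.open(errors="replace") as handle:
            results[name] = find_suspect_lines(handle)
    return results
```

Calling `scan_log_dir("path/to/unpacked/logs")` returns a dictionary keyed by file name, so you can print the hits per log and then read the surrounding lines in the original file for context.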

Of course you also need to figure out what to do about what you are seeing.

If you like, you can upload the log zip to cloud storage (dropbox, google drive, etc) and send me a PM (private message) with a link. Make sure the link permissions are set so anyone can download. You'd send a PM using the envelope icon in the upper right of the forum page.

It might take a while (over a week) for me to get back to you, as I'm going on vacation soon and won't have much time to spend on this.

ziemowit wrote:

I want to prevent future problems with similar issues.

If you have no backup plan in place for your NAS, then that is the place to begin. RAID isn't enough to keep your data safe. The best way to do that is to have at least one copy on another device (and ideally one copy off-site).

One thing to do is set up a maintenance schedule on the volume settings page. I run one of the four functions (scrub, balance, disk test, defrag) every month. The scrub and the disk test will access every sector of every disk, so both can give early indications of disk issues. I space those tests two months apart (filling the gaps with the balance and defrag).
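
The rotation described above can be written down as a simple four-month cycle. This is only a sketch of the idea: the exact order of balance and defrag in the gaps, and the choice of January as the starting month, are assumptions for illustration. The actual schedule is still configured on the volume settings page, not by this code.

```python
# Four-month cycle: scrub and disk test land two months apart,
# with balance and defrag filling the gaps (their order is an assumption).
ROTATION = ("scrub", "balance", "disk test", "defrag")


def task_for_month(month):
    """Return the maintenance task for a calendar month (1 = January)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return ROTATION[(month - 1) % len(ROTATION)]


def yearly_plan():
    """Return a list of (month, task) pairs for one calendar year."""
    return [(month, task_for_month(month)) for month in range(1, 13)]
```

With this cycle each function runs three times a year, and either a scrub or a disk test reads every sector every second month.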

Don't rely on the web ui to give accurate information on disk health. Often there are clear issues in the log zip that don't show up in the web ui. So it is wise to look at the log zip from time to time - particularly after each scrub and disk test completes.

Make sure you maintain at least 15% free space on the volume. If you use snapshots, I also suggest turning off the smart snapshots and switching to custom settings. Set an explicit retention period, and also configure them to only make snapshots when there are changes. This can help manage the amount of space the snapshots take up.
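
A quick way to keep an eye on the 15% figure is a small script run over SSH or from a machine that mounts the share. The sketch below uses Python's standard `shutil.disk_usage`; the `/data` path is taken from the mount point mentioned earlier in the thread, and the function names are made up for this example. The numbers reported for a BTRFS volume are only approximate (snapshots and chunk allocation affect what is really usable), so treat the result as an early warning rather than an exact figure.

```python
import shutil

MIN_FREE_PERCENT = 15.0  # threshold suggested in the post


def free_percent(total_bytes, free_bytes):
    """Return free space as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * free_bytes / total_bytes


def volume_needs_cleanup(path="/data", min_free=MIN_FREE_PERCENT):
    """Return True when the filesystem holding `path` is below the threshold."""
    usage = shutil.disk_usage(path)
    return free_percent(usage.total, usage.free) < min_free
```

If `volume_needs_cleanup()` returns True, that is the moment to prune snapshots or delete files, well before the volume actually fills up.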

I also suggest getting a UPS for the NAS. That will ensure it shuts down cleanly when there is a power outage. A lot of volume failures involve unclean shutdowns.

If you run the antivirus software, the file search feature, or have apps installed on your NAS, then you should also consider upgrading the RAM to 4 GB. Generally removing apps and disabling services you don't need is a good idea - improving both performance and stability. Antivirus and file search in particular use a lot of system resources.
- ziemowit
  Aspirant
  Aug 27, 2024
  Thank you, StephenB !
  
  I hope you have had a nice trip... My reply is delayed for several reasons, one of them being reading logs and trying to understand stuff on my own. I have 3 NASes that have misbehaved from time to time, sometimes displaying filp_close+9 in the LCD just before hanging, sometimes it was some other text. A long time ago I put those NASes on "nightly reboot", in the hope that the problems would diminish if the NAS did a fresh restart every night at 3 AM (via Power Management in the GUI). I also searched the logs for the error messages displayed, and actually I located one of the error messages in the logs, and also saw that I possibly had a disk problem not reflected in the error count on the disk. One of the logs reported repeated problems re-allocating a number of blocks. Finding disks on eBay in reasonable condition and at a reasonable price, as well as reinstalling fw (reverting to 6.10.8, as I want my shell-in-a-box installed), restoring backups and resyncing took some time. But then it was fixed.
  
  So I thought.
  
  Boy, I was wrong! This very evening I could not connect to one of the NASes and decided to do a port scan. It showed that two of the machines were having issues: ports 80, 22 and 443, as well as 21 were gone, although there was still somebody listening on ports 25, 110, 119, 143, 465 and others. I then went to the machines and briefly pressed the backup button to see what they say. The one called NAS5 displayed incoherent characters on the display, i.e. gibberish. The one called NAS6 displayed understandable text about the backup button being pressed. Oh, well - time to use the PWR button twice to take them down in an orderly manner. NAS6 went down in a few minutes, NAS5 just did nothing, so I had to use a long press on PWR to cut power the brutal way. After reboot NAS6 came up without problems, but NAS5 started resyncing.
  
  After the reboot I took a dive into the logs. Nothing unusual in NAS5, although it was not accessible and displayed gibberish on the LCD. In the logs of NAS6 (kernel.log and system.log) I could see that it reported error codes from curl all the time after 7PM. The msmtp.log reported about the NAS6 not being able to locate smtp-mail.outlook.com. So somehow I wonder. Do I have faulty hardware? Do I have other problems that I have no clue about? StephenB will you be able and kind to look at my logs? I will send you a PM with a link.
  - Sandshark
    Sensei
    Aug 28, 2024
    You may have a hardware issue, but it's a bit early to say you do. The Ethernet and power on/off circuit in your NAS are powered by a separate voltage from the PSU labeled "+5VSB", for +5 Volts Standby. That voltage stays on even when the NAS is "off". And because of that, it's often the first to go, and if the NAS is in a hot environment, that can be worse because the fan is off when the NAS is "off". This also most often happens on units that are powered on and off routinely, which sounds like what your NAS sees. Some of your issues do sound like they could be related to that. The "error message" you see at power-down may not be that -- it may be something you are never supposed to see because the unit shuts down. But it could also be a software error and the last thing executed before a crash prevented reaching power-off.
    
    Are you also sometimes seeing issues where it won't power up, by schedule, power button, or WoL? If so, that points to a PSU issue even more.