
ReadyNAS 102 : Management service is offline + EXT4-fs error

guillaume_d
Follower

Hi everyone. The level of help on this forum seems to be of great quality, so I'll take my chances in the hope that someone can guide me!

 

My second-hand RN 102 was working like a charm, with no alerts whatsoever, until a few days ago. It's running firmware 6.10.8, which I believe is the most recent for this product. I have two 1 TiB disks, seemingly in good condition though used and of different brands, in a RAID 1 (mirror) array.

 

I was first alerted when I noticed that one of the FTP shares was empty and read-only. I tried logging in to the admin page, but it said that it was offline. I wanted to reboot the NAS but could not manage it with the front button, and I didn't think of trying over SSH, so I pulled the plug. The situation is now the same: the admin page is offline.

The main share I have is still working though, via FTP or SFTP. I also still have SSH access, and thus root access. RAIDar finds my NAS and confirms that the management service is offline, but says my disks are fine.

 

Here are the suspicious elements I found:

1/ readynasd.service is 'failed' (the other services are fine)

# systemctl
[...]
● readynasd.service                                                                     loaded failed failed    ReadyNAS System Daemon
[...]

It seems to fail due to a core dump:

root@nas-gf:~# systemctl status readynasd
● readynasd.service - ReadyNAS System Daemon
   Loaded: loaded (/lib/systemd/system/readynasd.service; enabled; vendor preset: enabled)
   Active: failed (Result: start-limit-hit) since Tue 2023-05-09 11:42:43 CEST; 3h 5min ago
  Process: 2208 ExecStart=/usr/sbin/readynasd -v 3 -t (code=dumped, signal=SEGV)
 Main PID: 2208 (code=dumped, signal=SEGV)
   Status: "Initialize"

May 09 11:42:43 nas-gf systemd[1]: readynasd.service: Main process exited, code=dumped, status=11/SEGV
May 09 11:42:43 nas-gf systemd[1]: Failed to start ReadyNAS System Daemon.
May 09 11:42:43 nas-gf systemd[1]: readynasd.service: Unit entered failed state.
May 09 11:42:43 nas-gf systemd[1]: readynasd.service: Failed with result 'core-dump'.
May 09 11:42:43 nas-gf systemd[1]: readynasd.service: Service hold-off time over, scheduling restart.
May 09 11:42:43 nas-gf systemd[1]: Stopped ReadyNAS System Daemon.
May 09 11:42:43 nas-gf systemd[1]: readynasd.service: Start request repeated too quickly.
May 09 11:42:43 nas-gf systemd[1]: Failed to start ReadyNAS System Daemon.
May 09 11:42:43 nas-gf systemd[1]: readynasd.service: Unit entered failed state.
May 09 11:42:43 nas-gf systemd[1]: readynasd.service: Failed with result 'start-limit-hit'.
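
In case it helps others reading along, here is a small sketch of how one could poke at this state. These are just the standard systemd tools (nothing ReadyNAS-specific is assumed except the unit name `readynasd`, which comes from the output above); `coredumpctl` may not be installed on ReadyNAS OS, in which case the journal is the fallback.

```shell
# Hedged sketch: clear the 'start-limit-hit' state, retry the unit once,
# and pull the most recent crash context from the journal.
retry_readynasd() {
    systemctl reset-failed readynasd   # clear the start-limit counter
    systemctl start readynasd          # one more attempt; expect a fresh SEGV
}

crash_context() {
    # Last journal lines for the unit on this boot. If systemd-coredump is
    # installed, 'coredumpctl info readynasd' would show a backtrace too.
    journalctl -u readynasd -b --no-pager | tail -n 50
}
```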

2/ I found a few suspicious lines in the 'journalctl' output.

First this block with the 'capacity change':

May 09 11:42:27 nas-gf kernel: md: md0 stopped.
May 09 11:42:27 nas-gf kernel: md: bind<sdb1>
May 09 11:42:27 nas-gf kernel: md: bind<sda1>
May 09 11:42:27 nas-gf kernel: md/raid1:md0: active with 2 out of 2 mirrors
May 09 11:42:27 nas-gf kernel: md0: detected capacity change from 0 to 4290772992
May 09 11:42:27 nas-gf kernel: md: md1 stopped.
May 09 11:42:27 nas-gf kernel: md: bind<sdb2>
May 09 11:42:27 nas-gf kernel: md: bind<sda2>
May 09 11:42:27 nas-gf kernel: md/raid1:md1: active with 2 out of 2 mirrors
May 09 11:42:27 nas-gf kernel: md1: detected capacity change from 0 to 535822336
May 09 11:42:27 nas-gf kernel: EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)

And then these lines, repeated with slight variations:

May 09 11:42:27 nas-gf kernel: EXT4-fs error (device md0): ext4_find_dest_de:1833: inode #3552: block 31746: comm pilgrim: bad entry in directory: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0, size=4096
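
The error names a specific inode (#3552) on md0. To see which directory that actually is, e2fsprogs can resolve it. A sketch, assuming /dev/md0 is the EXT4 OS volume per the mount line above; debugfs opens the device read-only by default, so it should be safe on a mounted filesystem, though its view can be slightly stale.

```shell
# Resolve an inode number to a pathname on the OS volume.
# Assumption: /dev/md0 is the EXT4 root, as the journal above suggests.
inode_to_path() {
    ino="$1"
    debugfs -R "ncheck $ino" /dev/md0 2>/dev/null
}
# Usage sketch: inode_to_path 3552
```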

I'm attaching the last log to this message, from boot up to the point where readynasd fails, as a PDF since I can't attach a .txt file (seriously?).

 

3/ Now I've noticed something more: the EXT4-fs errors only appeared after my brutal reboot this morning, not before. So I may have caused them, but they were not the original problem. Here is a bit of the previous boot; I have numerous occurrences of these:

May 09 11:20:13 nas-gf noflushd[1978]: Spinning down disk 2 (/dev/sdb).
May 09 11:20:13 nas-gf event_push[29580]: Failed to open db. (sq3rc=14)
May 09 11:20:13 nas-gf event_push[29580]: Cannot open database
May 09 11:20:22 nas-gf apache2[2163]: [cgi:error] [pid 2163] [client 10.10.21.51:46426] AH01215: Connect rddclient failed: /frontview/lib/dbbroker.cgi, referer: http://nas-gf.local/admin/
May 09 11:20:22 nas-gf apache2[2428]: [cgi:error] [pid 2428] [client 10.10.21.51:46422] AH01215: Connect rddclient failed: /frontview/lib/dbbroker.cgi, referer: http://nas-gf.local/admin/
May 09 11:20:22 nas-gf apache2[28362]: [cgi:error] [pid 28362] [client 10.10.21.51:34126] AH01215: Connect rddclient failed: /frontview/lib/dbbroker.cgi, referer: http://nas-gf.local/admin/

May 09 11:20:22 nas-gf noflushd[1978]: Spinning up disk 2 (/dev/sdb) after 0:00:05.
May 09 11:20:23 nas-gf event_push[29588]: Failed to open db. (sq3rc=14)
May 09 11:20:23 nas-gf event_push[29588]: Cannot open database

 

4/ In case it is useful:

root@nas-gf:~# cat /proc/mdstat 
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md127 : active raid1 sda3[0] sdb3[1]
      971911808 blocks super 1.2 [2/2] [UU]
      
md1 : active raid1 sda2[0] sdb2[1]
      523264 blocks super 1.2 [2/2] [UU]
      
md0 : active raid1 sda1[1] sdb1[2]
      4190208 blocks super 1.2 [2/2] [UU]
      
unused devices: <none>
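
For what it's worth, the `[UU]` fields above say both members of every array are present. A tiny helper to check that from any /proc/mdstat dump (an underscore in the status field, e.g. `[U_]`, marks a missing member):

```shell
# Count md arrays whose member-status field shows a missing member,
# i.e. a '_' inside the [UU]-style marker. Reads /proc/mdstat by default,
# or any saved copy of it passed as the first argument.
count_degraded() {
    awk '{ for (i = 1; i <= NF; i++)
               if ($i ~ /^\[[U_]+\]$/ && index($i, "_")) n++ }
         END { print n + 0 }' "${1:-/proc/mdstat}"
}
```

On the healthy output above it prints 0.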

I'd first like to get rid of the filesystem alerts, but I can't run fsck since the volume is the root filesystem and is mounted. And if the readynasd failure is a separate problem, I'd like to diagnose and repair that service too. So what could be my next step? Thanks in advance!
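
One note on the "can't fsck a mounted root" point, hedged since I haven't tried it on ReadyNAS OS specifically: the check has to happen before the root volume is mounted read-write, i.e. at boot, and systemd-fsck honors a `/forcefsck` flag file for exactly that. Something like this should schedule a check on the next (clean, this time) reboot:

```shell
# Ask for a forced fsck of the root filesystem at the next boot.
# Assumption: ReadyNAS OS 6.x is Debian-based and uses systemd-fsck,
# which honors the /forcefsck flag file and removes it afterwards.
schedule_root_fsck() {
    root="${1:-/}"          # parameter mainly for testing; defaults to /
    touch "$root/forcefsck"
    # reboot                # then reboot cleanly when ready
}
```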

Message 1 of 2
Sandshark
Sensei

Re: ReadyNAS 102 : Management service is offline + EXT4-fs error

It sounds like your OS partition (which is EXT4 on a 100-series NAS) is corrupt. An OS reinstall may fix it. But it could also make things worse, so you should have a current backup before you do so.

Message 2 of 2