
Vinzz
Aspirant

Not scrubbing, not balancing and SMB hangs every minute

Hello,

 

I've been having trouble with my RN316 (6x3TB in X-RAID) since I upgraded from 6.2.0 to 6.7.2.

The NAS had a network share with over 2TB of data on it. Snapshots were also enabled on that share, resulting in 10TB of continuous snapshots over the past 4 years. That means that 12.37TB is 'in use' and 1.03TB is free at the moment.

 

After the upgrade to 6.7.2 the frontview showed only 11GB in use on that share, but browsing the files in the share showed that much more was in use. Another thing that didn't go well was deleting snapshots (through the frontview).

After a lot of web searching I came across an article suggesting that the snapshots probably hadn't been converted to the new standard, so I executed the following two commands:

# touch /.force_snapshots_upgrade
# systemctl restart readynasd

Now the frontview showed me that this share had 16777215.2TB in use! How is this even possible?
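
One plausible reading of that figure, offered as an assumption rather than a confirmed diagnosis: 2^64 bytes is exactly 16777216 TiB, so a value just below that looks like a small negative byte count (about -0.8 TiB here) that ended up stored or displayed as an unsigned 64-bit number. The sketch below only reproduces that arithmetic. The function name is illustrative and not part of any ReadyNAS tool, and it assumes the UI counts in binary units.

```python
TIB = 2**40  # bytes per tebibyte


def wrapped_tib(signed_bytes: int) -> float:
    """Reinterpret a signed byte count as unsigned 64-bit and return it in TiB."""
    return (signed_bytes % 2**64) / TIB


# Hypothetical input: space accounting that went negative by 0.8 TiB.
negative_usage = -int(0.8 * TIB)
print(f"{wrapped_tib(negative_usage):.1f} TiB")
```

If that reading is right, the odd number points at broken usage accounting for the share rather than at real data on disk.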

 

Next step: I deleted the whole share, thinking it would free up all the disk space (including the 10TB of snapshots), but it didn't.

The yellow bar that showed how much space was taken by snapshots has now been added to the blue (space in use) bar.

 

Next step: I started the Balance tool. After a few hours it was still at 0%.

Stopping the balance through SSH had no effect.

After I rebooted the device the balance was no longer running. I checked the status over SSH and got this:

~# btrfs fi balance status /data
Balance on '/data' is paused
0 out of about 0 chunks balanced (0 considered), -nan% left

Well, okay, it wasn't working, so I left the balance paused.

Then I tried to start a scrub. I got a message that scrubbing had started. Shortly after that the frontview became unresponsive for half an hour and SMB writes hung. The frontview came back after half an hour showing 0.00% scrubbing.

When I issued a scrub status command I got this:

~# btrfs scrub status /data
scrub status for d4c3e8e7-1514-4988-9015-a34a165650a6
        scrub started at Fri Jun  2 07:13:05 2017, running for 00:31:47
        total bytes scrubbed: 8.00MiB with 0 errors

Invoking the command a second time showed the same results, so I assumed it had finished. But the frontview still shows 0.00% complete (after a few hours) and no space has been freed up.
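
A small sketch of how the status text above could be compared between two readings. Two identical readings that both say "running for" with the same elapsed time suggest a scrub that has stalled rather than one that has finished. The function names are made up for illustration. The "running for" and "interrupted after" wordings come from the output in this thread, while "finished after" is an assumption about what a completed scrub prints.

```python
import re

STATE_RE = re.compile(r"(running for|interrupted after|finished after)\s+(\d+:\d+:\d+)")
BYTES_RE = re.compile(r"total bytes scrubbed:\s*([\d.]+)\s*([KMGT]i?B)")


def parse_scrub_status(text):
    """Return (state, elapsed, amount, unit) from `btrfs scrub status` text."""
    state = STATE_RE.search(text)
    scrubbed = BYTES_RE.search(text)
    if not state or not scrubbed:
        raise ValueError("unrecognised scrub status output")
    phase = state.group(1).split()[0]  # "running", "interrupted" or "finished"
    return phase, state.group(2), float(scrubbed.group(1)), scrubbed.group(2)


def looks_stalled(first, second):
    """True when both readings say 'running' but neither time nor bytes moved."""
    return first[0] == "running" and first == second


SAMPLE = """scrub status for d4c3e8e7-1514-4988-9015-a34a165650a6
        scrub started at Fri Jun  2 07:13:05 2017, running for 00:31:47
        total bytes scrubbed: 8.00MiB with 0 errors
"""

reading = parse_scrub_status(SAMPLE)
print(reading, looks_stalled(reading, reading))
```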

Stopping the scrub through the frontview did not work. It keeps saying 'scrubbing in progress: 0.00% complete'.

 

This device is used in a production environment, and SMB writes now also hang every minute (for every share). I discovered that btrfs-transacti is mostly at 100% when SMB writes are hanging.

 

I checked the SMART data of all hard disks for faults, but none of them are defective.

Does anyone know how I can fix these issues without losing any data on my other shares? I do have an off-site backup, but I really don't want to start all over again. It would take me a few days to transfer all the data over a 100Mbit line 😞

 

Model: RN31600|ReadyNAS 300 Series 6- Bay
Message 1 of 6
mdgm-ntgr
NETGEAR Employee Retired

Re: Not scrubbing, not balancing and SMB hangs every minute

Can you send in your logs (see the Sending Logs link in my sig)?

Care needs to be taken not to blindly enter commands. It would've been better to get some advice here.

Message 2 of 6
Vinzz
Aspirant

Re: Not scrubbing, not balancing and SMB hangs every minute

Hello mdgm,

 

Thanks for your fast answer.

 

It wasn't done blindly; I had really convinced myself that these actions could resolve my issue. 🙂 And no data is lost if it really goes wrong.

Asking questions on the web is mostly my last resort. I like to figure things out for myself.

 

I've entered more commands and now scrubbing works though 😉

The frontview does not show any progress (no progress bar is shown); only SSH prints progress.

root@Nasbox:~# btrfs scrub status /data
scrub status for d4c3e8e7-1514-4988-9015-a34a165650a6
        scrub started at Fri Jun  2 07:13:05 2017, running for 00:31:47
        total bytes scrubbed: 8.00MiB with 0 errors

root@Nasbox:~# btrfs scrub status /data
scrub status for d4c3e8e7-1514-4988-9015-a34a165650a6
        scrub started at Fri Jun  2 07:13:05 2017, running for 00:31:47
        total bytes scrubbed: 8.00MiB with 0 errors

root@Nasbox:~# btrfs fi balance status /data
Balance on '/data' is paused
0 out of about 0 chunks balanced (0 considered), -nan% left

root@Nasbox:~# btrfs scrub status /data
scrub status for d4c3e8e7-1514-4988-9015-a34a165650a6
        scrub started at Fri Jun  2 07:13:05 2017, running for 00:31:47
        total bytes scrubbed: 8.00MiB with 0 errors

root@Nasbox:~# btrfs scrub status /data
scrub status for d4c3e8e7-1514-4988-9015-a34a165650a6
        scrub started at Fri Jun  2 07:13:05 2017, interrupted after 00:31:47, not running
        total bytes scrubbed: 8.00MiB with 0 errors
root@Nasbox:~# btrfs scrub resume /data
...
root@Nasbox:~# btrfs scrub status /data
scrub status for d4c3e8e7-1514-4988-9015-a34a165650a6
        scrub resumed at Fri Jun  2 11:37:43 2017, running for 08:57:07
        total bytes scrubbed: 311.27GiB with 0 errors

root@Nasbox:~# btrfs scrub status /data
scrub status for d4c3e8e7-1514-4988-9015-a34a165650a6
        scrub resumed at Fri Jun  2 11:37:43 2017, running for 09:04:07
        total bytes scrubbed: 314.81GiB with 0 errors
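
The last two readings above are enough for a rough throughput estimate. This is a minimal sketch that assumes the "running for" value is elapsed wall-clock time and that both figures are in GiB. The helper names are illustrative.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert 'HH:MM:SS' (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def scrub_rate_mib_s(first, second):
    """Average MiB/s between two (elapsed 'HH:MM:SS', GiB scrubbed) readings."""
    dt = hms_to_seconds(second[0]) - hms_to_seconds(first[0])
    if dt <= 0:
        raise ValueError("second reading must be later than the first")
    return (second[1] - first[1]) * 1024 / dt


# Values copied from the two status outputs above.
rate = scrub_rate_mib_s(("08:57:07", 311.27), ("09:04:07", 314.81))
print(f"{rate:.1f} MiB/s")
```

That works out to under 10 MiB/s across the whole array, which is very slow for six disks and fits a volume that is busy with something else at the same time.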

I'll start the balance again when scrubbing is finished. Hopefully everything starts working then.

I sent my logs to the specified email address as asked.

 

Kind regards

Vincent..

 

Message 3 of 6
Vinzz
Aspirant

Re: Not scrubbing, not balancing and SMB hangs every minute

Some additional information:

 

Since WannaCry spread across the web a few weeks ago, this machine has been promoted to production machine. Our previous production NAS (a Pro 6) was kicked from the network because it only supports the SMB v1 protocol, so I swapped the backup unit with the production unit. The box had to be reconfigured in a hurry: it joined AD twice (cached and now not cached), I created a few shares, etc. I also tried to make room because an rsync job suddenly failed. After noticing that no disk space was being freed, I started deleting old snapshots.

 

Deleting snapshots from the 'backup' share with the frontview was not possible unless I clicked another share first (in the recover window) and then went back to the 'backup' share. Then the delete button suddenly became visible, so I started to delete a bunch of snapshots. I waited a short time but no disk space was freed up. Then I actually found and executed the command to convert the snapshots. This was probably way too late because I had already deleted some snapshots, resulting in this weird behaviour.

After the mess I made, I deleted the whole 'backup' share.

 

I should have done a factory reset and cleaned the volumes before I used this unit in production, but this wasn't possible at the time. The backup share was still an important share, so I had to make an extra backup. In the meantime I started importing all data with rsync, thinking I could delete the backup share later. All those things brought me to where I am now: scrubbing 🙂

 

The shares are currently available to all users while scrubbing is in progress. SMB write actions still hang every minute though.

 

This newer device had no scrub maintenance configured for over 4 years. Current state: running for 30:18:28, with 449.77GiB scrubbed.

Will it scrub 10TB (since the snapshot data got merged into the normal used space) or will it only scrub the real data on the volume (±4TB)?
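
To put the two possibilities in numbers, here is a rough estimate from the figures in this post. It assumes the scrub keeps the same average speed, that it has to read whatever is still allocated on the volume, and that the sizes are binary units. The 4 and 12.37 totals are the two candidates mentioned in this thread, not measured values, and the function name is illustrative.

```python
def eta_hours(done_gib: float, elapsed_hms: str, total_tib: float) -> float:
    """Hours still needed to scrub total_tib at the average rate so far."""
    hours, minutes, seconds = (int(part) for part in elapsed_hms.split(":"))
    elapsed_s = hours * 3600 + minutes * 60 + seconds
    rate_gib_s = done_gib / elapsed_s
    remaining_gib = total_tib * 1024 - done_gib
    return remaining_gib / rate_gib_s / 3600


# 449.77 GiB after 30:18:28, as reported above.
for label, total in (("real data only", 4.0), ("everything in use", 12.37)):
    hours_left = eta_hours(449.77, "30:18:28", total)
    print(f"{label}: about {hours_left:.0f} h ({hours_left / 24:.1f} days) to go")
```

At this pace the difference is roughly ten days versus more than a month, so which of the two totals applies matters a lot for planning.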

 

Message 4 of 6
Vinzz
Aspirant

Re: Not scrubbing, not balancing and SMB hangs every minute

I'm currently at 845GiB and the scrub has been running for 104 hours.


I noticed that some disk space was freed up again, but I also lost disk space?
I had 13.40TB of total disk space, now I have 13.36TB?

Message 5 of 6
Vinzz
Aspirant

Re: Not scrubbing, not balancing and SMB hangs every minute

Just to let you know that I solved this issue a month ago. I stopped the long-running scrub process and shut down the NAS.

I removed hard disk 1 from its bay and rebooted the NAS, which put it in 'degraded mode'. SMB was still hanging every minute though.
I shut down again and removed hard disk 2 as well. The NAS is configured in RAID 5, so there is redundancy for only one disk.

I rebooted the device, which showed 0MB of space. OK, I expected this. I reinserted disk 2, rebooted and immediately shut down again, then inserted disk 1 and rebooted.

At this point the RAID was rebuilt in only one day (!) and the NAS has been behaving normally for the past month.

Problem solved, and all data is still intact! I've won a bet with myself! It was a lucky guess.
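
For anyone tempted to repeat this: the 'degraded' and 0MB states described in this post match what single-parity RAID tolerates. The sketch below is plain arithmetic for an array of equal disks, assuming the 6x3TB layout from the first post. It says nothing about how X-RAID handles reinserted disks, and the function names are illustrative.

```python
def raid5_usable_tb(disks: int, size_tb: float) -> float:
    """Usable capacity of a RAID 5 array of equal disks (one disk of parity)."""
    if disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (disks - 1) * size_tb


def raid5_state(missing: int) -> str:
    """State of a RAID 5 array with the given number of missing disks."""
    if missing == 0:
        return "healthy"
    if missing == 1:
        return "degraded"
    return "failed"  # two or more missing disks exceed the single parity


usable_tb = raid5_usable_tb(6, 3)    # decimal terabytes
usable_tib = usable_tb * 1e12 / 2**40  # the same capacity in binary units
print(f"{usable_tb} TB is about {usable_tib:.2f} TiB usable")
for pulled in (1, 2):
    print(pulled, "disk(s) missing:", raid5_state(pulled))
```

The result of about 13.64 TiB is close to the 13.40 total reported in this thread (12.37 in use plus 1.03 free), with the remainder presumably going to system partitions and filesystem overhead. Pulling a second disk goes beyond what the array can survive, so this only ended well because both disks came back unchanged. Without a backup it would be a risky thing to try.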

Message 6 of 6