Hang/Crash after 6.9.5 upgrade - Defrag
I recently upgraded to 6.9.5, 9th Feb.
I have Defrag (& tests & balance) scheduled on two volumes; they have run without issues on 6.9.3 forever.
A disk test of V1 completed 19th Feb.
A Defrag of V1 started 0100hrs 21 Feb. Sometime ~1500hrs I noticed the admin page had timed out, then the Windows mapped drives were unavailable. On inspection the RN316 front panel was blank and unresponsive to buttons; I had to hard power down.
It restarted OK.
I downloaded logs; there were no messages of note, and messages stopped ~1330.
I manually started Defrag at 1630hrs.
Today, 22nd Feb sometime after 1600hrs, I noticed from a distance the front panel had a message.
On inspection it showed 'out_of_memory+21b', and was again unresponsive to buttons. Hard power down.
I downloaded logs, unhelpfully the logs have lost info (compared to 21st logs) and don't show anything AFAICT.
e.g. 21st kernel.log
Feb 21 13:21:48 ME-NAS-316A connmand[3311]: ntp: adjust (slew): -0.003423 sec
-- Reboot --
Feb 21 15:53:06 ME-NAS-316A rsyncd[3140]: rsyncd version 3.1.3 starting, listening on port 873
22nd kernel.log
Feb 14 03:06:09 ME-NAS-316A kernel: BTRFS info (device dm-0): found 32 extents
-- Reboot --
Feb 22 17:16:53 ME-NAS-316A kernel: Initializing cgroup subsys cpuset
Other logs show entries up to e.g. Feb 21 15:58:54, i.e. before Defrag was restarted.
Any Netgear'ers want logs?
Re: Hang/Crash after 6.9.5 upgrade - Defrag
Hi Michael_Oz,
You may send me the logs. Please upload them to Google Drive, then PM me the download link.
Regards,
Re: Hang/Crash after 6.9.5 upgrade - Defrag
Thanks John. Logs PM'd.
I have disabled volume schedules.
I ran disk test on both volumes, both completed, no errors.
Re: Hang/Crash after 6.9.5 upgrade - Defrag
Thank you for providing the logs. We will look at the logs soon.
Re: Hang/Crash after 6.9.5 upgrade - Defrag
Hi Michael_Oz,
There is a lot of metadata on your NAS and the unit only has 2GB RAM.
=== filesystem /RN316AV1 ===
Data, single: total=14.03TiB, used=14.01TiB
System, DUP: total=32.00MiB, used=1.84MiB
Metadata, DUP: total=29.50GiB, used=24.90GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
It probably does not have enough RAM and swap to accomplish the cleanup. L3 has cancelled the defrag.
This issue can be resolved by backing up, doing a factory default and restoring the data to a fresh clean volume.
Regards,
Re: Hang/Crash after 6.9.5 upgrade - Defrag
Thanks John.
> L3 has cancelled the defrag.
What does that mean? The system was hung the first time and OOM the second.
One would think software would maintain its metadata.
As I mentioned it had been happily defragging on schedule UNTIL the first defrag on 6.9.5.
Now it COULD be that it just happened to have that one extra nybble to put it over the edge the day after the upgrade, but Occam's razor says no.
> backing up, doing a factory default and restoring the data to a fresh clean volume.
Is this going into the Product Brochure? 'BTW you will need to reset & reload your NAS every year or so'
That volume was created 2017/08/05 (on 6.7.5) with data reloaded from a backup of a previous volume. It, and the NAS, is basically idle 99.9% of the time; the most it does is take & delete empty snapshots and ponder the meaning of UPS signals.
It appears the metadata is volume specific, so surely deleting/recreating the volume is all that is required?
Why factory reset? (and hence need to do the other volume too)
I had also regularly scrubbed the volumes, until a while ago.
I started a scrub 6 days ago which is now at 10%, is a scrub going to tidy up the metadata?
If not, surely a metadata_tidyup routine is called for rather than factory resetting; that is not a 21st-century solution.
I'm monitoring memory use, it is growing slightly, but is only ~30%.
> That is a lot of metadata for a volume that size. Do you remember when you did a factory reset last, what firmware version you were on?
I don't think I have factory reset since a couple shortly after getting the unit.
[2014/11/03 22:12:40] Factory default initiated due to new disks (no RAID, no partitions)!
[2014/11/03 22:12:58] Defaulting to X-RAID2 mode, RAID level 1
[2014/11/03 22:13:13] Factory default initiated on ReadyNASOS 6.1.6.
[2015/09/18 21:15:14] Updated from ReadyNASOS 6.1.6 to 6.1.6.
[2015/09/25 01:31:45] Updated from ReadyNASOS 6.1.6 () to 6.2.4 (ReadyNASOS).
[2016/03/04 22:44:33] Updated from ReadyNASOS 6.2.4 (ReadyNASOS) to 6.4.2 (ReadyNASOS).
[2016/05/25 16:24:48] Updated from ReadyNASOS 6.4.2 (ReadyNASOS) to 6.5.0 (ReadyNASOS).
[2016/07/12 18:46:10] Updated from ReadyNASOS 6.5.0 (ReadyNASOS) to 6.5.1 (ReadyNASOS).
[2016/11/11 13:35:30] Updated from ReadyNASOS 6.5.1 (ReadyNASOS) to 6.6.0 (ReadyNASOS).
[2016/11/17 23:05:03] Updated from ReadyNASOS 6.6.0 (ReadyNASOS) to 6.6.1-T200 (Beta 1).
[2016/12/20 14:54:50] Updated from ReadyNASOS 6.6.1-T200 (Beta 1) to 6.6.1-T220 (Beta 3).
[2017/01/12 14:18:27] Updated from ReadyNASOS 6.6.1-T220 (Beta 3) to 6.6.1 (ReadyNASOS).
[2017/03/02 07:04:04 UTC] Updated from ReadyNASOS 6.6.1 (ReadyNASOS) to 6.7.0-T169 (Beta 2).
[2017/03/03 06:47:43 UTC] Updated from ReadyNASOS 6.7.0-T169 (Beta 2) to 6.7.0-T172 (ReadyNASOS).
[2017/03/03 23:44:36 UTC] Updated from ReadyNASOS 6.7.0-T172 (ReadyNASOS) to 6.7.0-T180 (Beta 3).
[2017/03/16 09:47:38 UTC] Updated from ReadyNASOS 6.7.0-T180 (Beta 3) to 6.7.0-T206 (Beta 4).
[2017/03/29 23:56:09 UTC] Updated from ReadyNASOS 6.7.0-T206 (Beta 4) to 6.7.0 (ReadyNASOS).
[2017/05/15 06:42:29 UTC] Updated from ReadyNASOS 6.7.0 (ReadyNASOS) to 6.7.1 (ReadyNASOS).
[2017/06/11 02:41:54 UTC] Updated from ReadyNASOS 6.7.1 (ReadyNASOS) to 6.7.4 (ReadyNASOS).
[2017/07/14 07:02:22 UTC] Updated from ReadyNASOS 6.7.4 (ReadyNASOS) to 6.7.5 (ReadyNASOS).
[2017/09/30 01:38:36 UTC] Updated from ReadyNASOS 6.7.5 (ReadyNASOS) to 6.8.1 (ReadyNASOS).
[2018/03/30 05:23:48 UTC] Updated from ReadyNASOS 6.8.1 (ReadyNASOS) to 6.9.3 (ReadyNASOS).
[2019/02/08 23:09:27 UTC] Updated from ReadyNASOS 6.9.3 (ReadyNASOS) to 6.9.5 (ReadyNASOS).
Re: Hang/Crash after 6.9.5 upgrade - Defrag
@Michael_Oz wrote:
It appears the metadata is volume specific, so surely deleting/recreating the volume is all that is required?
Why factory reset? (and hence need to do the other volume too)
If the other volume doesn't have the problem, then deleting/recreating this volume should be enough.
@Michael_Oz wrote:
As I mentioned it had been happily defragging on schedule UNTIL the first defrag on 6.9.5.
Now it COULD be that it just happenned to have that one extra nybble to put it over the edge the day after the upgrade, but occam's razor says no.
Personally I don't see this as particularly relevant. The metadata growth almost certainly happened over time. Almost certainly the upgrade to 6.9.5 also contributed to the timing of the failure, but the issue was hiding under the surface already. The failure mode is also unusual - not something I recall seeing here before.
But it also appears to me that the volume is essentially completely full:
Data, single: total=14.03TiB, used=14.01TiB
Another thing you could try is to offload as many files as you can (deleting them from the volume). You could then try doing a balance (not a scrub). If you have ssh enabled, you can do a "partial" balance - which would complete more quickly, and not require as much memory. If that frees up enough space, you could follow that up with a full balance.
FWIW, I would use ssh here, as that does give more control over options, and more ability to cancel operations. Though that does depend on your linux skills.
@Michael_Oz wrote:
surely a metadata_tidyup routine ...
FWIW, the BTRFS folks seem to agree with you there
https://btrfs.wiki.kernel.org/index.php/FAQ wrote:
If you have full up metadata, and more than 1 GiB of space free in data, as reported by btrfs fi df, then you should be able to free up some of the data allocation with a partial balance:
# btrfs balance start /mountpoint -dlimit=3
We know this isn't ideal, and there are plans to improve the behavior.
Note you might not have more than 1 GiB of free space in data.
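The FAQ's precondition is easy to check mechanically. Here is a minimal sketch (Python; the function names and the 1 GiB threshold from the FAQ quoted above are the only ingredients, and the figures in the example are the ones from the affected volume in this thread):

```python
# Sketch of the BTRFS FAQ's precondition for a partial data balance:
# there must be more than ~1 GiB of unused space inside the already
# allocated data chunks (total minus used, as reported by `btrfs fi df`),
# otherwise the balance has nothing to repack chunks into.

def data_slack_gib(total_gib, used_gib):
    """Unused space inside allocated data chunks, in GiB."""
    return total_gib - used_gib

def partial_balance_worthwhile(total_gib, used_gib, min_slack_gib=1.0):
    """True if a partial balance (e.g. -dlimit=3) has room to work with."""
    return data_slack_gib(total_gib, used_gib) > min_slack_gib

# Data, single: total=14.03TiB, used=14.01TiB  (1 TiB = 1024 GiB)
print(partial_balance_worthwhile(14.03 * 1024, 14.01 * 1024))  # prints True
```

In this case the ~0.02TiB gap is roughly 20GiB of slack, so the wiki's partial-balance approach is at least applicable.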
Re: Hang/Crash after 6.9.5 upgrade - Defrag
Hi @Michael_Oz
You have had this NAS running for a long time now, starting on very early firmwares for OS6. In the early days, the BTRFS filesystem was not very mature, and you would often see things like the metadata running wild. The problem is that BTRFS prefers to load metadata into RAM, and with 2GB of RAM you will suffer when you have 20+ GB of metadata.
It is not easy to fix, and it would be much better to factory default and start over. Then you would start on a much more mature version of BTRFS and would be less likely to hit the same situation again. Metadata is controlled a lot better now.
@StephenB Data, single: total=14.03TiB, used=14.01TiB
Just to clarify that the above shows data chunk allocation vs data chunks used. It does not give you any info on how full the volume is or how big the volume is. It tells you about the balance of the volume and, in fact, this volume is very well balanced 🙂
Re: Hang/Crash after 6.9.5 upgrade - Defrag
@Hopchen > more mature version of BTRFS
I see; you are presumably talking about the filesystem structure, which persists across firmware updates?
But if the volume was created on 6.7.5, what is it that is hanging around from earlier versions?
Are you implying that the system volumes, created at factory default, determine the BTRFS structure version regardless of firmware updates?
Re: Hang/Crash after 6.9.5 upgrade - Defrag
@Hopchen wrote:
Data, single: total=14.03TiB, used=14.01TiB
Just to clarify that the above shows data chunk allocation vs data chunks used.
Thx for the clarification.
FWIW, the volume on my main NAS (RN526x) started in Nov 2016 with 6.6.0 firmware, and it is using a fair amount of metadata also.
=== filesystem /data ===
Data, single: total=11.17TiB, used=11.00TiB
System, DUP: total=64.00MiB, used=1.31MiB
Metadata, DUP: total=13.00GiB, used=11.53GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
Do you have a rule of thumb, or other guidance on when the amount gets concerning? I think that would be useful to many here.
Also, wouldn't destroying/recreating the volume have the same effect as the factory reset (recreating the on-disk structures with current btrfs)?
Re: Hang/Crash after 6.9.5 upgrade - Defrag
Hi @StephenB and @Michael_Oz
My reply was not in-depth enough to do the discussion justice. I brushed over some things without enough explanation. I will be happy to formulate a more detailed reply, but it is getting a bit late here in Europe, so I will put something together for you tomorrow 🙂
Cheers
Re: Hang/Crash after 6.9.5 upgrade - Defrag
Hi both
Firstly, let me clarify the BTRFS stats that are referred to in this thread.
There are two parts to the BTRFS stats in the RN logs. Here is an example from a random BTRFS output.
First, you see an overview of the total BTRFS chunk usage. The first line shows how much data is present on the filesystem: 3.60TiB. The second line shows how big the volume is (just over 8TiB); lastly, we can see that 4.17TiB worth of chunks has currently been allocated.
Total devices 1 FS bytes used 3.60TiB
devid 1 size 8.17TiB used 4.17TiB path /dev/md127
One thing to notice is that we have more chunks allocated than needed. So, the volume is slightly unbalanced (3.6TiB worth of chunks needed, 4.17TiB worth of chunks allocated to cover that).
Subsequently, there is a breakdown of the chunk allocation from above. We observe that the excessive allocation is happening with the data chunks: "total" and "used" differ by about 570GiB. A balance will essentially re-shuffle the chunks and make sure all chunks are better utilized, so that we can free some chunks back to the overall unallocated pool (getting the "total" number closer to the "used" number, below).
=== filesystem /data ===
Data, single: total=4.17TiB, used=3.60TiB
System, DUP: total=8.00MiB, used=544.00KiB
Metadata, DUP: total=1.00GiB, used=652.53MiB
GlobalReserve, single: total=73.44MiB, used=32.44MiB
Regarding metadata. In the volume above the NAS only uses 652MiB of metadata for 3.6TiB of data. This is a different picture compared to both your volumes. You have more data but your metadata usage is higher, percentage wise. A side-note: notice how the metadata is listed as DUP. It means that all metadata is duplicated so you need to calculate the space taken by metadata, twice. The reason for the duplication of the metadata is for redundancy. In the event that some metadata becomes unreadable due to things like bad sectors on the disks, then at least we have the same metadata stored elsewhere in the filesystem.
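To make the DUP point concrete, here is a tiny sketch (the helper function is mine, not a ReadyNAS or btrfs tool) that converts a `btrfs fi df` "total" figure into the space actually written to disk:

```python
# Sketch: on-disk footprint of a chunk pool given its BTRFS profile.
# DUP keeps two copies of every chunk, so the df "total" figure must be
# counted twice; "single" is stored once.

def on_disk_gib(total_gib, profile):
    """Disk space consumed by a pool, in GiB."""
    copies = 2 if profile.upper() == "DUP" else 1
    return total_gib * copies

# Metadata, DUP: total=29.50GiB (the affected volume in this thread)
print(on_disk_gib(29.50, "DUP"))   # prints 59.0
```

So the 29.50GiB of allocated metadata on the affected volume actually occupies about 59GiB of raw disk.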
Metadata is hugely important because it contains all the information about a file: where it is stored, file permissions, time stamps, checksums and much more. It also means that before any data can be presented to the user, the metadata for those files must be fetched and read first. Therefore, slow metadata access = slow access to the data itself. Because of this, BTRFS tends to prefer to load at least some of the metadata into a quickly accessible location - and nothing is faster than RAM. Any changes to metadata are of course written to disk, as memory is volatile, but for reading purposes it is desirable to have some metadata loaded into memory.
What if my metadata usage is high? Well, then most of the metadata must be fetched directly from the disks. The NAS will still function fine but performance might not be as good. The real kicker is that metadata is written as CoW (Copy-on-Write) and that will fragment the metadata quite heavily over time. Fragmented metadata can be detrimental to read performance.
Can I do anything about it? Yes, you can actually defrag and balance the metadata itself in order to ensure best performance outcome by having the metadata stored as contiguously as possible on the disks. I do not think the defrag or balance features on the ReadyNAS GUI will touch the metadata (I could be wrong on that) but such operations can be done manually if desired (easy to do).
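As a sketch of what "manually" could look like (assuming btrfs-progs semantics: defragmenting a mount point without -r touches only the subvolume and extent trees, and `balance start -m` restricts the balance to metadata chunks - verify the flags on your firmware before running anything), the metadata-only pass boils down to two commands, assembled here as argv lists rather than executed:

```python
# Sketch: the two metadata-only maintenance commands described above,
# built as argv lists (nothing is executed here). The mount point is a
# placeholder; on a ReadyNAS the data volume is typically /<volume-name>.

def metadata_maintenance_commands(mountpoint):
    """Return metadata-only defrag + balance commands for a mount point."""
    return [
        # No -r flag: defragments the subvolume/extent trees, not file data
        ["btrfs", "filesystem", "defragment", mountpoint],
        # -m: balance metadata chunks only
        ["btrfs", "balance", "start", "-m", mountpoint],
    ]

for argv in metadata_maintenance_commands("/data"):
    print(" ".join(argv))
```

Printing rather than running keeps this a dry run; you could hand each argv list to `subprocess.run` over ssh once you are happy with the commands.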
Why is my metadata usage so high? One other thing that metadata stores is references to the extents when data gets scattered/fragmented due to CoW. The more fragmented your filesystem is, the more space is consumed by metadata. It also holds reference points for snapshotted blocks, so the more snapshots you have and the larger they are, the more metadata will be consumed. Another reason could be storing a large number of small files, which can lead to a large amount of metadata relative to the amount of actual data.
About older versions of BTRFS. The filesystem matured a lot over the past years and would have improved on things like better management of reflinks, perhaps some reduced fragmentation and especially more efficient use of the chunks, lowering the need for running balance on the filesystem. In my personal experience you tended to see larger amounts of metadata (proportional to the data) on filesystems that were created/managed on older versions of BTRFS. As you upgrade the NAS firmware you will also benefit from newer BTRFS versions, but it can be difficult to get the filesystem into as clean a state as if you were to start anew. I think this is what NETGEAR is on about here when they advise to factory default or make a clean volume. I would think making a new volume vs factory default will probably yield the same result in this instance.
All in all, there is no “golden” number that your metadata should hit. It really depends on the use-case and on other factors like how old the filesystem is, how fragmented it is, etc. BTRFS now has support for metadata caching on SSDs alongside memory. This is good news because it would alleviate much of the potential problems listed above and I am sure NETGEAR is exploring that option as well. I can only speculate… however it would make sense.
There is loads of good BTRFS info online, so if you feel like nerding out then visit the BTRFS wiki 😊
If your volumes were both created on relatively newer firmware versions, then I would think that your current metadata numbers might be down to use-case. Do you have many/large snapshots? Do you store many small files? Are you running regular maintenance schedules like defrag and balance? Are you using CoW on the shares?
Cheers
Re: Hang/Crash after 6.9.5 upgrade - Defrag
Thanks for taking the time to post this. Answers to your questions are below.
One answer you didn't ask for: Volume quota is enabled.
@Hopchen wrote:
Do you have many/large snapshots?
I have about 250 snapshots at present, taking up a total of 560 GiB of space.
I use custom snapshots on most shares. Generally I use 3 months retention, with one share that holds PC image backups using 2 weeks. "Only take snapshots with changes" is set.
@Hopchen wrote:
Do you store many small files?
about 350000 files are <= 512 KiB
about 280000 files are > 512 KiB
@Hopchen wrote:
Are you running regular maintenance schedules like defrag and balance?
Yes. Scrub, balance, defrag, and the disk test are each run once every three months.
Auto Defrag is also set on some shares (not sure why I set it that way).
BTW, I did manually balance metadata yesterday and it didn't make much difference in the total metadata being used.
@Hopchen wrote:
Are you using CoW on the shares?
Yes. Though in-place modification of files is relatively rare.
Re: Hang/Crash after 6.9.5 upgrade - Defrag
> I did manually balance metadata yesterday and it didn't make much difference in the total metadata
I noticed it does:
btrfs balance start -dusage 76 -musage 38
38% seems low if metadata is important for performance.
Re: Hang/Crash after 6.9.5 upgrade - Defrag
@Michael_Oz wrote:
> I did manually balance metadata yesterday and it didn't make much difference in the total metadata
I noticed it does
btrfs balance start -dusage 76 -musage 38
38% seems low if metadata is important for performance.
I used ssh
btrfs balance start -m /data
I don't know what the default is. If I have a chance I'll try -musage 0, just to see if that makes any difference. But even if it ends up perfectly balanced in my case the space won't go down much - only about 1 GiB
Metadata, DUP: total=13.00GiB, used=11.53GiB
But I'm not seeing any performance issues, so this is academic for me (though obviously not for you).
Re: Hang/Crash after 6.9.5 upgrade - Defrag
Hi both
@StephenB
Given that you do have quite a few snapshots and that the number of files isn't insignificant (though not super high), the metadata usage is probably reasonable in your case.
Again, it is hard to judge exactly what is "reasonable" but, as you point out, your performance is fine and thus I wouldn't worry too much about it. The good performance might also be helped by the fact that you rarely modify files (if I understand correctly), reducing fragmentation.
If you run into performance issue at some point (slow scrub, slow balances, slow defrag, slow file access), then it could be worth considering defragging the metadata as well, from time to time. This can be done by amending the recursive flag (-r).
When doing that defrag command it will warn you that files won't be defragged, but that is our intention. Adding the -v flag will show which subvolumes (shares) it touched. Example:
root@Datastore:~# btrfs fi defrag -v /data/*
WARNING: directory specified but recursive mode not requested: /data/home
WARNING: directory specified but recursive mode not requested: /data/Movies
WARNING: a directory passed to the defrag ioctl will not process the files recursively but will defragment the subvolume tree and the extent tree. If this is not intended, please use option -r .
/data/Documents
/data/home
/data/Movies
/data/Pictures
There are some interesting properties of defrag in BTRFS. For example, if you were to defrag a read-only snapshot it might increase the space that the snapshot takes, because the defrag might have to break some reflinks and thus duplicate those blocks instead. It could also be a problem when compression is enabled on the shares.
Other times defrag can decrease space taken on the volume and in all cases (bar maybe some oddball scenario) it should improve performance as the data will be more contiguous.
Another thing is that defrag and balance sometimes seem to affect each other - as in, you defrag a volume and it becomes a bit more unbalanced, and the other way around. So, running both is typically a good idea.
@Michael_Oz
The command you listed, I assume you saw in the "backend" when running the balance from the GUI?
btrfs balance start -dusage 76 -musage 38
This means that the balance will only touch data chunks that are less than 76% full and metadata chunks less than 38% full. It is referred to as a partial balance. Sometimes it is good to be cautious, as a full balance can be really heavy. I am sure NETGEAR added some calculation that works out what is optimal for the volume.
In your case I would try to sort out the metadata as best possible and then consider doing some defrag/balance per share. Here is a little script that can be run to defrag and then balance the metadata incrementally. It does another full metadata defrag and balance in the end.
echo "Starting defrag"
btrfs fi defrag -v /data/*
echo "Defrag done"
echo "Starting incremental balances"
sleep 60
for i in 0 2 5 10 20 30 40 50 100; do
    btrfs balance start -musage=$i /data
    echo "### Balance musage=$i done ###"
    echo
    sleep 60
done
echo "Starting another defrag"
sleep 60
btrfs fi defrag -v /data/*
echo "Defrag done"
echo "Starting another balance"
sleep 60
btrfs balance start -musage=100 /data
echo "Balance done"
echo "---COMPLETE---"
It would probably take quite a long time to run but it should get the metadata into as good a state as possible.
Subsequently, for the data itself, you can consider defragging one share at a time rather than a full defrag. Example:
btrfs fi defrag -r /data/Documents
Cheers
Re: Hang/Crash after 6.9.5 upgrade - Defrag
@Hopchen wrote:
If you run into performance issue at some point (slow scrub, slow balances, slow defrag, slow file access), then it could be worth considering defragging the metadata as well, from time to time. This can be done by amending (appending?)the recursive flag (-r).
When doing that defrag command it will warn you that files won't be defragged but that is our intention.
Just to clarify - if -r is specified, it will defrag the files.
Re: Hang/Crash after 6.9.5 upgrade - Defrag
@StephenB wrote:
@Hopchen wrote:
If you run into performance issue at some point (slow scrub, slow balances, slow defrag, slow file access), then it could be worth considering defragging the metadata as well, from time to time. This can be done by amending (appending?)the recursive flag (-r).
When doing that defrag command it will warn you that files won't be defragged but that is our intention.
Just to clarify - if -r is specified, it will defrag the files.
That is correct, yes.
Re: Hang/Crash after 6.9.5 upgrade - Defrag
@JohnCM_S wrote:
Hi Michael_Oz,
There is a lot of metadata on your NAS and the unit only has 2GB RAM.
=== filesystem /RN316AV1 ===
Data, single: total=14.03TiB, used=14.01TiB
System, DUP: total=32.00MiB, used=1.84MiB
Metadata, DUP: total=29.50GiB, used=24.90GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
It probably does not have enough RAM and swap to accomplish the cleanup. L3 has cancelled the defrag.
This issue can be resolved by backing up, doing a factory default and restoring the data to a fresh clean volume.
Regards,
For the record.
I backed up, deleted & recreated the volume, no factory default, restored.
Note that as a solution it is a PITA; as well as the time & effort, you lose all your backup & restore jobs.
Defrag now runs to completion.
Size info:
Data, single: total=8.19TiB, used=8.12TiB
System, DUP: total=64.00MiB, used=960.00KiB
Metadata, DUP: total=10.00GiB, used=5.59GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
Note the previous data
Data, single: total=14.03TiB, used=14.01TiB
the difference must be the snapshots.
I previously configured snapshots everywhere, as a nice to have.
I will only be doing it selectively now.
Re: Hang/Crash after 6.9.5 upgrade - Defrag
@Michael_Oz wrote:
I previously configured snapshots everywhere, as a nice to have.
I will only be doing it selectively now.
I use them on almost every share. Shares that have a lot of "churn" should have snapshots disabled - in particular shares where there are a lot of file deletions and/or files that are modified in place (torrents or databases).
I suggest using custom snapshots, and then explicitly setting retention. With the default "smart" snapshots, the monthly snapshots are never pruned, and over time they will take up a lot of space.
I've found that with 3 months retention on most shares (and 2 weeks retention on one share used for PC image backups), I end up with about 5% of the available space going to snapshots. So you could start with those values and tweak them as desired to control the overall space. I also check the "only make snapshots on changes" option.
BTW, when you defrag the share that has snapshots, the storage needed for the snapshots will often rise. When a file that is in a snapshot and a share is modified, BTRFS will fragment the copy in the main folder. That allows the unmodified blocks to be held in common by the snapshots and the main folder. If you defrag that file, then the unmodified blocks aren't held in common any longer, so the storage needed by the snapshot goes up.