Lost another volume
On firmware 6.7.4, I ran a scrub before leaving for the weekend on Friday. I came back in today without having received a notification that the scrub had ever finished, and I couldn't make any changes to the contents of any shares: I was getting permission denied, even on accounts that have read/write access. I reset permissions for these shares.
Finally I rebooted, and now the volume has disappeared. To my knowledge this system has never suffered a power failure since the last time I rebuilt. Please let me know which logs to post, if any.
Re: Lost another volume
Hi,
The first thing is to check whether the data RAID is running. Can you please post the mdstat.log?
Thanks
Re: Lost another volume
mdstat.log
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md127 : active raid0 sda3[0] sdd3[3] sdc3[2] sdb3[1]
      23422691328 blocks super 1.2 64k chunks

md1 : active raid10 sda2[0] sdd2[3] sdc2[2] sdb2[1]
      1046528 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]

md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
      4190208 blocks super 1.2 [4/4] [UUUU]

unused devices: <none>

/dev/md/0:
    Version : 1.2
    Creation Time : Mon Oct 10 14:44:16 2016
    Raid Level : raid1
    Array Size : 4190208 (4.00 GiB 4.29 GB)
    Used Dev Size : 4190208 (4.00 GiB 4.29 GB)
    Raid Devices : 4
    Total Devices : 4
    Persistence : Superblock is persistent
    Update Time : Mon Jun 26 12:53:42 2017
    State : clean
    Active Devices : 4
    Working Devices : 4
    Failed Devices : 0
    Spare Devices : 0
    Name : 117c606a:0 (local to host 117c606a)
    UUID : 1d244df2:605bbba2:95ce8d48:297bec93
    Events : 86

    Number   Major   Minor   RaidDevice   State
       0       8       1        0         active sync   /dev/sda1
       1       8      17        1         active sync   /dev/sdb1
       2       8      33        2         active sync   /dev/sdc1
       3       8      49        3         active sync   /dev/sdd1

/dev/md/1:
    Version : 1.2
    Creation Time : Mon May 15 18:47:04 2017
    Raid Level : raid10
    Array Size : 1046528 (1022.00 MiB 1071.64 MB)
    Used Dev Size : 523264 (511.00 MiB 535.82 MB)
    Raid Devices : 4
    Total Devices : 4
    Persistence : Superblock is persistent
    Update Time : Mon Jun 26 10:41:10 2017
    State : clean
    Active Devices : 4
    Working Devices : 4
    Failed Devices : 0
    Spare Devices : 0
    Layout : near=2
    Chunk Size : 512K
    Name : 117c606a:1 (local to host 117c606a)
    UUID : 7eb71118:4e648fc4:7b53a41b:99cd94fa
    Events : 19

    Number   Major   Minor   RaidDevice   State
       0       8       2        0         active sync set-A   /dev/sda2
       1       8      18        1         active sync set-B   /dev/sdb2
       2       8      34        2         active sync set-A   /dev/sdc2
       3       8      50        3         active sync set-B   /dev/sdd2

/dev/md/data-0:
    Version : 1.2
    Creation Time : Mon May 15 18:47:04 2017
    Raid Level : raid0
    Array Size : 23422691328 (22337.62 GiB 23984.84 GB)
    Raid Devices : 4
    Total Devices : 4
    Persistence : Superblock is persistent
    Update Time : Mon May 15 18:47:04 2017
    State : clean
    Active Devices : 4
    Working Devices : 4
    Failed Devices : 0
    Spare Devices : 0
    Chunk Size : 64K
    Name : 117c606a:data-0 (local to host 117c606a)
    UUID : 7abc2a8b:b6a7cb06:e80da4b0:8c8d78e6
    Events : 0

    Number   Major   Minor   RaidDevice   State
       0       8       3        0         active sync   /dev/sda3
       1       8      19        1         active sync   /dev/sdb3
       2       8      35        2         active sync   /dev/sdc3
       3       8      51        3         active sync   /dev/sdd3
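For anyone reading an mdstat.log like this one, the main thing to spot is which arrays have no redundancy. The following is only a rough sketch of how that check could be scripted; it uses the three array summary lines from the log above, and the function names and the regular expression are my own assumptions, not part of any NETGEAR tooling. It does not handle every state that mdstat can print.

```python
import re

# Array summary lines copied from the mdstat.log in this thread.
MDSTAT = """\
md127 : active raid0 sda3[0] sdd3[3] sdc3[2] sdb3[1]
md1 : active raid10 sda2[0] sdd2[3] sdc2[2] sdb2[1]
md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
"""

# Matches "<name> : <state> <level> <members...>" for simple one-word states.
ARRAY_LINE = re.compile(r"^(md\d+) : (\w+) (raid\d+|linear) (.+)$")


def parse_arrays(text):
    """Return {name: (state, level, [member devices])} for each md array line."""
    arrays = {}
    for line in text.splitlines():
        match = ARRAY_LINE.match(line.strip())
        if match:
            name, state, level, members = match.groups()
            # Strip the "[n]" role suffix from each member, e.g. "sda3[0]" -> "sda3".
            devices = [re.sub(r"\[\d+\].*$", "", dev) for dev in members.split()]
            arrays[name] = (state, level, devices)
    return arrays


def without_redundancy(arrays):
    """Names of arrays whose level cannot survive the loss of a single disk."""
    return sorted(
        name for name, (_, level, _) in arrays.items() if level in ("raid0", "linear")
    )


if __name__ == "__main__":
    print(without_redundancy(parse_arrays(MDSTAT)))
```

Here only md127, the data volume, is reported; md0 and md1 are mirrored.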
Re: Lost another volume
Okay, so your data RAID is active, which is good. But I can see that it is a RAID0. A scrub has no useful effect on a RAID0: if the filesystem finds corrupt versions of your files during the scrub, there is no redundancy to recover them from. So, if you run a RAID0 (or any other RAID with no redundancy), don't run a scrub.
That being said, the scrub shouldn't break the volume - it just won't do anything useful. Can you enable SSH access, log in to the CLI of the NAS, and run this command:
journalctl | grep -i btrfs
It is to see if the system logged any filesystem warnings. We need to see if the filesystem is OK.
Thanks
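If the output of that command is long, it can help to summarise it by severity before reading it line by line. The sketch below is one possible way to do that, assuming the kernel messages contain the word BTRFS followed by a severity such as info, warning or error. The sample lines, including the hostname `nas`, are invented and simplified for illustration; they are not taken from the poster's log.

```python
import re

# Invented, simplified journal lines for illustration only.
SAMPLE = [
    "Jun 26 10:41:02 nas kernel: BTRFS info (device md127): disk space caching is enabled",
    "Jun 26 10:41:03 nas kernel: BTRFS error (device md127): open_ctree failed",
    "Jun 26 10:41:04 nas sshd[1234]: Accepted password for root",
    "Jun 26 10:41:05 nas kernel: BTRFS warning (device md127): csum failed",
]

# Assumed message shape: "BTRFS <severity> (...)" or "BTRFS: <severity> (...)".
SEVERITY = re.compile(r"btrfs:?\s+(info|warning|error|critical)", re.IGNORECASE)


def btrfs_lines(lines):
    """Case-insensitive filter, equivalent to `grep -i btrfs`."""
    return [line for line in lines if "btrfs" in line.lower()]


def count_by_severity(lines):
    """Count btrfs lines per severity; lines with no known severity go to 'other'."""
    counts = {}
    for line in btrfs_lines(lines):
        match = SEVERITY.search(line)
        key = match.group(1).lower() if match else "other"
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    print(count_by_severity(SAMPLE))
```

To use it on real output, the saved result of `journalctl | grep -i btrfs` could be read from a file and passed in as a list of lines.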
Re: Lost another volume
Couldn't post the output of that command because it's too long, so I uploaded it to pastebin:
https://pastebin.com/26WTbCLC
Hi again,
Yes, so there is definitely some corruption on the filesystem - unfortunately. You can probably see that yourself reading through some of those messages. It would cause issues mounting the volume, and thus you see no volume anymore. This is not a result of the scrub, by the way.
It is one of the problems with running a RAID0. You are much more prone to these sorts of problems as there is no fault tolerance at all. If any of the disks were stalling or if there are errors on any of the disks, it can cause serious issues for a RAID0. Are all the disks OK? You can check it in the disk_info.log
Do you have a backup of the data? If so, you are best to factory default and restore from backups. Also, you might want to consider whether RAID0 is the correct RAID for your setup. It is rather risky on 4 drives, I think.
Re: Lost another volume
The data is not unique and I have backups I can restore from. I'll reformat as XRAID. disk_info.log does not show any issues. Here it is below for your reference, also:
Device: sda
Controller: 0
Channel: 0
Model: WL6000GSA12872E
Serial: WOL240343186
Firmware: 01.01C01
Class: SATA
Sectors: 11721532032
Pool: data
PoolType: RAID 0
PoolState: 5
PoolHostId: 117c606a
Health data
    ATA Error Count: 0
    Reallocated Sectors: 0
    Reallocation Events: 0
    Spin Retry Count: 0
    Current Pending Sector Count: 0
    Uncorrectable Sector Count: 0
    Temperature: 36
    Start/Stop Count: 184
    Power-On Hours: 8504
    Power Cycle Count: 133
    Load Cycle Count: 153

Device: sdb
Controller: 0
Channel: 1
Model: WL6000GSA6472E
Serial: WOL240336490
Firmware: 01.0RRE2
Class: SATA
RPM: 5700
Sectors: 11721045168
Pool: data
PoolType: RAID 0
PoolState: 5
PoolHostId: 117c606a
Health data
    ATA Error Count: 0
    Reallocated Sectors: 0
    Reallocation Events: 0
    Spin Retry Count: 0
    Current Pending Sector Count: 0
    Uncorrectable Sector Count: 0
    Temperature: 39
    Start/Stop Count: 227
    Power-On Hours: 8890
    Power Cycle Count: 140
    Load Cycle Count: 187

Device: sdc
Controller: 0
Channel: 2
Model: WL6000GSA6472E
Serial: WOL240336488
Firmware: 01.0RRE2
Class: SATA
RPM: 5700
Sectors: 11721045168
Pool: data
PoolType: RAID 0
PoolState: 5
PoolHostId: 117c606a
Health data
    ATA Error Count: 0
    Reallocated Sectors: 0
    Reallocation Events: 0
    Spin Retry Count: 0
    Current Pending Sector Count: 0
    Uncorrectable Sector Count: 0
    Temperature: 39
    Start/Stop Count: 213
    Power-On Hours: 8737
    Power Cycle Count: 136
    Load Cycle Count: 175

Device: sdd
Controller: 0
Channel: 3
Model: WL6000GSA6472E
Serial: WOL240336487
Firmware: 01.0RRE2
Class: SATA
RPM: 5700
Sectors: 11721045168
Pool: data
PoolType: RAID 0
PoolState: 5
PoolHostId: 117c606a
Health data
    ATA Error Count: 0
    Reallocated Sectors: 0
    Reallocation Events: 0
    Spin Retry Count: 0
    Current Pending Sector Count: 0
    Uncorrectable Sector Count: 0
    Temperature: 36
    Start/Stop Count: 221
    Power-On Hours: 8826
    Power Cycle Count: 139
    Load Cycle Count: 178
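The health counters that matter most in a disk_info.log are the error and reallocation counts, which are all zero for the four disks above. As a rough sketch of how such a log could be checked automatically, the code below assumes one `Key: value` field per line. The `sda` entries are taken from the log above; the `sdx` device and its non-zero counter are invented purely to show what a flagged disk would look like.

```python
# Counters where any non-zero value deserves a closer look.
HEALTH_COUNTERS = (
    "ATA Error Count",
    "Reallocated Sectors",
    "Reallocation Events",
    "Spin Retry Count",
    "Current Pending Sector Count",
    "Uncorrectable Sector Count",
)

SAMPLE = """\
Device: sda
Model: WL6000GSA12872E
Health data
ATA Error Count: 0
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Device: sdx
Health data
ATA Error Count: 0
Reallocated Sectors: 8
"""  # "sdx" is a made-up device, not one of the poster's disks.


def parse_devices(text):
    """Split the log into one {field: value} dict per 'Device:' section."""
    devices = []
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # e.g. the bare "Health data" heading
        key, value = key.strip(), value.strip()
        if key == "Device":
            devices.append({})
        if devices:
            devices[-1][key] = value
    return devices


def suspicious(devices):
    """Return {device: {counter: value}} for every non-zero health counter."""
    report = {}
    for dev in devices:
        bad = {k: int(dev[k]) for k in HEALTH_COUNTERS if int(dev.get(k, "0")) > 0}
        if bad:
            report[dev["Device"]] = bad
    return report


if __name__ == "__main__":
    print(suspicious(parse_devices(SAMPLE)))
```

Clean counters do not prove a disk never stalled; they only show that the drive itself has not recorded errors.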
Re: Lost another volume
Yes, your disks seem fine here. But a RAID0 is so intolerant that any hiccups can be problematic - the more disks involved, the bigger the risk. It is hard to say exactly what caused it. The point is that there is no recovery for the filesystem once it encounters errors - as there is no redundancy.
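To put a rough number on "the more disks involved, the bigger the risk": if each disk independently has a probability p of causing a problem over some period, a RAID0 of n disks is affected with probability 1 - (1 - p)^n, because it only survives if every disk does. The sketch below works that out; the 3% per-disk figure is an arbitrary assumption for illustration, not a measured failure rate for these drives, and real disk failures are not fully independent.

```python
def raid0_loss_probability(p_disk, n_disks):
    """Probability that at least one of n independent disks fails.

    A RAID0 has no redundancy, so one failing member takes the whole array down.
    """
    return 1.0 - (1.0 - p_disk) ** n_disks


if __name__ == "__main__":
    p = 0.03  # assumed per-disk probability, for illustration only
    for n in (1, 2, 4):
        print(f"{n} disk(s): {raid0_loss_probability(p, n):.4f}")
```

With these assumed numbers, a 4-disk RAID0 is close to four times as likely to be lost as a single disk.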
I am glad you had a backup and that you are considering a RAID with some redundancy.
Just as an FYI - things that can typically lead to filesystem corruption are:
1. Disk issues.
2. Filling the filesystem too much. You should leave about 10% free space.
3. Non-graceful shutdowns (such as power-cuts).
Best of luck!
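The 10% free-space guideline in the list above is easy to check from a script. This is a minimal sketch using Python's standard `shutil.disk_usage`; the path `/` and the function names are assumptions for illustration, and on a ReadyNAS you would point it at the mount point of the data volume instead. On BTRFS the figures reported this way are only approximate, so treat the result as a rough indicator.

```python
import shutil


def free_fraction(path):
    """Fraction of the filesystem at `path` that is still free (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def is_too_full(path, min_free=0.10):
    """True when less than `min_free` of the filesystem is free."""
    return free_fraction(path) < min_free


if __name__ == "__main__":
    path = "/"  # assumed path; replace with the data volume's mount point
    print(f"{path}: {free_fraction(path):.1%} free, too full: {is_too_full(path)}")
```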
Re: Lost another volume
ReclaiMe finds my filesystem, and it appears that it will be able to recover data from it.
Re: Lost another volume
That is good news if you were in a potential data loss situation. ReclaiMe is good software and might find some files. BTRFS has a built-in tool for that as well, namely btrfs restore. But neither of those actually fixes the filesystem, and both might recover corrupted versions of your files.
So unless you really need some data that isn't backed up, I would still go the factory default route.