
Forum Discussion

Stanman130
Apr 03, 2014

OS6 Data Recovery - How to Mount BTRFS Volumes

I recently purchased a ReadyNAS 314 and 5 Seagate ST2000VN000 hard disks (one as a cold spare). I work as a system administrator, so I've been reading up on OS6 before I entrust this new system with my data. I'm not very comfortable using BTRFS since it seems to still be a little buggy and I don't need snapshots. But since that's the direction Netgear chose to go, I'd like to give it a fair chance.

The older generation of devices had several excellent web sites with detailed data recovery instructions for situations involving hardware failures. Usually, this involved removing the hard disks and connecting them to a desktop, then booting into a version of Linux and mounting the array using the familiar ext4/LVM tools. I've been searching through the forums and Googling for an updated version, but there don't seem to be any recovery instructions for the new generation. I've seen a lot of discussion about BTRFS including some comments that make me quite concerned.

As mangrove stated in thread 419088:
"It should also be said that for EXT solutions, there are restore tools available which you run from a computer. So if the unit fails and you need to get data back, you can without having to buy an identical unit. This is much harder though.

This is impossible with OS6 units as BTRFS is not well supported by restore software."

I'm sure this is paranoid of me, but before I start to trust this device with 4 TB of data, which is time-consuming and difficult to back up in multiple places, I need to know I can access the data if the proprietary hardware fails. The situation I'm thinking of is the ReadyNAS works fine for about 6 years, then the mainboard fails. It's out of warranty, there are no replacement parts and the newer models can't read the volumes due to data structure changes. The disks are fine, the data is there, but it can't be accessed without mounting the disks on a different machine.

Option 1 - Test data recovery using the current OS 6.1.6 build and a spare desktop. Set up the array in the normal way using OS6 tools, save some test data on it, then shut it down and take the disks out. Figure out how to mount the disks and access the volume by connecting them to the desktop and installing the appropriate Linux/kernel/disk tools on a separate disk. Once this is working, create a small Linux build on a thumb drive that is pre-configured to mount up the disks properly. My preferred configuration would be Flex-RAID set up in RAID6. But I'll test Flex-RAID in RAID5 and X-RAID2 if I have time.

If that can be done, then I'll go ahead and use the system and just keep an updated version of the recovery thumb drive handy (updated to match each new build of OS6).

I'm not here on the forum to ask someone to do this for me. Since I happen to have a new ReadyNAS 314 with blank hard disks and a spare desktop sitting around, I'm happy to roll up my sleeves and test it myself. I'm not a Linux guru, but the command line doesn't scare me. And at this point, I'm not risking any real data and this will allow me to have my recovery solution already built and ready to go. I'll post the results here in the forum for others, since there doesn't seem to be a definitive solution out there (or if someone can point me to one that already exists? Thanks! I can try that out first and save time!)

What I'm here to ask for, since there are so many very experienced ReadyNAS Jedis, is for some background on the data structure so I can get started. What I need to know is the following:

1. Which OS would be best to access the data? It appears that Debian is the core of OS6, but which build/kernel should be used?
2. Which tools are needed for testing/repairing/mounting the BTRFS filesystem?
3. A general overview of how the volumes are arranged (i.e. it's not LVM anymore, so what is it?)
4. Specific settings to be aware of that might not be standard in a vanilla Linux configuration (e.g. block sizes? other things I don't know about at all)
5. Gotchas or special hazards to watch out for when working with BTRFS. I'm really not familiar with it.
6. Which log files show success or failure, and which command-line commands can test the volume and data.

This doesn't have to be super detailed or line-by-line. Just let me know the basics and I can look up the syntax and details in the man pages or online. I'm sure it'll blow up on me at least the first few times and I'll be back on this thread with the output and more questions when I get stuck. :shock:

Option 2 - Work on a way to change the file structure to the old-style EXT4/LVM so the older generation recovery process works. Yes, I understand that this is not supported and would have to be painfully redone after every OS version upgrade, but it might be a (tedious) solution.

Just a quick note on what I'm planning to do with this unit - I just need to use iSCSI to connect it to a CentOS 6 virtual server running on VMware. That server will provide all the file management, permissions and network services. I just need OS6 to provide iSCSI connectivity, basic hardware monitoring and UPS-triggered safe shutdown (I think the APC Network Shutdown 3.1 Linux software will work - I'll test that also). The primary role for the ReadyNAS will be to provide centralized backup to another NAS and various servers. Some of the servers will back up and store data from the desktops, which will also be backed up to the ReadyNAS.

I know this probably sounds like belt, suspenders and a flak jacket, but data integrity is a big deal for me. I'm hoping that what I find out will be useful to other people if they have data access problems (or a system meltdown). Plus, since system administration is my day job, this is kind of a scale model of my work system and should be good training as well (score!) :D

Thanks in advance to all the ReadyNAS experts out there for your time and assistance. I know I'm asking for a lot, but I'll share what I find out in return. Please be patient - I have a little Linux experience, but mostly at the power-user level so I'm weak in some of the admin areas. (Yeah, yeah, I do Windows system administration :( - stop snickering)

Stan

33 Replies

Replies have been turned off for this discussion
  • mdgm-ntgr
    NETGEAR Employee Retired
    The package is called btrfs-progs (I think I mistakenly referred to it as something else earlier).

    The 4GB partition is the OS partition, the smaller one is for swap and the large one is for your data volume. If you vertically expand your array or create additional volumes there will be additional md devices.
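
    A minimal, read-only sketch of how that layout might be inspected from a recovery machine (the /dev/sdb through /dev/sde names here are just examples; check what your disks actually enumerate as):

    # Show the three partitions on each ReadyNAS disk (OS, swap, data)
    lsblk -o NAME,SIZE,TYPE,FSTYPE /dev/sd[b-e]

    # Show which md arrays the kernel has auto-detected from those partitions
    cat /proc/mdstat

    # Examine the RAID superblock on one member partition (read-only)
    mdadm --examine /dev/sdb3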
  • Attempting to recover a RAID 6 volume with a missing hard disk

    To attempt this recovery, the configuration is the same as listed in the above entry. I just physically unplugged the ReadyNAS drive number 1 (on the motherboard SATA connector 2) to simulate a completely failed drive.

    I have been attempting to use mdadm to assemble the array in a degraded mode with just the remaining 3 hard disks. So far, no luck, but here's what I found.

    First, the information about the array as I collected previously:

    cat /proc/mdstat
    Personalities:
    md125 : inactive sdd1[3] (S) sdc1[2] (S) sdb1[1] (S)
    12576768 blocks super 1.2

    md126 : inactive sdb2[1] (S) sdd2[3] (S) sdc2[2] (S)
    1572096 blocks super 1.2

    md127 : inactive sdd3[3] (S) sdb3[1] (S) sdc3[2] (S)
    5845994613 blocks super 1.2

    unused devices: <none>


    Here is the result of information gathering about individual drives separately:

    mdadm --examine /dev/sdd3
    /dev/sdd3:
    Magic : a92b4efc
    Version : 1.2
    Feature Map : 0x0
    Array UUID : be4dd378:31e311ab:9af53399:976b5245
    Name : 5e276850:TESTFLX6-0
    Creation Time : Sun Jul 20 07:45:00 2014
    Raid Level : raid6
    Raid Devices : 4

    Avail Dev Size : 3897329742 (1858.39 GiB 1995.43 GB)
    Array Size : 3897329408 (3716.78 GiB 3990.87 GB)
    Used Dev Size : 3897329408 (1858.39 GiB 1995.43 GB)
    Data Offset : 262144 sectors
    Super Offset : 8 sectors
    Unused Space : before=262064 sectors, after=334 sectors
    State : clean
    Device UUID : fa36d5e1:76f3276a:e10be6e3:535a21bb

    Update Time : Mon Jul 21 00:53:10 2014
    Checksum : 21bec207 - correct
    Events : 19

    Layout : left-symmetric
    Chunk Size : 64K

    Device Role : Active device 3
    Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)


    I'm only listing one drive, but the other two report in a similar fashion.



    Here is the result of information gathering about the array:

    mdadm --examine /dev/sdd3 --scan
    ARRAY /dev/md/TESTFLX6-0 metadata=1.2 UUID=be4dd378:31e311ab:9af53399:976b5245 name=5e276850:TESTFLX6-0


    After investigating how to force the array to assemble, I tried this:

    mdadm --create md127 --chunk=64 --level=raid6 --raid-devices=4 --run /dev/sdd3 /dev/sdc3 /dev/sdb3 missing --verbose
    mdadm: layout defaults to left-symmetric
    mdadm: layout defaults to left-symmetric
    mdadm: super1.x cannot open /dev/sdd3: Device or resource busy
    mdmon: ddf: Cannot use /dev/sdd3: Device or resource busy
    mdmon: Cannot use /dev/sdd3: It is busy
    mdadm: cannot open /dev/sdd3: Device or resource busy


    Any ideas about getting this to assemble and mount up as a degraded array with one disk missing? :shock:

    Since RAID 6 should allow an array to remain available with up to 2 disks missing, the array should be able to mount with just one missing. The mdadm man page says to use the word "missing" in place of the missing device (in --create mode) and to use the "--run" switch to force assembly of an incomplete array. I tried several variants of the commands, including the following (but all report that the device is busy):

    mdadm --assemble --scan --run (FAILED - mdadm: No arrays found in config file or automatically)

    mdadm --assemble --uuid=be4dd378:31e311ab:9af53399:976b5245 --run (FAILED - Device busy)

    mdadm --assemble md127 --uuid=be4dd378:31e311ab:9af53399:976b5245 --run (FAILED - Device busy)

    mdadm --assemble --uuid=be4dd378:31e311ab:9af53399:976b5245 /dev/sdd3 /dev/sdc3 /dev/sdb3 --run (FAILED - Device busy)

    I'd like to find out how to do this with test data first, before I have to try it with real data at risk.

    Thanks for any ideas!

    - Stan
  • Have you tried this order:

    mdadm --create --assume-clean --level=6 --raid-devices=4 --run /dev/md127 missing /dev/sdd3 /dev/sdc3 /dev/sdb3 --verbose


    Since sde appears as the first disk on the md127 device (see above post), "missing" should be placed in the first position (I guess). Could you test?

    If I were you, I would try mdadm --assemble before any mdadm --create attempt (read here why).
  • Found the answer!

    After some serious googling and reading tons of related and unrelated articles, I found one that was relevant:

    https://unix.stackexchange.com/questions/26489/mdadm-how-to-reassemble-raid-5-reporting-device-or-resource-busy

    The command shown in this article to eliminate the "device busy" error is "mdadm --stop /dev/mdxxx" in order to release the drives. Apparently the OS picks up the drives automatically during boot up and they're held in "busy" status until the stop command is issued.

    How to Mount Degraded BTRFS Volumes - RAID 6 - One Hard Disk Missing

    Just a quick reminder - the hardware and software setup is the same as in the entry above: a Flex-RAID RAID 6 array on 4 disks, each disk 2 TB, and the recovery OS is Fedora Linux 20 updated to the 3.15 kernel.

    Here is the exact method to mount an array with one hard disk failed (ReadyNAS drive 1 disconnected from motherboard SATA connector 2 to simulate drive failure):

    cat /proc/mdstat
    Personalities:
    md125 : inactive sdd1[3] (S) sdc1[2] (S) sdb1[1] (S)
    12576768 blocks super 1.2

    md126 : inactive sdb2[1] (S) sdd2[3] (S) sdc2[2] (S)
    1572096 blocks super 1.2

    md127 : inactive sdd3[3] (S) sdb3[1] (S) sdc3[2] (S)
    5845994613 blocks super 1.2

    unused devices: <none>

    mdadm --stop /dev/md125
    mdadm: stopped /dev/md125

    mdadm --stop /dev/md126
    mdadm: stopped /dev/md126

    mdadm --stop /dev/md127
    mdadm: stopped /dev/md127

    cat /proc/mdstat
    Personalities:
    unused devices: <none>

    mdadm --examine /dev/sdd3 --scan
    ARRAY /dev/md/TESTFLX6-0 metadata=1.2 UUID=be4dd378:31e311ab:9af53399:976b5245 name=5e276850:TESTFLX6-0

    mdadm --assemble md127 --uuid=be4dd378:31e311ab:9af53399:976b5245 --run
    mdadm: /dev/md/md127 has been started with 3 drives (out of 4).

    mount -t btrfs -o ro /dev/md127 /mnt

    cd /mnt
    ls
    home test
    cd test
    ls
    debian-7.4.0-amd64-DVD-1.iso
    <snip>


    It is now possible to use the cp command to copy the files from this directory (/mnt/test), but the directory name will depend on the configuration of your particular ReadyNAS. I copied the test files (5 Debian DVD ISO files - about 17 GB) over to the home directory, then ran the sha1sum utility and verified them. They all passed.
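
    For reference, a hedged sketch of that copy-and-verify step (the /mnt/test share name and the /home/recovery destination are just what this test setup used; substitute your own paths):

    # Copy the recovered ISO files off the read-only mount
    mkdir -p /home/recovery
    cp -av /mnt/test/*.iso /home/recovery/

    # Record checksums of the copies, then check the originals against them
    # (a clean "OK" for every file means the copies match the originals)
    cd /home/recovery
    sha1sum *.iso > copies.sha1
    cd /mnt/test
    sha1sum -c /home/recovery/copies.sha1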

    Note the following:

    1) I don't think this is data destructive since the volume is mounted read-only and this is just requesting an existing volume be assembled (not creating or changing anything). Maybe people with more Linux experience can explain more.

    2) The metadata (ie. superblocks) may change once the volume is assembled - I don't know what that means in terms of data risk. (ie. timestamps?)

    3) I ran SHA1 sums on the ISO files copied from the degraded volume and the copies verified correctly (no data corruption even from a degraded volume).

    4) The mdadm utility will give an error (see above) and I had desktop popups on Fedora stating that the volume was degraded. But the data was accessible.

    5) When using the "mdadm --examine" command, the device can be any of the partitions that are part of the volume you are trying to access. In this case, the md127 volume is the data volume and is made up of the 3 remaining partitions (sdd3, sdc3 and sdb3). To get the UUID that identifies the array, run "mdadm --examine /dev/sdd3 --scan" (or substitute any other member partition); the "--scan" option prints the ARRAY information from any component partition.

    6) Any of the other volumes can be mounted the same way (except the swap volume). This might be useful - the OS volume could be mounted in order to check the ReadyNAS OS log files and find out what problems caused a crash (see the sketch below).
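
    A hedged sketch of that idea, assuming md125 (the array built from the small sdX1 partitions in the output above) is the OS/root array; verify with "mdadm --examine" before mounting anything:

    # Release the auto-detected (inactive) OS array, then re-assemble and start it
    mdadm --stop /dev/md125
    mdadm --assemble /dev/md125 /dev/sdb1 /dev/sdc1 /dev/sdd1 --run

    # Mount it read-only and look through the ReadyNAS logs
    mkdir -p /mnt/nas-root
    mount -o ro /dev/md125 /mnt/nas-root
    ls /mnt/nas-root/var/log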

    For more details, a good page explaining the mdadm utility command line options and syntax:

    http://linux.die.net/man/8/mdadm

    Overall, this appears to be a workable solution for data recovery for RAID 6 with a missing (failed) hard disk.

    Now for hard-core data recovery: 2 disks missing! :D :D :D

    -Stan
  • How to Mount Degraded BTRFS Volumes - RAID 6 - Two Hard Disks Missing

    First, a quick follow-up to the one-disk test: I shut the machine down normally and reconnected the missing hard disk. On start-up, the array was identified and assembled automatically, just as in the first test of the Flex-RAID RAID 6 array. All that was needed was to mount the array with "mount -t btrfs -o ro /dev/md127 /mnt" and the test files were accessible again. So an array can apparently be started with a missing disk, and if the missing disk is added back in, it rejoins the array without requiring a rebuild or any repairs.

    Then I shut down the system and disconnected ReadyNAS disks 2 and 4 from the motherboard to simulate a double hard disk failure.

    Following the exact same procedure as used for the "One Hard Disk Missing" recovery above - everything worked the same.

    Stop the auto-assembled arrays.

    Re-assemble the data array using the UUID and the "--run" switch.

    Then mount it the same way. The data was again copied to the home folder and it passed verification with SHA1 sums. :D :D :D
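
    For completeness, a condensed sketch of that sequence with two member disks missing (the device names and UUID are taken from the earlier output in this thread; adjust them to whatever /proc/mdstat and "mdadm --examine" show on your machine):

    # Release the arrays that were auto-detected (and left inactive) at boot
    mdadm --stop /dev/md125
    mdadm --stop /dev/md126
    mdadm --stop /dev/md127

    # Re-assemble the data array from the remaining members and force a degraded start
    mdadm --assemble /dev/md127 --uuid=be4dd378:31e311ab:9af53399:976b5245 --run

    # Mount the degraded BTRFS volume read-only and copy the data off
    mount -t btrfs -o ro /dev/md127 /mnt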

    RAID 6 does allow 2 disks to be removed from the array and the degraded array can still be mounted and the data recovered. :)

    Note the following cautions:

    1) This is a pretty ideal situation. There is no data corruption from viruses, worms, malware or hardware malfunction. The drives were cleanly removed from a shut-down system and suffered no lingering data corruption or slow failure. The success of this recovery method does NOT mean that backups aren't necessary - back up your data! :x

    2) The array gets very slow when it's in a degraded state. Copying 17 GB of test data is not the same as moving 3.6 TB of data (the full formatted size of the array). Data recovery will need lots of space and lots of time. A fast processor and plenty of RAM will help (i.e. don't assume you can use an old, slow, junk PC to recover your data). And don't forget the cooling, or you might have a catastrophic failure in the middle of your recovery process.

    3) The mdadm layer and the underlying BTRFS filesystem appear to be pretty robust (keeping in mind caution number 1!). But I'm not enough of a Linux expert to know where the "gotchas" are in this process. This is definitely a "last resort" process. Try all the normal ReadyNAS repairs and recovery techniques first before resorting to this method.

    4) Don't be afraid to ask for help. There are a lot of knowledgeable people on this forum (thanks everyone! :) ) and a lot of related information floating around at various sites. Most important of all - Don't Panic! Always start by gathering information first and if you're not sure about the changes you're making, ask questions and get help BEFORE you make them. Some are irreversible. I always try all the non-destructive methods first before I start changing settings or zeroing superblocks. I didn't risk any data in this testing and I'm glad I found out how to do this BEFORE I might need to do it. Test thoroughly before making permanent changes.

    -Stan
  • I have been recovering data from my failed 2-bay RN102 using mdgm's suggestions above.

    # apt-get update
    # apt-get install mdadm btrfs-tools
    # mdadm --assemble --scan
    # cat /proc/mdstat
    # mount -t btrfs -o ro /dev/md127 /mnt

    It worked without any problems on the first disk, but when I tried to do the same with the second disk (just to try it), mdadm would not activate the data and swap partitions - only the root partition. The data and swap partitions were marked as inactive, and the disk was marked as a spare in cat /proc/mdstat.

    After a lot of reading up on mdadm, what finally worked for me was to use the "mdadm --stop ..." command after I was finished with one of the disks - that is:

    # mdadm --stop /dev/md127
    mdadm: stopped /dev/md127

    After doing this on both swap and data, "mdadm --assemble --scan" was able to activate all RAID partitions.

    When a RAID disk is mounted only to recover data, isn't it necessary to stop its RAID arrays (sorry if my terminology is wrong) before another RAID disk can be mounted?

    Regards

    Hans-Ole,
    DK
  • mdgm-ntgr
    NETGEAR Employee Retired
    Well, in the ReadyNAS at least, if you have several separate volumes they can be mounted concurrently.

    Was your volume an X-RAID or RAID-1 volume or were you using a separate volume for each disk?

    With just the one volume, if the partitions are out of sync then mdadm will typically choose to start the array using one disk and not the other. So if you want to try the other partition in such a case, you would need to stop the array and start it again, specifying the partition on the disk you want to use (a sketch follows below).

    The root and swap use RAID regardless of whether that is used for the data volume.
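
    A hedged sketch of what that looks like for a single-disk (JBOD/RAID-1) volume like the one discussed here; the partition name is illustrative, so pick the member on the disk whose copy of the data you actually want:

    # Stop the array that mdadm auto-started from the other disk
    mdadm --stop /dev/md127

    # Re-assemble it explicitly from the data partition on the disk you want, forcing a degraded start
    mdadm --assemble /dev/md127 /dev/sdb3 --run

    # Mount read-only as before
    mount -t btrfs -o ro /dev/md127 /mnt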
  • Hey all,

    First of all, just want to thank you for this thread, even almost three years on - it's still very helpful!

    I understand I'm resurrecting an old thread here, but I'm having trouble copying data to / migrating to the ReadyNAS RN102.

    Using the method below I have been able to mount the ReadyNAS drives in a standard Linux machine (running Linux Mint 18.1, which is Ubuntu-based) and then copy data across to the drive locally.
    As I'm copying about 4 TB of data, I didn't want to do this over the network.
    # apt-get update
    # apt-get install mdadm btrfs-tools
    # mdadm --assemble --scan
    # cat /proc/mdstat
    # mount -t btrfs -o ro /dev/md127 /mnt
    Then copy the data through the GUI on Linux.

    I can successfully copy data to the drive, and then still mount the drive through the NAS (without it reformatting the drive).

    My only issue is that the NAS can't see the new data. Not through the SMB share, nor through the browser frontend.

    I can still take the drive out and read the data in the Linux machine, so it leads me to believe it could either be a file attributes/access issue, or some sort of indexing file of the NAS not updating.

    I can even create a folder on the NAS and remount in Linux to see it. Just not the other way around.

    Has anyone come across this before?

    I'm not all that savvy with Linux, but I can follow the command line.
    Is there something I'm missing regarding file attributes? Or does the NAS have a built in drive indexing I can access?

    Thanks in advance

    - Matt
    • mdgm-ntgr
      NETGEAR Employee Retired

      Shares on the NAS need to be created using the web admin GUI, so don't create a folder directly within the data volume using some other method. Once you've copied the files across, depending on which method you used, it may be necessary to reset the ownership/permissions.

      • ArctiX
        Aspirant

        Thanks mdgm for the reply.

         

        I'm at work at the moment, but I'll give that a try when I get home.

         

        Is this applicable to folders within folders as well? For example, if I created a folder on the NAS to share, then copied files in folders to the shared folder - would they be accessible?

         

        Secondly, I had also tested copying a video file from Linux into an existing shared folder. Strangely, even though I could view the rest of the contents of the shared folder, which had been copied via SMB, the newly copied video file did not appear.

         

        What is the suggested way of copying locally from a Linux distro? And how would you reset the permissions from the RN102?

         

        EDIT: I am running 2x 3 TB disks as single JBOD volumes. Network access is via AFP and SMB from Mac and Windows respectively, and the same behaviour occurs on both drives.

         

        Thanks again for your help,

  • Hey mdgm, thanks heaps for your help!
    You're completely right! It was the combination of not creating a share folder in the NAS first, and also incorrect permissions for the video file.
    I reset all ownership/permissions for the folders/files recursively using:
    chown -R admin <folder>
    chmod -R 755 <folder>

    I'll probably look at setting correct permissions that coincide with the network through the frontend, as suggested in your link.

    Thanks again for your help!
  • Another option that I have successfully used to transfer/recover data from my RN312 using a standard Windows PC is an external hard drive reader (https://www.amazon.com/gp/product/B00APP6694) together with recovery software capable of mounting BTRFS:

    http://www.reclaime.com/library/netgear-recovery.aspx

     

    Long story short, I had to factory reset my RN312. I left 1 drive in and performed the factory reset. Using the above software, I was able to then copy all of the data back onto the now factory-reset RN312 before reinserting the second drive.

     

    For anyone looking for a quick BTRFS recovery, this worked well for me.
