Forum Discussion

Aspirant

Dec 17, 2015

Solved

Volume scan failed to run properly #26188803

I have a ReadyNAS Ultra 4 with four disks in X-RAID 2. One disk had several ATA errors and I replaced the old 1.5 TB disk with a new 4 TB disk. After reboot the NAS started with sync, and after about...

Adding Disks

Volume scan failed to run properly Ultra 4

JimTho

Jan 30, 2016

Hallelujah!

I have now managed to get my data volume back up! Unfortunately, Netgear L2 tech support did not manage to, so I had to do this myself.

As promised, I will give you the results. Note that I am not a trained IT engineer, so I got this from Google searches and dedication. If you decide to do this, it is at your own risk. I take no responsibility for whether it will work on your system.

I had only 4 SATA ports on my PC, so I had to install Linux on a USB stick in order to get all 4 disks connected. I wanted to use Knoppix Linux, but I am sure most Linux distributions would do.

Get Linux installed on the computer:

I installed Knoppix Linux to a USB stick. This was not trivial, as I had a new Z170 motherboard and a regular USB boot would not work with a stick made by Universal-USB-Installer-1.9.6.3 or unetbootin-windows-613. I ended up attaching a SATA DVD drive, burning a Knoppix DVD, booting from the DVD and installing Knoppix on the USB stick.

I then removed the DVD-RW drive and attached all 4 NAS drives to the PC. I booted into Knoppix (USB) and started to see if I could access the drives. I noticed that Knoppix displayed one mounted volume and a few volumes that were not mounted (these turned out to be LVM physical volumes). I could access the files on the mounted RAID volume, which turned out to be the NAS OS.

I ran several different commands after some Google searching, being careful not to run anything that could make changes to the drives, in case I needed to perform data recovery later.

First I checked the partition tables on all four drives using gdisk:

knoppix@Microknoppix:~$ sudo gdisk /dev/sda

They were identical (output not shown, as I did this in four different windows).

Then I wanted to check the RAID setup and ran:

knoppix@Microknoppix:~$ sudo mdadm --detail --scan

ARRAY /dev/md/4 metadata=1.2 name=A021B7C18D0C:4 UUID=d6301b60:0ce2f767:558c574f:db007ccb

ARRAY /dev/md/1 metadata=1.2 name=A021B7C18D0C:1 UUID=d2791ec8:5adda84e:c7463c2e:c0f2016b

ARRAY /dev/md/0 metadata=1.2 name=A021B7C18D0C:0 UUID=a218f0a3:1b607e2e:953b087b:04ed9c99

INACTIVE-ARRAY /dev/md3 metadata=1.2 name=A021B7C18D0C:3 UUID=5aa62eb3:fa4e39b8:213486da:d587542d

ARRAY /dev/md/2 metadata=1.2 name=A021B7C18D0C:2 UUID=829ccffc:55683ba6:36bb7959:6eed3523

From this I figured out there was an inactive array md3.
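With five arrays in the listing, the inactive one is easy to overlook. A minimal sketch of a filter that prints only the inactive arrays follows; the function name `inactive_arrays` is hypothetical, and it assumes the output format shown above, where inactive arrays start with `INACTIVE-ARRAY`.

```shell
# Print the device path of every array that mdadm reports as inactive.
# Reads the text of `mdadm --detail --scan` on stdin; it only parses text
# and never touches the disks.
inactive_arrays() {
    awk '$1 == "INACTIVE-ARRAY" { print $2 }'
}
```

It could be used as `sudo mdadm --detail --scan | inactive_arrays`, which for the output above would print `/dev/md3`.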

Then I used e2fsck to check the partition:

knoppix@Microknoppix:~$ e2fsck /dev/md3

e2fsck 1.42.13 (17-May-2015)

e2fsck: Invalid argument while trying to open /dev/md3

 

The superblock could not be read or does not describe a valid ext2/ext3/ext4

filesystem. If the device is valid and it really contains an ext2/ext3/ext4

filesystem (and not swap or ufs or something else), then the superblock

is corrupt, and you might try running e2fsck with an alternate superblock:

   e2fsck -b 8193 <device>

or

   e2fsck -b 32768 <device>

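This made me think there was a problem with the superblocks on the partitions, but that turned out not to be important: md3 was still inactive at that point and, as the LVM output further down shows, it holds an LVM physical volume rather than an ext filesystem. After more searching I decided to stop the arrays and start them again: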

knoppix@Microknoppix:~$ sudo mdadm --stop --scan

knoppix@Microknoppix:~$ sudo mdadm --assemble --scan

mdadm: /dev/md/4 has been started with 2 drives (out of 3).

mdadm: restoring critical section

mdadm: /dev/md/3 has been started with 4 drives.

mdadm: /dev/md/2 has been started with 4 drives.

mdadm: /dev/md/1 has been started with 4 drives.

mdadm: /dev/md/0 has been started with 4 drives.

mdadm: Found some drive for an array that is already active: /dev/md/4

mdadm: giving up.

Then I used lvmdiskscan to see whether the volumes were visible and whether there was a problem with any of them:

knoppix@Microknoppix:~$ sudo lvmdiskscan

/run/lvm/lvmetad.socket: connect failed: No such file or directory

WARNING: Failed to connect to lvmetad. Falling back to internal scanning.

/dev/ram0 [       4.00 MiB]

/dev/md0   [       4.00 GiB]

/dev/ram1 [       4.00 MiB]

/dev/md1   [   1023.88 MiB]

/dev/ram2 [       4.00 MiB]

/dev/md2   [       4.08 TiB] LVM physical volume

/dev/ram3 [       4.00 MiB]

/dev/md3   [     931.50 GiB] LVM physical volume

/dev/ram4 [       4.00 MiB]

/dev/md4   [       3.64 TiB] LVM physical volume

/dev/ram5 [       4.00 MiB]

/dev/ram6 [       4.00 MiB]

/dev/ram7 [       4.00 MiB]

/dev/ram8 [       4.00 MiB]

/dev/ram9 [       4.00 MiB]

/dev/ram10 [       4.00 MiB]

/dev/ram11 [       4.00 MiB]

/dev/ram12 [       4.00 MiB]

/dev/ram13 [       4.00 MiB]

/dev/ram14 [       4.00 MiB]

/dev/ram15 [       4.00 MiB]

/dev/sde1 [       4.46 GiB]

/dev/sde2 [     24.82 GiB]

/dev/sdf1 [       4.46 GiB]

0 disks

21 partitions

0 LVM physical volume whole disks

3 LVM physical volumes
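The physical volumes are buried among the ram devices in that listing. As a rough sketch, assuming the lvmdiskscan output format shown above, a small filter can reduce it to just the physical volumes and their sizes. The function name `lvm_pvs` is hypothetical.

```shell
# Print "device size unit" for each LVM physical volume in lvmdiskscan output.
# Reads the listing on stdin; summary lines such as "3 LVM physical volumes"
# are skipped because they do not start with /dev/.
lvm_pvs() {
    awk '/^[ \t]*\/dev\// && /LVM physical volume[ \t]*$/ {
        gsub(/[][]/, " ")      # drop the square brackets around the size
        print $1, $2, $3
    }'
}
```

For example: `sudo lvmdiskscan 2>/dev/null | lvm_pvs`.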

There were 3 LVM physical volumes listed. I followed up with lvdisplay to see the logical volume:

knoppix@Microknoppix:~$ sudo lvdisplay

/run/lvm/lvmetad.socket: connect failed: No such file or directory

WARNING: Failed to connect to lvmetad. Falling back to internal scanning.

--- Logical volume ---

LV Path               /dev/c/c

LV Name               c

VG Name               c

LV UUID               DHaiSO-OE5j-wbTe-rW1L-Zh1L-DNFP-vbPjvA

LV Write Access       read/write

LV Creation host, time ,

LV Status             NOT available

LV Size               6.80 TiB

Current LE             111404

Segments               3

Allocation             inherit

Read ahead sectors     auto

From this I gathered that the logical volume c was not available. I followed up with lvscan:

knoppix@Microknoppix:~$ sudo lvscan

/run/lvm/lvmetad.socket: connect failed: No such file or directory

WARNING: Failed to connect to lvmetad. Falling back to internal scanning.

inactive         '/dev/c/c' [6.80 TiB] inherit

Hmm. The data volume (c) was inactive. I had already restarted the arrays with mdadm --assemble --scan, but that evidently had not activated the LVM logical volume on top of them. I searched the web further and came across this post, which solved the case: http://pissedoffadmins.com/os/mount-unknown-filesystem-type-lvm2_member.html

knoppix@Microknoppix:~$ modprobe dm-mod

knoppix@Microknoppix:~$ sudo vgchange -ay

/run/lvm/lvmetad.socket: connect failed: No such file or directory

WARNING: Failed to connect to lvmetad. Falling back to internal scanning.

1 logical volume(s) in volume group "c" now active

Voila! The volume came up and I then managed to mount it! I put all the disks back in the Netgear NAS and it booted normally. I am now transferring files to the other backup Netgear NAS as we speak. I guess this will take a bit. Also the 4th disk is now resyncing.

Sat Jan 30 17:04:37 CET 2016 System is up.

Sat Jan 30 17:04:37 CET 2016 Volume C is approaching capacity: 88% used 878G available

Sun Jan 17 12:15:59 CET 2016 System is up.

Sun Jan 17 12:15:59 CET 2016 The paths for the shares listed below could not be found. Typically, this occurs when the ReadyNAS is unable to access the data volume. Squeezeboxserver Documents Video media Photos Music

Sun Jan 17 12:15:41 CET 2016 Volume scan failed to run properly.
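The mount command itself is not shown in the transcript above. As a hedged sketch of the activation and mount sequence, the helper below only prints the commands for a given volume group, logical volume and mount point, so they can be reviewed before anything is run. The function name `print_activation_plan` and the mount point `/mnt/nas` are assumptions. The read-only mount option is a precaution, not something the original steps confirm.

```shell
# Print (do not run) the commands for activating an LVM volume group and
# mounting one of its logical volumes read-only.
# Arguments: volume group, logical volume, mount point.
print_activation_plan() {
    printf '%s\n' \
        "modprobe dm-mod" \
        "vgchange -ay $1" \
        "mkdir -p $3" \
        "mount -o ro /dev/$1/$2 $3"
}
```

For the volume in this thread that would be `print_activation_plan c c /mnt/nas`. Each printed line would then be run with sudo once it looks right.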

I hope this can be useful for others, including Netgear L2 support, which in my opinion should have been able to address this issue in the first place instead of leaving me to search the web for possible solutions. If I am able to figure this out (though I have a PhD in genetics and have been around computers for 25 years), an engineer at Netgear definitely should have been able to fix it easily. In my view this qualifies for a refund! It is also surprising that Netgear does not log its service work to provide proof/documentation of what was done.

I am happy I figured it out, and hope this can be useful for someone else in a similar situation.

StephenB

Guru - Experienced User

Dec 31, 2015

tech support mode would allow telnet access on your local LAN, but that is normally blocked by your router over the internet.

It also allows remote access by Netgear (though they need an access code that you provide them in order to do that)

JimTho

Aspirant

Dec 31, 2015

StephenB wrote:
tech support mode would allow telnet access on your local LAN, but that is normally blocked by your router over the internet.

It also allows remote access by Netgear (though they need an access code that you provide them in order to do that)

OK, but I am sure someone can exploit this, and that is why I am reluctant to keep it in this mode for too long.

I have not yet been asked to give them a code, so I guess they will not access it this year ...

JimTho
Aspirant
Jan 14, 2016
I promised to keep this post updated, but I am sorry to say that I do not have much more to report yet.

Tech Support has not asked for an "access code" but has accessed the NAS without one.

The NAS still seems to be syncing after 9 days; at least that is the last response I got from Tech Support.

The sync seems to be a very slow process, but I will keep this post updated when I get more information. Tech Support is not sure yet whether the data is OK; I guess they will have to wait for the sync to finish. I am puzzled that the sync can take this long, but I cannot monitor the status myself, as the NAS is still in "debug mode" and has been for the last 15-16 days.
- JimTho
  Aspirant
  Jan 17, 2016
  UPDATE!
  I got feedback from Tech Support on January 15:
  
  Tech Support wrote, January 15, 2016:
  Hello Jim,
  
  I spoke to my colleague who was working on the NAS.
  He explains the issue seems to either be caused at the time the one disk failed, or during the sync when the larger disk was used as a replacement.
  It may have tried to reshape the RAID and this has caused the issue. At present it is not something we can fix at level 2.
  
  To escalate this issue to level 3 you would need to purchase a data recovery contract.
  The contract is around 155euro.
  
  But it is not clear if the data can be recovered. It is hard to estimate but my colleague who looked at the RAID said he thinks there is about a 50/50% chance.
  Please understand its very hard to estimate, even with the data recovery and level 3 involvement there is a chance no data can be recovered.
  Due to the nature of disks, raid and storage in general it is not something to which a guarantee can be given.
  
  Let me know if you are interested in the data recovery contract or if you have any questions.
  
  The NAS is still missing the volume. How could L2 tech support start a sync process if the volume was not fixed?
  
  I connected the drives back to my computer and ran the recovery software today. ALL FILES are GONE! The drives cannot be assembled in ReclaiMe. I have also tried "NAS data recovery" from Runtime, to no avail. I am so utterly upset now that I can hardly sit still. Almost 7 TB of data seems to be gone after Netgear Support L2 initiated ("forced"?) a sync on the NAS. I should have recovered the data with the recovery software while I had the chance, but I put my trust in tech support. Now tech support suggests paying 155 euro for L3 to have a look at it.
  
  I have requested that all log files be handed over to me, as agreed with L2 support when I accepted the terms, including all SSH commands/scripts used during the 15+ days the NAS was in "debug mode". Maybe there is something that can be fixed once I get this data.
  
  Anyone with suggestions of what to do next?
  - JimTho
    Aspirant
    Jan 17, 2016
    Just got hold of some of the log files from the NAS. Interestingly it seems that the raid.conf file content has changed after L2 involvement:
    
    Old file content raid.conf:
    /dev/md0,root!!number=0,chan=0,dev=/dev/sda1,model=WDC WD40EFRX-68WT0N0,sectors=7814037168,raid_disk=0!!number=1,chan=1,dev=/dev/sdb1,model=WDC WD40EFRX-68WT0N0,sectors=7814037168,raid_disk=1!!number=2,chan=2,dev=/dev/sdc1,model=WDC WD20EARS-00MVWB0,sectors=3907027054,raid_disk=2
    
    /dev/md1,swap!!number=0,chan=0,dev=/dev/sda2,model=WDC WD40EFRX-68WT0N0,sectors=7814037168,raid_disk=0!!number=1,chan=1,dev=/dev/sdb2,model=WDC WD40EFRX-68WT0N0,sectors=7814037168,raid_disk=1!!number=2,chan=2,dev=/dev/sdc2,model=WDC WD20EARS-00MVWB0,sectors=3907027054,raid_disk=2!!number=4,chan=3,dev=/dev/sdd2,model=Seagate ST4000VN000-1H4168,sectors=7814037168,raid_disk=3
    
    /dev/md2,C!!number=0,chan=0,dev=/dev/sda3,model=WDC WD40EFRX-68WT0N0,sectors=7814037168,raid_disk=0!!number=1,chan=1,dev=/dev/sdb3,model=WDC WD40EFRX-68WT0N0,sectors=7814037168,raid_disk=1!!number=2,chan=2,dev=/dev/sdc3,model=WDC WD20EARS-00MVWB0,sectors=3907027054,raid_disk=2!!number=3,chan=3,dev=/dev/sdd3,model=Seagate ST31500341AS,sectors=2930275054,raid_disk=3
    
    /dev/md4,swap!!number=0,chan=0,dev=/dev/sda5,model=WDC WD40EFRX-68WT0N0,sectors=7814037168,raid_disk=0!!number=1,chan=1,dev=/dev/sdb5,model=WDC WD40EFRX-68WT0N0,sectors=7814037168,raid_disk=1!!number=2,chan=3,dev=/dev/sdd5,model=Seagate ST4000VN000-1H4168,sectors=7814037168,raid_disk=2
    
    New file content raid.conf:
    /dev/md0,root!!number=0,chan=0,dev=/dev/sda1,model=WDC WD40EFRX-68WT0N0,sectors=7814037168,raid_disk=0!!number=1,chan=1,dev=/dev/sdb1,model=WDC WD40EFRX-68WT0N0,sectors=7814037168,raid_disk=1!!number=2,chan=2,dev=/dev/sdc1,model=WDC WD20EARS-00MVWB0,sectors=3907027054,raid_disk=2!!number=4,chan=3,dev=/dev/sdd1,model=Seagate ST4000VN000-1H4168,sectors=7814037168,raid_disk=3
    
    /dev/md1,swap!!number=0,chan=0,dev=/dev/sda2,model=WDC WD40EFRX-68WT0N0,sectors=7814037168,raid_disk=0!!number=1,chan=1,dev=/dev/sdb2,model=WDC WD40EFRX-68WT0N0,sectors=7814037168,raid_disk=1!!number=2,chan=2,dev=/dev/sdc2,model=WDC WD20EARS-00MVWB0,sectors=3907027054,raid_disk=2!!number=4,chan=3,dev=/dev/sdd2,model=Seagate ST4000VN000-1H4168,sectors=7814037168,raid_disk=3
    
    /dev/md2,swap!!number=0,chan=0,dev=/dev/sda3,model=WDC WD40EFRX-68WT0N0,sectors=7814037168,raid_disk=0!!number=1,chan=1,dev=/dev/sdb3,model=WDC WD40EFRX-68WT0N0,sectors=7814037168,raid_disk=1!!number=2,chan=2,dev=/dev/sdc3,model=WDC WD20EARS-00MVWB0,sectors=3907027054,raid_disk=2!!number=4,chan=3,dev=/dev/sdd3,model=Seagate ST4000VN000-1H4168,sectors=7814037168,raid_disk=3
    
    /dev/md4,swap!!number=0,chan=0,dev=/dev/sda5,model=WDC WD40EFRX-68WT0N0,sectors=7814037168,raid_disk=0!!number=1,chan=1,dev=/dev/sdb5,model=WDC WD40EFRX-68WT0N0,sectors=7814037168,raid_disk=1!!number=2,chan=3,dev=/dev/sdd5,model=Seagate ST4000VN000-1H4168,sectors=7814037168,raid_disk=2
    
    I noticed that in the new raid.conf there is no /dev/md*,C entry: /dev/md2 is now labelled swap instead of C.
    Should it be present? If so, why is it no longer on sda3?
    
    I also noticed that the model name referenced at the end of some lines in the new config file is wrong. I am not sure whether this matters, though.
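Comparing the two raid.conf versions by eye is error-prone because each array sits on one very long line. A minimal sketch, assuming the format seen in the lines quoted above (entries separated by `!!`, comma-separated `key=value` pairs inside each entry), flattens a file into one row per array member so that two versions can be compared with diff. The function name `raidconf_members` is hypothetical.

```shell
# Print one tab-separated row per array member: array, label, device, model.
# Reads raid.conf text on stdin; lines without "!!" separators are ignored.
raidconf_members() {
    awk -F'!!' '
    NF >= 2 {
        split($1, head, ",")               # head[1] = /dev/mdN, head[2] = label
        sub(/^[ \t]+/, "", head[1])        # tolerate indented lines
        for (i = 2; i <= NF; i++) {
            dev = ""; model = ""
            n = split($i, kv, ",")
            for (j = 1; j <= n; j++) {
                if (kv[j] ~ /^dev=/)   dev   = substr(kv[j], 5)
                if (kv[j] ~ /^model=/) model = substr(kv[j], 7)
            }
            print head[1] "\t" head[2] "\t" dev "\t" model
        }
    }'
}
```

With the two versions saved under hypothetical names, `raidconf_members < raid.conf.old > old.tsv` and `raidconf_members < raid.conf.new > new.tsv` followed by `diff old.tsv new.tsv` would list the rows that changed, such as the label on /dev/md2 going from C to swap.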
