Disk scrubbing followed by fsck TRASHES filesystem?

dshng
Tutor

I've been on and off with tech support for 24 hours now…

I explained the first time around that my ReadyNAS 4200 was inaccessible and stuck on FS_CHECK.

The first support professional said that he had seen this before and it could take anywhere from 5 minutes to 20 hours to complete.
I asked him for more detailed information, as I had clients and staff waiting for this essential data, to which he responded,

"Whoa… don't tell your client anything."

He continued to say there was nothing he could do and then quietly said,

"Don't tell anybody I told you but just unplug it and it will stop the file system check"

Wow. So I unplugged it and started it up again but, as I suspected, it went straight back to FS_CHECK. He did say it could take as long as 20 hours, so I let it run, and run. Ten hours later I woke up a bit nervous and started checking some of the logs. fsck is VERY UNHAPPY apparently… thousands of errors about conflicting inode tables.

I talked to tech support again and was escalated to Level 3.

After putting the unit into tech support mode with the engineer, I waited about 12 hours and made a few calls for updates, with no real answer as to whether all 18 TB of my data is indeed gone.

The only information I was able to extract was the case notes he read back to me:

    * data recovery may be required
    * all the disks look fine
    * vgscan shows volume group was found
    * fsck found huge amounts of problems
    * tried to mount data volume but was unsuccessful
    * LOOKS LIKE ANOTHER CASE WHERE DISK SCRUBBING FOLLOWED BY FSCK HAS TRASHED THE FILESYSTEM



* UPDATE *

The L3 engineer confirmed to me by email that under the current firmware, if RAID scrubbing is followed by fsck, the entire filesystem can be corrupted. With that unfortunate news, I am moving forward with data recovery in an attempt to salvage the lost data.

For those of you with existing ReadyNAS enclosures… this advice he gave me is probably relevant:

"With your new volume, I recommend disabling disk scrubbing until 4.2.20 is released."
Message 1 of 4
chirpa
Luminary

Re: Disk scrubbing followed by fsck TRASHES filesystem?

Can you tell me your case #? Could you forward your email thread with the L3?

Message 2 of 4
dshng
Tutor

Re: Disk scrubbing followed by fsck TRASHES filesystem?

1843 3185
Message 3 of 4
dshng
Tutor

Re: Disk scrubbing followed by fsck TRASHES filesystem?

Actually, that was the Netgear case number, which I believe is closed.
Here are a few more:
1848 8041
1848 8069

Here are the relevant bits of the communication with the L3 tech:

This is **** from Netgear support. I looked at your case yesterday but since I am based in **** and had left the office, I had requested that one of my colleagues from your timezone follow up with you on this. This morning I can see that this did not happen and I am sorry for this.

Unfortunately, the data will not be recoverable in this case. There was an error during the disk scrubbing on April ** and the file system check that ran following this then damaged the file system.

Safeguards against this will be included in the 4.2.20 firmware scheduled for release soon.

In the meantime, you will need to reset this system to factory defaults and create a new volume.

I am sorry for any inconvenience caused. Please let me know if you have any further questions or concerns and I will do my best to address them.


Then, after a few more questions, he confirmed that there was an error with the OS configuration that caused my data loss:

> I did notice the release happened after we spoke but did not notice in the
> release notes that it resolves any issues related to raid scrubbing and fsck?
> Are we safe to move forward with this new release or should I be looking
> for a different storage vendor in the interim?

4.2.20 has indeed been released. There is a change in this release whereby it will not run a file system check with auto-fix. Instead it will be read only and will flag any detected errors.
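
For anyone who wants to see what a read-only check looks like in practice, here is a rough sketch in Python. It is not what the firmware actually runs; I am only assuming the data volume is ext-based and checked with e2fsck, where `-n` opens the filesystem read-only and answers "no" to every repair prompt, and `-f` forces a check even if the volume is marked clean. The device path is a made-up placeholder and the function names are my own. The exit-status bits are the ones listed in the fsck man page.

```python
import subprocess

# Exit-status bits as documented in the fsck(8) man page.
FSCK_EXIT_BITS = {
    1: "file system errors corrected",
    2: "system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "check canceled by user request",
    128: "shared-library error",
}


def read_only_check_command(device):
    """Build an e2fsck command line that never modifies the volume."""
    # -n: open read-only and answer "no" to all repair prompts
    # -f: check even if the filesystem is marked clean
    return ["e2fsck", "-n", "-f", device]


def describe_exit_status(status):
    """Decode the fsck exit-status bitmask into readable messages."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in FSCK_EXIT_BITS.items() if status & bit]


def check_read_only(device):
    """Run the read-only check and return (messages, raw output)."""
    result = subprocess.run(
        read_only_check_command(device),
        capture_output=True,
        text=True,
    )
    return describe_exit_status(result.returncode), result.stdout


if __name__ == "__main__":
    # Hypothetical device path; substitute the real (unmounted) data volume.
    messages, output = check_read_only("/dev/vg/data")
    print(output)
    print(messages)
```

With `-n`, a damaged volume should come back with the "errors left uncorrected" bit set instead of being rewritten, which matches the "flag any detected errors" behavior described above.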


So there you have it. Thank you for confirming that the underlying Netgear OS configuration was to blame.
Message 4 of 4