

ReadyNasMan123
Oct 16, 2010

fsck has taken 6 days so far

Following an abrupt power-off (the power cord was unplugged), my ReadyNAS NV has been busy doing an fsck for the past 6 days.
I'm able to monitor progress by SSHing into the box and looking at /var/log/frontview/enclosure.log and /tmp/fsck.log. Progress was slow but steady until 3 days ago, when it got to "Pass 2: Checking directory structure". Since then it has been making 0.1% progress each day. Right now it is at 72.8%. CPU usage has been consistently at 95-99%.

Is it normal for an fsck to take this long? Will it ever finish? Can I interrupt it and restart it later? Will it restart where it left off, or have the past 6 days been a waste if I kill the fsck process?

8 Replies

  • I would let it finish if your time constraints allow it, however undesirable the wait is.
  • Prior to this last fsck I had logged in and performed a 'kill' on the fsck process because I needed to get access to my files. While I understand the file system may have had errors, I had to take the risk. It worked out, I accessed the files I needed and rebooted... with the intent of letting fsck finish.

    So far it has been over a week. Progress right now is at 73.4%; that's 0.6% since Friday evening (it's now Sunday morning).

    Could fsck take longer because the filesystem contains many hard links? I primarily use the device for backups, and each night a hard link is made to every file from the night before. Altogether I keep 7 daily backups and 4 weekly backups, so each file is basically hardlinked 11 times. (The exceptions are new, changed or deleted files.)

    nas1:/var/log/frontview# cat enclosure.log
    temp!!1!!status=ok::descr=36.5C/97.7F::expected=0-60C/32-140F
    fan!!1!!status=ok::descr=2142RPM
    ups!!1!!status=not_present::descr=
    volume!!1!!status=ok::descr=Volume C: RAID Level X, Redundant. 1442256 MB (51%) of 2743 GB used
    disk!!1!!status=ok::descr=Channel 1: Seagate ST31000520AS 931 GB, 38C/100F
    disk!!2!!status=ok::descr=Channel 2: Seagate ST31000520AS 931 GB, 42C/107F
    disk!!3!!status=ok::descr=Channel 3: Seagate ST31000520AS 931 GB, 44C/111F
    disk!!4!!status=ok::descr=Channel 4: Seagate ST31000520AS 931 GB, 39C/102F
    model!!0!!::descr=ReadyNAS NV [X-RAID]
    Boot!!FS_CHECK!!73.4%

    nas1:/var/log/frontview# cat /tmp/fsck.log
    ***** File system check performed at Sat Oct 9 20:24:13 CST 2010 *****
    fsck 1.40.2 (12-Jul-2007)
    e2fsck 1.40.2 (12-Jul-2007)
    c was not cleanly unmounted, check forced.
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure

    nas1:/var/log/frontview# ps -eaf | grep fsc
    root 574 435 0 Oct09 ttyS1 00:00:00 fsck -R -A -y -- -C-1
    root 575 574 99 Oct09 ttyS1 7-13:21:47 fsck.ext3 -y -C-1 /dev/c/c

    nas1:/var/log/frontview# ls -alt /proc/575/fd/
    total 0
    l-wx------ 1 root root 64 Oct 15 22:19 0 -> /dev/tty
    l-wx------ 1 root root 64 Oct 15 22:19 1 -> /tmp/fsck.log
    lrwx------ 1 root root 64 Oct 15 22:19 3 -> /dev/mapper/c-c
    lrwx------ 1 root root 64 Oct 15 22:19 5 -> /var/cache/e2fsck/fef9ff86-773d-4c8a-84f3-c4af0c65fc21-icount-DyM5Bm
    lrwx------ 1 root root 64 Oct 15 22:19 6 -> /var/cache/e2fsck/fef9ff86-773d-4c8a-84f3-c4af0c65fc21-dirinfo-2EAyps
    lrwx------ 1 root root 64 Oct 15 22:19 7 -> /var/cache/e2fsck/fef9ff86-773d-4c8a-84f3-c4af0c65fc21-icount-c6M7BM
    dr-x------ 2 root root 0 Oct 15 22:18 .
    l-wx------ 1 root root 64 Oct 15 22:18 2 -> /tmp/fsck.log
    dr-xr-xr-x 4 root root 0 Oct 9 20:24 ..
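The enclosure.log output above follows a regular `key!!index!!field=value::field=value` layout, so the FS_CHECK percentage can be pulled out with a short script instead of being read by eye. The following is only a sketch: the layout is inferred from the lines pasted in this thread, not from any NETGEAR documentation, and the function names `parse_enclosure_log` and `fsck_progress` are made up for illustration.

```python
# Sample lines copied from the enclosure.log output in this thread.
SAMPLE = """\
temp!!1!!status=ok::descr=36.5C/97.7F::expected=0-60C/32-140F
fan!!1!!status=ok::descr=2142RPM
ups!!1!!status=not_present::descr=
model!!0!!::descr=ReadyNAS NV [X-RAID]
Boot!!FS_CHECK!!73.4%
"""


def parse_enclosure_log(text):
    """Split each line into (key, index, fields).

    Assumed layout (inferred from the sample only):
    key!!index!!name=value::name=value
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        parts = line.split("!!", 2)
        if len(parts) != 3:
            continue  # blank or unexpected line, skip it
        key, index, rest = parts
        fields = {}
        for chunk in rest.split("::"):
            if "=" in chunk:
                name, value = chunk.split("=", 1)
                fields[name] = value
            elif chunk:
                fields["value"] = chunk  # bare payload such as "73.4%"
        entries.append((key, index, fields))
    return entries


def fsck_progress(text):
    """Return the FS_CHECK percentage as a float, or None if not present."""
    for key, index, fields in parse_enclosure_log(text):
        if key == "Boot" and index == "FS_CHECK":
            value = fields.get("value")
            if value:
                return float(value.rstrip("%"))
    return None


if __name__ == "__main__":
    print(fsck_progress(SAMPLE))
```

Run periodically (for example from a cron job on another machine over ssh), this would give a timestamped series of percentages rather than a handful of manual readings.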
  • It's about 28 hours later now and it has made 0.5% progress... It's at 73.9%

    The e2fsck scratch files continue to grow, very slowly:
    nas1:/var/cache/e2fsck# ls -alt
    total 278116
    -rw------- 1 root root 54706176 Oct 18 12:33 fef9ff86-773d-4c8a-84f3-c4af0c65fc21-icount-c6M7BM
    -rw------- 1 root root 61538304 Oct 18 12:29 fef9ff86-773d-4c8a-84f3-c4af0c65fc21-dirinfo-2EAyps
    drwxr-xr-x 2 root root 4096 Oct 12 22:17 .
    drwxr-xr-x 6 root root 4096 Oct 5 2008 ..

    At this rate it should finish in 2 months....

    Will the next file system check be just as slow?
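The "2 months" figure above can be checked with a straight-line extrapolation from the numbers in this reply (73.9% done, 0.5% gained in 28 hours). This is a rough sketch that assumes progress stays linear, which e2fsck does not guarantee since later passes may run at a very different speed; `remaining_days` is a hypothetical helper name.

```python
def remaining_days(current_pct, delta_pct, delta_hours):
    """Linear estimate of days left, given recent progress.

    current_pct -- percentage completed so far
    delta_pct   -- percentage points gained during the observation window
    delta_hours -- length of the observation window in hours
    """
    if delta_pct <= 0:
        raise ValueError("no forward progress observed")
    hours_left = (100.0 - current_pct) / delta_pct * delta_hours
    return hours_left / 24.0


if __name__ == "__main__":
    # Figures from this reply: 73.9% done, 0.5% gained in 28 hours.
    days = remaining_days(73.9, 0.5, 28)
    print(f"roughly {days:.0f} days remaining")
```

With those inputs the estimate comes out at about 61 days, which matches the two-month guess.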
  • I had to cancel the fsck because I needed access to my files. The next time I reboot the device I'll have to ssh into it again and manually kill the fsck. I don't think I'll ever be able to wait over a week and let it finish.
  • Welcome to my world.

    I don't have a ReadyNAS, but I can confirm the problem with high numbers of hard links. An old file server (single-core Xeon @ 2.8 GHz) crashed its external RAID, which we used to store rsync snapshot backups. Ext4 clearly is not made for very high numbers of hard links; we are talking about one million files hardlinked in four complete backup trees, totaling three to six million files. For every single inode fsck has to compute the dangling links, which makes 6,000,000 x 6,000,000 = 36,000,000,000,000 calculations.

    The first time this happened, the Xeon took nearly a month for the repair. By the way, it was the RAID controller that spilled bad data over the drive.

    My solution: take the RAID or an image of the partition and put it on a faster system. My Core i7-2600K @ 4 GHz made it in six days.

    Alternative approach: let the fsck run until you enter the lengthy "multiply claimed blocks" cycle. At this point your filesystem should(!!!) be mostly(!?!?!) readable as long as you mount it read-only. Stop fsck, copy everything, format the corrupted drive, be happy. But beware: hard links will not be recognized after stopping fsck, so you will copy every single file as often as it is hardlinked.
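To get a feel for how much extra space a copy that ignores hard links would need, one can compare the naive total (every path counted) with the total over unique inodes. The sketch below is illustrative only: `tree_sizes` and `demo` are hypothetical names, it assumes a POSIX-style filesystem where `st_dev` and `st_ino` identify a file, and it says nothing about what a half-checked filesystem would actually report.

```python
import os
import stat
import tempfile


def tree_sizes(root):
    """Return (naive_bytes, unique_bytes) for regular files under root.

    naive_bytes  counts every path, as a copy that ignores hard links would.
    unique_bytes counts each (device, inode) pair only once.
    """
    naive = 0
    unique = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, and so on
            naive += st.st_size
            ident = (st.st_dev, st.st_ino)
            if ident not in seen:
                seen.add(ident)
                unique += st.st_size
    return naive, unique


def demo():
    """Build a tiny snapshot-style tree: one 1000-byte file, linked three times."""
    with tempfile.TemporaryDirectory() as root:
        first = os.path.join(root, "daily.0")
        os.mkdir(first)
        original = os.path.join(first, "data.bin")
        with open(original, "wb") as handle:
            handle.write(b"x" * 1000)
        for snapshot in ("daily.1", "daily.2"):
            os.mkdir(os.path.join(root, snapshot))
            os.link(original, os.path.join(root, snapshot, "data.bin"))
        return tree_sizes(root)


if __name__ == "__main__":
    naive, unique = demo()
    print(f"naive copy: {naive} bytes, unique data: {unique} bytes")
```

In the demo one file linked into three snapshot directories triples the naive total, which is the effect described above scaled down to a single file.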
  • mdgm-ntgr
    NETGEAR Employee Retired
    Firstly, the NV uses EXT3, not EXT4 (I presume EXT3 would probably support fewer hard links than EXT4). The NV has an Infrant SPARC CPU, which is fairly slow.

    An NV would be pretty old by now. It would probably be a good idea for the OP to upgrade, e.g. to a Pro 6 (RNDP6000-200 series), and transfer his/her data across the network. As well as a dual-core 2.6 GHz CPU, it has nice features like an Online Filesystem Check and RAID Scrubbing that you can schedule, which I expect would considerably reduce the likelihood of needing a lengthy offline filesystem check.
  • Hi guys,

    As this does indeed appear to be a bug in fsck...

    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=411838

    ...and as apparently it has been fixed, could we please get this fix included in the next RAIDiator update? I'd guess a lot of people use backup strategies that create lots of hard links, so this is a real gotcha!

    Regards,

    Richard.
