fsck has taken 6 days so far
2010-10-15
09:06 PM
Following an abrupt power-off (the power cord was unplugged), my ReadyNAS NV has been busy doing an fsck for the past 6 days.
I'm able to monitor progress by logging into the box over ssh and looking at /var/log/frontview/enclosure.log and /tmp/fsck.log. Progress was slow but steady until 3 days ago, when it got to "Pass 2: Checking directory structure". Since then it has been making about 0.1% progress each day... Right now it is at 72.8%. CPU usage has been consistently at 95-99%.
Is it normal for an fsck to take this long? Will it ever finish? Can I interrupt it and restart it later? Will it restart where it left off, or have the past 6 days been a waste if I kill the fsck process?
Message 1 of 9
2010-10-15
09:14 PM
Re: fsck has taken 6 days so far
Sounds like something is wrong. The filesystem check should take hours at most, not days.
Edit: see post below. Best to let it finish.
If not, you might wish to follow the advice here: http://www.readynas.com/forum/faq.php#How_can_I_skip_the_Volume_check%3F
Be sure to download your logs.
Welcome to the forum!
Message 2 of 9
2010-10-15
09:20 PM
Re: fsck has taken 6 days so far
I would let it finish if you can afford the wait, undesirable as that is.
Message 3 of 9
2010-10-17
09:51 AM
Re: fsck has taken 6 days so far
Prior to this last fsck I had logged in and performed a 'kill' on the fsck process because I needed to get access to my files. While I understand the file system may have had errors, I had to take the risk. It worked out, I accessed the files I needed and rebooted... with the intent of letting fsck finish.
So far it has been over a week... progress right now is at 73.4%.. that's 0.6% since Friday evening (it's now Sunday morning).
Could fsck take longer because the filesystem contains many hard links? I primarily use the device for backups, and each night a hard link is made to every file from the night before. Altogether I keep 7 daily backups and 4 weekly backups, so each file is basically hard-linked 11 times. (The exceptions are new, changed, or deleted files.)
nas1:/var/log/frontview# cat enclosure.log
temp!!1!!status=ok::descr=36.5C/97.7F::expected=0-60C/32-140F
fan!!1!!status=ok::descr=2142RPM
ups!!1!!status=not_present::descr=
volume!!1!!status=ok::descr=Volume C: RAID Level X, Redundant. 1442256 MB (51%) of 2743 GB used
disk!!1!!status=ok::descr=Channel 1: Seagate ST31000520AS 931 GB, 38C/100F
disk!!2!!status=ok::descr=Channel 2: Seagate ST31000520AS 931 GB, 42C/107F
disk!!3!!status=ok::descr=Channel 3: Seagate ST31000520AS 931 GB, 44C/111F
disk!!4!!status=ok::descr=Channel 4: Seagate ST31000520AS 931 GB, 39C/102F
model!!0!!::descr=ReadyNAS NV [X-RAID]
Boot!!FS_CHECK!!73.4%
nas1:/var/log/frontview# cat /tmp/fsck.log
***** File system check performed at Sat Oct 9 20:24:13 CST 2010 *****
fsck 1.40.2 (12-Jul-2007)
e2fsck 1.40.2 (12-Jul-2007)
c was not cleanly unmounted, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
nas1:/var/log/frontview# ps -eaf | grep fsc
root 574 435 0 Oct09 ttyS1 00:00:00 fsck -R -A -y -- -C-1
root 575 574 99 Oct09 ttyS1 7-13:21:47 fsck.ext3 -y -C-1 /dev/c/c
nas1:/var/log/frontview# ls -alt /proc/575/fd/
total 0
l-wx------ 1 root root 64 Oct 15 22:19 0 -> /dev/tty
l-wx------ 1 root root 64 Oct 15 22:19 1 -> /tmp/fsck.log
lrwx------ 1 root root 64 Oct 15 22:19 3 -> /dev/mapper/c-c
lrwx------ 1 root root 64 Oct 15 22:19 5 -> /var/cache/e2fsck/fef9ff86-773d-4c8a-84f3-c4af0c65fc21-icount-DyM5Bm
lrwx------ 1 root root 64 Oct 15 22:19 6 -> /var/cache/e2fsck/fef9ff86-773d-4c8a-84f3-c4af0c65fc21-dirinfo-2EAyps
lrwx------ 1 root root 64 Oct 15 22:19 7 -> /var/cache/e2fsck/fef9ff86-773d-4c8a-84f3-c4af0c65fc21-icount-c6M7BM
dr-x------ 2 root root 0 Oct 15 22:18 .
l-wx------ 1 root root 64 Oct 15 22:18 2 -> /tmp/fsck.log
dr-xr-xr-x 4 root root 0 Oct 9 20:24 ..
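The hard-link rotation described in this post can be sketched roughly as follows. This is only an illustration, not the poster's actual backup script: the function names `snapshot_with_hardlinks`, `link_count`, and `demo`, and the `backup.N` directory names, are made up for the example. It shows why an unchanged file kept in 11 generations ends up with 11 directory entries pointing at one inode, and fsck has to account for every one of those entries.

```python
import os
import tempfile


def snapshot_with_hardlinks(previous: str, current: str) -> None:
    """Recreate the tree under `previous` inside `current`, hard-linking every file."""
    for root, _dirs, files in os.walk(previous):
        rel = os.path.relpath(root, previous)
        target_dir = os.path.normpath(os.path.join(current, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            # os.link adds another directory entry for the same inode;
            # no file data is copied.
            os.link(os.path.join(root, name), os.path.join(target_dir, name))


def link_count(path: str) -> int:
    """Number of directory entries that point at this file's inode."""
    return os.stat(path).st_nlink


def demo(generations: int = 11) -> int:
    """Build `generations` snapshots of one unchanged file; return its link count."""
    with tempfile.TemporaryDirectory() as base:
        first = os.path.join(base, "backup.0")  # hypothetical directory name
        os.makedirs(first)
        sample = os.path.join(first, "file.txt")
        with open(sample, "w") as handle:
            handle.write("unchanged data\n")
        previous = first
        for n in range(1, generations):
            current = os.path.join(base, f"backup.{n}")
            snapshot_with_hardlinks(previous, current)
            previous = current
        return link_count(sample)


if __name__ == "__main__":
    print(demo())
```

On a filesystem that supports hard links, `demo()` reports one link per generation for a file that never changes.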
Message 4 of 9
2010-10-18
11:30 AM
Re: fsck has taken 6 days so far
It's about 28 hours later now and it has made 0.5% progress... It's at 73.9%
The e2fsck scratch files continue to grow, very slowly:
nas1:/var/cache/e2fsck# ls -alt
total 278116
-rw------- 1 root root 54706176 Oct 18 12:33 fef9ff86-773d-4c8a-84f3-c4af0c65fc21-icount-c6M7BM
-rw------- 1 root root 61538304 Oct 18 12:29 fef9ff86-773d-4c8a-84f3-c4af0c65fc21-dirinfo-2EAyps
drwxr-xr-x 2 root root 4096 Oct 12 22:17 .
drwxr-xr-x 6 root root 4096 Oct 5 2008 ..
At this rate it should finish in 2 months....
Will the next file system check be just as slow?
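The "2 months" figure can be checked with a small calculation. The sketch below assumes progress stays linear at the rate observed over the last 28 hours, which e2fsck does not guarantee; `parse_progress` and `hours_remaining` are illustrative helper names, and the line format is taken from the `Boot!!FS_CHECK!!73.4%` entry in enclosure.log shown earlier in the thread.

```python
def parse_progress(line: str) -> float:
    """Extract the percentage from a line such as 'Boot!!FS_CHECK!!73.4%'."""
    return float(line.rsplit("!!", 1)[1].rstrip("%"))


def hours_remaining(percent_done: float, percent_gained: float,
                    hours_elapsed: float) -> float:
    """Linear extrapolation: remaining percent divided by the observed rate."""
    rate = percent_gained / hours_elapsed  # percent per hour
    return (100.0 - percent_done) / rate


if __name__ == "__main__":
    # Figures from this post: 73.9% done, 0.5% gained in about 28 hours.
    hours = hours_remaining(73.9, 0.5, 28.0)
    print(f"{hours:.0f} hours, about {hours / 24:.0f} days")
```

With these figures the remaining 26.1% works out to roughly 1460 hours, or about 61 days, which matches the two-month estimate.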
Message 5 of 9
2010-10-20
08:38 PM
Re: fsck has taken 6 days so far
I had to cancel the fsck because I needed access to my files. The next time I reboot the device I'll have to ssh into it again and manually kill the fsck. I don't think I'll ever be able to wait over a week and let it finish...
Message 6 of 9
2011-03-27
03:50 AM
Re: fsck has taken 6 days so far
Welcome to my world.
I have no ReadyNAS, but I can confirm the problem with high numbers of hard links. An old file server (single-core Xeon @ 2.8 GHz) crashed its external RAID, which we used to store rsync snapshot backups. Ext4 clearly is not made for very high numbers of hard links: we are talking about one million files hard-linked in four complete backup trees, totaling three to six million files. For every single inode, fsck has to compute the dangling links, which makes 6,000,000 x 6,000,000 = 36,000,000,000,000 calculations.
When this first happened, the Xeon took nearly a month for the repair. Btw, it was the RAID controller that spilled bad data over the drive.
My solution: take the RAID or an image of the partition and put it on a faster system. My Core i7-2600K @ 4 GHz made it in six days.
Alternative approach: let the fsck run until you enter the lengthy "multiply claimed blocks" cycle. At this point your filesystem should(!!!) be mostly(!?!?!) readable as long as you mount it read-only. Stop fsck, copy everything, format the corrupted drive, be happy. But beware: hard links will not be recognized after stopping fsck, so you will copy every single file as often as it is hard-linked.
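To gauge how much extra space such a copy would need if every hard-linked path were copied as a separate file, one could compare the naive total against the size of the unique inodes. The sketch below assumes a readable, mounted tree on which `lstat` returns sane inode numbers, which may not hold on a damaged volume; `group_by_inode`, `size_estimates`, and `demo` are illustrative names, not part of any tool mentioned in this thread.

```python
import os
import tempfile
from collections import defaultdict


def group_by_inode(top: str) -> dict:
    """Map (device, inode) to every path under `top` that refers to it."""
    groups = defaultdict(list)
    for root, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            info = os.lstat(path)
            groups[(info.st_dev, info.st_ino)].append(path)
    return dict(groups)


def size_estimates(top: str) -> tuple:
    """Return (bytes if every path is copied, bytes if each inode is copied once)."""
    naive = 0
    unique = 0
    for paths in group_by_inode(top).values():
        size = os.lstat(paths[0]).st_size
        naive += size * len(paths)
        unique += size
    return naive, unique


def demo() -> tuple:
    """One 1000-byte file hard-linked into three further snapshot directories."""
    with tempfile.TemporaryDirectory() as base:
        original = os.path.join(base, "daily.0", "data.bin")  # hypothetical layout
        os.makedirs(os.path.dirname(original))
        with open(original, "wb") as handle:
            handle.write(b"x" * 1000)
        for n in range(1, 4):
            snapshot = os.path.join(base, f"daily.{n}")
            os.makedirs(snapshot)
            os.link(original, os.path.join(snapshot, "data.bin"))
        return size_estimates(base)


if __name__ == "__main__":
    naive_bytes, unique_bytes = demo()
    print(naive_bytes, unique_bytes)
```

On a healthy filesystem, a link-aware copy such as rsync with `-H` keeps the total close to the second number; a copy that ignores hard links lands at the first.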
Message 7 of 9
2011-03-27
03:57 AM
Re: fsck has taken 6 days so far
Firstly, the NV uses EXT3, not EXT4 (I presume EXT3 would probably support fewer hard links than EXT4). The NV has an Infrant SPARC CPU, which is fairly slow.
An NV would be pretty old by now. It would probably be a good idea for the OP to upgrade, e.g. to a Pro 6 (RNDP6000-200 series), and transfer his/her data across the network. As well as a dual-core 2.6 GHz CPU, it has nice features like an Online Filesystem Check and RAID Scrubbing that you can schedule, which I expect would considerably reduce the likelihood of needing a lengthy offline filesystem check.
Message 8 of 9
2012-01-05
09:10 PM
Re: fsck has taken 6 days so far
Hi guys,
As this does indeed appear to be a bug in fsck...
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=411838
...and as apparently it has been fixed, could we please get this fix included in the next RAIDiator update? I'd guess a lot of people use backup strategies that create lots of hard links, so this is a real gotcha!
Regards,
Richard.
Message 9 of 9