drawding
Aspirant
Sep 07, 2014

NV+ hanging up every 7 days

I have a strange issue that has been happening for the last three weeks. It seems that every Sunday morning my NV+ hangs up. I'm usually not able to control it via the web management or ssh. I can't even get a reaction to the front buttons. So I end up unplugging it and plugging it back in.

When it boots up I log in via ssh, kill the quota check (it takes hours), and then reboot from the CLI. When it comes back up again I'm back in business. An fsck usually does not find any issues. I can't for the life of me figure out why it's doing this. While writing this I looked at the health and logs in the web management and there was nothing. So I looked at /var/log/messages and found that there were no entries from the last reboot until 6:47 this morning, when I see this:

Sep 7 06:47:01 TWC-NAS kernel: X_RAID_DUMP
Sep 7 06:47:01 TWC-NAS kernel:
Sep 7 06:47:01 TWC-NAS kernel: VERSION/ID : SB=(V:0.1.0) ID=<0ea55712.00000000.00000000.00000000> CT:4de5751a
Sep 7 06:47:01 TWC-NAS kernel: RAID_INFO : DISKS(TOTAL:4 RAID:4 PARITY:1 ONL:4 WRK:4 FAILED:0 SPARE:0 BASE:0)
Sep 7 06:47:01 TWC-NAS kernel: SZ:1953108616 UT:00000000 STATE:0 LUNS:2 EXTCMD:1 LSZ:1953108614
Sep 7 06:47:01 TWC-NAS kernel: LOGICAL_DRIVE : 0: B:0000000002 E:0004096000 R:1 O:1 I:1:000000000 DM:f
Sep 7 06:47:01 TWC-NAS kernel: LOGICAL_DRIVE : 1: B:0004096002 E:1949012614 R:4 O:1 I:1:640684034 DM:f
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 0: DISK<N:0/1,hdc(22,0),ID:0,PT:1,SZ:1953108616,ST: B:online>
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 1: DISK<N:1/2,hde(33,0),ID:1,PT:1,SZ:1953108616,ST:P :online>
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 2: DISK<N:2/3,hdg(34,0),ID:2,PT:1,SZ:1953108616,ST: :online>
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 3: DISK<N:3/4,hdi(56,0),ID:3,PT:1,SZ:1953108616,ST: :online>
Sep 7 06:47:01 TWC-NAS kernel: CURRENT_DRIVE : DISK<N:0/1,XXX(22,0),ID:0,PT:1,SZ:1953108616,ST: B:online>
Sep 7 06:47:01 TWC-NAS kernel:
Sep 7 06:47:01 TWC-NAS kernel: VERSION/ID : SB=(V:0.1.0) ID=<0ea55712.00000000.00000000.00000000> CT:4de5751a
Sep 7 06:47:01 TWC-NAS kernel: RAID_INFO : DISKS(TOTAL:4 RAID:4 PARITY:1 ONL:4 WRK:4 FAILED:0 SPARE:0 BASE:0)
Sep 7 06:47:01 TWC-NAS kernel: SZ:1953108616 UT:00000000 STATE:0 LUNS:2 EXTCMD:1 LSZ:1953108614
Sep 7 06:47:01 TWC-NAS kernel: LOGICAL_DRIVE : 0: B:0000000002 E:0004096000 R:1 O:1 I:1:000000000 DM:f
Sep 7 06:47:01 TWC-NAS kernel: LOGICAL_DRIVE : 1: B:0004096002 E:1949012614 R:4 O:1 I:1:640684034 DM:f
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 0: DISK<N:0/1,hdc(22,0),ID:0,PT:1,SZ:1953108616,ST: B:online>
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 1: DISK<N:1/2,hde(33,0),ID:1,PT:1,SZ:1953108616,ST:P :online>
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 2: DISK<N:2/3,hdg(34,0),ID:2,PT:1,SZ:1953108616,ST: :online>
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 3: DISK<N:3/4,hdi(56,0),ID:3,PT:1,SZ:1953108616,ST: :online>
Sep 7 06:47:01 TWC-NAS kernel: CURRENT_DRIVE : DISK<N:1/2,XXX(33,0),ID:1,PT:1,SZ:1953108616,ST:P :online>
Sep 7 06:47:01 TWC-NAS kernel:
Sep 7 06:47:01 TWC-NAS kernel: VERSION/ID : SB=(V:0.1.0) ID=<0ea55712.00000000.00000000.00000000> CT:4de5751a
Sep 7 06:47:01 TWC-NAS kernel: RAID_INFO : DISKS(TOTAL:4 RAID:4 PARITY:1 ONL:4 WRK:4 FAILED:0 SPARE:0 BASE:0)
Sep 7 06:47:01 TWC-NAS kernel: SZ:1953108616 UT:00000000 STATE:0 LUNS:2 EXTCMD:1 LSZ:1953108614
Sep 7 06:47:01 TWC-NAS kernel: LOGICAL_DRIVE : 0: B:0000000002 E:0004096000 R:1 O:1 I:1:000000000 DM:f
Sep 7 06:47:01 TWC-NAS kernel: LOGICAL_DRIVE : 1: B:0004096002 E:1949012614 R:4 O:1 I:1:640684034 DM:f
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 0: DISK<N:0/1,hdc(22,0),ID:0,PT:1,SZ:1953108616,ST: B:online>
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 1: DISK<N:1/2,hde(33,0),ID:1,PT:1,SZ:1953108616,ST:P :online>
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 2: DISK<N:2/3,hdg(34,0),ID:2,PT:1,SZ:1953108616,ST: :online>
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 3: DISK<N:3/4,hdi(56,0),ID:3,PT:1,SZ:1953108616,ST: :online>
Sep 7 06:47:01 TWC-NAS kernel: CURRENT_DRIVE : DISK<N:2/3,XXX(34,0),ID:2,PT:1,SZ:1953108616,ST: :online>
Sep 7 06:47:01 TWC-NAS kernel:
Sep 7 06:47:01 TWC-NAS kernel: VERSION/ID : SB=(V:0.1.0) ID=<0ea55712.00000000.00000000.00000000> CT:4de5751a
Sep 7 06:47:01 TWC-NAS kernel: RAID_INFO : DISKS(TOTAL:4 RAID:4 PARITY:1 ONL:4 WRK:4 FAILED:0 SPARE:0 BASE:0)
Sep 7 06:47:01 TWC-NAS kernel: SZ:1953108616 UT:00000000 STATE:0 LUNS:2 EXTCMD:1 LSZ:1953108614
Sep 7 06:47:01 TWC-NAS kernel: LOGICAL_DRIVE : 0: B:0000000002 E:0004096000 R:1 O:1 I:1:000000000 DM:f
Sep 7 06:47:01 TWC-NAS kernel: LOGICAL_DRIVE : 1: B:0004096002 E:1949012614 R:4 O:1 I:1:640684034 DM:f
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 0: DISK<N:0/1,hdc(22,0),ID:0,PT:1,SZ:1953108616,ST: B:online>
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 1: DISK<N:1/2,hde(33,0),ID:1,PT:1,SZ:1953108616,ST:P :online>
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 2: DISK<N:2/3,hdg(34,0),ID:2,PT:1,SZ:1953108616,ST: :online>
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 3: DISK<N:3/4,hdi(56,0),ID:3,PT:1,SZ:1953108616,ST: :online>
Sep 7 06:47:01 TWC-NAS kernel: CURRENT_DRIVE : DISK<N:3/4,XXX(56,0),ID:3,PT:1,SZ:1953108616,ST: :online>
Sep 7 06:47:01 TWC-NAS kernel:
Sep 7 06:47:01 TWC-NAS kernel: VERSION/ID : SB=(V:0.1.0) ID=<0ea55712.00000000.00000000.00000000> CT:4de5751a
Sep 7 06:47:01 TWC-NAS kernel: RAID_INFO : DISKS(TOTAL:4 RAID:4 PARITY:1 ONL:4 WRK:4 FAILED:0 SPARE:0 BASE:0)
Sep 7 06:47:01 TWC-NAS kernel: SZ:1953108616 UT:00000000 STATE:0 LUNS:2 EXTCMD:1 LSZ:1953108614
Sep 7 06:47:01 TWC-NAS kernel: LOGICAL_DRIVE : 0: B:0000000002 E:0004096000 R:1 O:1 I:1:000000000 DM:f
Sep 7 06:47:01 TWC-NAS kernel: LOGICAL_DRIVE : 1: B:0004096002 E:1949012614 R:4 O:1 I:1:640684034 DM:f
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 0: DISK<N:0/1,hdc(22,0),ID:0,PT:1,SZ:1953108616,ST: B:online>
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 1: DISK<N:1/2,hde(33,0),ID:1,PT:1,SZ:1953108616,ST:P :online>
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 2: DISK<N:2/3,hdg(34,0),ID:2,PT:1,SZ:1953108616,ST: :online>
Sep 7 06:47:01 TWC-NAS kernel: PHYSICAL_DRIVE: 3: DISK<N:3/4,hdi(56,0),ID:3,PT:1,SZ:1953108616,ST: :online>
Sep 7 06:47:01 TWC-NAS kernel: CURRENT_DRIVE : DISK<N:0/1,XXX(22,0),ID:0,PT:1,SZ:1953108616,ST: B:online>
Sep 7 06:47:01 TWC-NAS kernel:
Sep 7 06:47:01 TWC-NAS kernel: RUN_PARAMETERS: raid_running=1,last_word=ok,interface_start_at=1,fake=0
Sep 7 06:47:01 TWC-NAS kernel: RAID_REBUILD : sync=0,logical=0,parity=1,sectors/TOTAL=0/4294967295
Sep 7 06:47:01 TWC-NAS kernel: : source=f, total_drives=4, auto_sync=1
Sep 7 06:47:01 TWC-NAS kernel: RAID_P_CHECK : chck=0,logical/total=0/2,raid_level=0
Sep 7 06:47:01 TWC-NAS kernel: : err/sectors/TOTAL=0/0/0,report_err=1
Sep 7 06:47:01 TWC-NAS kernel: : initialized=0xf,initialize_error=0x0,initializing=0x0
Sep 7 06:47:01 TWC-NAS kernel: : where=0,total=0
Sep 7 06:47:01 TWC-NAS kernel: SIZE_INFOR : sb_size=9440,sections_size=32/256,disk_t_size=128
Sep 7 06:47:01 TWC-NAS kernel: : sb=f8134294,disks=512/1536,luns=2048/3168,thisdisk=5216/128,diskid=5344/4096
Sep 7 06:47:01 TWC-NAS kernel: DJO_RECORD : dj_raid=NO_RAID,chns=0,source=0,disks=0 parity=0,chn_image=f,
Sep 7 06:47:01 TWC-NAS kernel: : sectors=3955906/0x3c5cc2,need_IO=0
Sep 7 06:47:01 TWC-NAS kernel: IO__RECORD : 0=803604248,1=320464,2=187170098,3=3719570, busy=0/0/0/0/0,t_d=f

Then nothing until I forced the restart around 9 AM when I noticed the problem. Anyone ever seen anything like this? Any pointers as to what may be causing the problem? This device has been running for a few years now without issue or problem. Nothing has really changed that I can think of that would be causing this to happen.

Thanks!

28 Replies

  • Emailed cron logs as requested. I really can't take the box down again today. I'm in the process of migrating some mail data off the NAS and once that is done I will be able to take it down whenever needed. Hopefully will be able to get to it tonight.

    I replaced a drive that was in far worse shape than the one with the 55 reallocated sectors. I don't have another spare 1 TB or larger drive at the moment but it's on my list. BTW, when I put in the new 2 TB drive to replace the failing one, is there a way to make use of the additional space? It looks as if it's only using about half of the available space.
  • StephenB
    Guru - Experienced User
    You'd need to replace all 4 drives with 2 TB models to get expansion on the v1 platforms. On newer platforms, you'd need to have 2 disks of the largest size in order to use all the space (with single redundancy xraid2).

    I'd definitely replace a drive with 55 reallocated sectors, even if that is not the cause of the hangup.
  • I tried to log into the web interface tonight and it was pretty unresponsive. So I ssh'd in and ran top. I was very surprised to see such a high load average. Any idea what could be causing it? I wonder if it's related to, or even the cause of, the other issues.

    3:41 up 1 day, 12:33,  1 user,  load average: 15.34, 14.68, 14.72
    Tasks: 77 total, 1 running, 76 sleeping, 0 stopped, 0 zombie
    Cpu(s): 6.1% us, 17.7% sy, 0.0% ni, 65.0% id, 1.3% wa, 9.6% hi, 0.3% si
    Mem: 226352k total, 221920k used, 4432k free, 62080k buffers
    Swap: 767904k total, 0k used, 767904k free, 127152k cached

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    43 root 15 0 0 0 0 S 2.2 0.0 1:59.11 kswapd0
    13843 root 16 0 2992 1680 1328 R 2.2 0.7 0:21.05 top
    1103 root 15 0 0 0 0 D 1.6 0.0 4:53.40 nfsd
    1110 root 15 0 0 0 0 D 1.6 0.0 5:09.38 nfsd
    1112 root 15 0 0 0 0 D 1.6 0.0 4:59.62 nfsd
    1105 root 15 0 0 0 0 D 1.3 0.0 4:56.24 nfsd
    1115 root 15 0 0 0 0 D 1.3 0.0 5:01.40 nfsd
    1107 root 15 0 0 0 0 D 0.9 0.0 5:01.75 nfsd
    16322 root 16 0 3120 1024 832 S 0.9 0.5 0:00.03 sleep
    914 root 10 -5 0 0 0 S 0.6 0.0 3:53.88 kjournald
    1100 root 15 0 0 0 0 D 0.6 0.0 4:56.54 nfsd
    1106 root 15 0 0 0 0 D 0.6 0.0 5:04.40 nfsd
    1108 root 15 0 0 0 0 D 0.6 0.0 4:53.66 nfsd
    1109 root 15 0 0 0 0 D 0.6 0.0 4:53.10 nfsd
    1111 root 15 0 0 0 0 D 0.6 0.0 4:51.23 nfsd
    1113 root 15 0 0 0 0 D 0.6 0.0 4:58.19 nfsd
    1101 root 15 0 0 0 0 D 0.3 0.0 5:04.35 nfsd
    1102 root 15 0 0 0 0 D 0.3 0.0 5:00.96 nfsd
    1104 root 15 0 0 0 0 D 0.3 0.0 5:03.27 nfsd
    1114 root 15 0 0 0 0 D 0.3 0.0 4:57.31 nfsd
    13778 root 15 0 13840 4592 3584 S 0.3 2.0 0:03.78 sshd
    1 root 15 0 2000 880 768 S 0.0 0.4 0:05.35 init
    2 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
    3 root 10 -5 0 0 0 S 0.0 0.0 0:00.02 events/0
    4 root 10 -5 0 0 0 S 0.0 0.0 0:00.02 khelper
    5 root 10 -5 0 0 0 S 0.0 0.0 0:00.01 kthread
    10 root 10 -5 0 0 0 S 0.0 0.0 0:02.50 kblockd/0
    13 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 khubd
    41 root 15 0 0 0 0 S 0.0 0.0 0:40.08 pdflush
    42 root 15 0 0 0 0 S 0.0 0.0 0:11.72 pdflush
    44 root 20 -5 0 0 0 S 0.0 0.0 0:00.00 aio/0
    45 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 cifsoplockd
    46 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 cifsdnotifyd
    92 root 20 -5 0 0 0 S 0.0 0.0 0:00.00 kvblade
    93 root 15 0 0 0 0 S 0.0 0.0 0:00.02 mtdblockd
    106 root 25 0 0 0 0 S 0.0 0.0 0:00.00 hotplug-sata
    128 root 15 0 0 0 0 S 0.0 0.0 0:00.02 djsyncd
    129 root 20 0 0 0 0 S 0.0 0.0 0:00.00 djcheckd
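
On Linux the load average counts tasks in uninterruptible sleep (state `D` in the `S` column) as well as runnable ones, which is why it can sit near 15 while the CPU is 65% idle. The listing above shows 16 `nfsd` threads in state `D`, which suggests they are blocked on I/O rather than busy; it does not by itself say what they are waiting for. A minimal sketch for tallying process states from pasted `top` output follows. The function name `count_states` and the shortened sample are illustrative assumptions, not part of any NETGEAR tool.

```python
from collections import Counter

# A few rows copied from the top listing above (shortened for the example).
TOP_SAMPLE = """\
43 root 15 0 0 0 0 S 2.2 0.0 1:59.11 kswapd0
13843 root 16 0 2992 1680 1328 R 2.2 0.7 0:21.05 top
1103 root 15 0 0 0 0 D 1.6 0.0 4:53.40 nfsd
1110 root 15 0 0 0 0 D 1.6 0.0 5:09.38 nfsd
914 root 10 -5 0 0 0 S 0.6 0.0 3:53.88 kjournald
"""


def count_states(top_text):
    """Return {state: Counter({command: count})} for the process rows in top output."""
    states = {}
    for line in top_text.splitlines():
        fields = line.split()
        # Process rows have 12 columns:
        # PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
        if len(fields) < 12 or not fields[0].isdigit():
            continue  # skip the summary header and the column titles
        state, command = fields[7], fields[11]
        states.setdefault(state, Counter())[command] += 1
    return states


states = count_states(TOP_SAMPLE)
blocked = states.get("D", Counter())
print("tasks in uninterruptible sleep:", sum(blocked.values()), dict(blocked))
```

Feeding it the full listing instead of the sample gives the per-command breakdown of blocked tasks, which is a quick way to compare snapshots taken before and after a drive swap.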
  • mdgm-ntgr
    NETGEAR Employee Retired
    Could be the disk with 55 reallocated sectors, could be something else.

    You don't have 4k sector partition alignment, but I don't think your 2TB disk is a 4k sector disk anyway.

    A backup and factory reset would be a good idea. Certainly if you intend to add 2TB disks that use 4k sectors in the future.
  • Just an update. Things are not any better: over the week the replacement drive went up to over 250 reallocated sectors! I pulled the new drive and ran SeaTools on it and the drive checked out OK!? I left the drive out of the array and it ran all week OK. That is, until 6 AM on Sunday. Down it went again, right on schedule. So I ordered two new exact matches of the original drives. The NAS would not even recognize the new drive (which I also checked with SeaTools before installing). On Friday I ended up powering the NAS down and booting with the reset button pushed in. It came back up, saw the drive, and successfully synced it. Things were humming along again until today at 6 AM, when it locked up again. Upon restarting I got a message that it had to resync volume C again. It resynced and has been running OK since, but in another week I'm sure it's going to go down yet again. When I did the reset, I had to change the IP address and password on the NAS. The display said it was reinstalling the firmware or something like that, but the drives and shares were preserved. Was that supposed to be the case? Or is there another level of reset that I can try?

    I'm just about out of ideas, but I did come up with the idea of running each of the scripts in cron.weekly by hand and seeing if I can determine which one is hanging things up. I can then at least remove that script until I can determine why it's locking up the device. The scripts I see in cron.weekly are: backup_idmap, get_rn_messages, get_smart, quotacheck, and schedule_update_check.

    Any ideas anyone has are more than welcome.


    Date Message
    Sun Oct 5 15:15:49 EDT 2014 RAID sync finished on volume C. The volume is now fully redundant.
    Sun Oct 5 11:07:27 EDT 2014 System is up.
    Sun Oct 5 11:05:56 EDT 2014 RAID sync started on volume C.
    Sun Oct 5 11:05:33 EDT 2014 Improper shutdown detected. To ensure data integrity, a filesystem check should be performed by rebooting the NAS through Frontview with the volume scan option enabled.
    Fri Oct 3 18:58:33 EDT 2014 RAID sync finished on volume C. The volume is now fully redundant.
    Fri Oct 3 16:27:23 EDT 2014 Successfully changed password. [admin]
    Fri Oct 3 14:48:15 EDT 2014 System is up.
    Fri Oct 3 14:41:48 EDT 2014 Changes to the network is being backgrounded. The browser will automatically refresh in 10 secs.
    Fri Oct 3 14:37:14 EDT 2014 System is up.
    Fri Oct 3 14:35:05 EDT 2014 RAID sync started on volume C.
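
The cron.weekly experiment described above can be sketched as a small runner that executes each script on its own, announces the script name before starting it, and gives up after a time limit. This is only an illustration under stated assumptions: the helper name `run_each` is made up, it assumes a Python 3 interpreter is available on whatever host runs it (which may not be true on the NV+ itself), and a timeout cannot rescue a box whose kernel has locked up or a process stuck in uninterruptible sleep. In those cases the last name printed is the useful clue.

```python
import os
import subprocess


def run_each(directory, timeout_s):
    """Run every executable file in `directory`, one at a time, in name order.

    Returns a list of (name, outcome) pairs, where outcome is the script's
    exit code, or the string "timeout" if it ran longer than timeout_s seconds.
    """
    results = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            continue  # skip subdirectories and non-executable files
        # Announce the script first, so the last line printed names the
        # culprit if the whole machine hangs while it is running.
        print("starting", name, flush=True)
        try:
            proc = subprocess.run(
                [path],
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            results.append((name, proc.returncode))
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising this exception.
            results.append((name, "timeout"))
    return results
```

A call such as `run_each("/etc/cron.weekly", 3600)` would work through the directory with a one-hour limit per script; the path and the limit are assumptions to adjust. Capturing the printed output on another machine (for example over ssh) keeps the record even if the NAS has to be power-cycled.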
  • mdgm-ntgr
    NETGEAR Employee Retired
    250 reallocated sectors is a lot. Did you just run a quick/short test or did you run a long/extended test? We would recommend replacing a disk when the count is much lower than that.

    An OS re-install resets some settings (e.g. admin password, some network settings) but leaves data intact.

    A factory default wipes everything: all data and all settings.
  • I did both the short tests. I started the long test but had to physically move the drive before it completed so I ended up shutting it down. I was using a Thermaltake BlacX eSATA USB Docking Station and my laptop on the kitchen table and my wife didn't think it was more important than dinner. LOL

    Since it was a brand new drive I find it hard to believe the issue is with the drive itself. But I guess we will see now that I have another brand new drive installed. I did check the original drive with SeaTools and it would not even pass the short drive self test, so there appears to be no doubt that that drive did have issues.

    After the cron job experiment I may try a factory default reset. I've started a complete backup now. If I do the factory reset, can I restore the settings, or is it best to just set up the shares again from scratch? Also, has anyone found a way to run a CrashPlan client on the NV+? If not, I'll probably have to mount all the shares on a VM and run the client on that instead.
  • mdgm-ntgr
    NETGEAR Employee Retired
    You could try restoring the settings, but if you can reconfigure it manually that would be better.
