
System Hang Due to Backup Failure

steveoelliott
Luminary

Hi all,

(#22039954 Raised)

Today my ReadyNAS Pro 6 hung after a backup job was started accidentally and the drive was then removed from the system. The system was hung hard and had to be forced off (power button held for 5 seconds) before it could resume operation. In this state, the green light above the backup button was also on.

Upon inspecting the logs, the last entries are from the backup job, and the next entry is from when I got on site to perform the hard restart:

Oct 7 15:58:53 despair RAIDiator: Backup button jobs started.
Oct 7 15:58:54 despair rsyncd[20050]: connect from despair (192.168.1.200)
Oct 7 15:58:54 despair rsyncd[20050]: rsync on PREMIER/. from admin@despair (192.168.1.200)
Oct 7 15:58:55 despair rsyncd[20050]: building file list
Oct 7 15:58:56 despair kernel: usb 1-2: USB disconnect, address 6
Oct 7 15:58:56 despair kernel: scsi 10:0:0:0: [sde] Unhandled error code
Oct 7 15:58:56 despair kernel: scsi 10:0:0:0: [sde] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Oct 7 15:58:56 despair kernel: scsi 10:0:0:0: [sde] CDB: Read(10): 28 00 08 10 91 d7 00 00 f0 00
Oct 7 15:58:56 despair kernel: end_request: I/O error, dev sde, sector 135303639
Oct 7 15:58:56 despair kernel: scsi 10:0:0:0: rejecting I/O to offline device
Oct 7 15:58:56 despair kernel: scsi 10:0:0:0: [sde] Unhandled error code
Oct 7 15:58:56 despair kernel: scsi 10:0:0:0: [sde] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
Oct 7 15:58:56 despair kernel: scsi 10:0:0:0: [sde] CDB: Read(10): 28 00 08 10 92 c7 00 00 10 00
Oct 7 15:58:56 despair kernel: end_request: I/O error, dev sde, sector 135303879
Oct 7 15:58:57 despair udevd-event[20072]: unlink_secure: chown(/dev/sde1, 0, 0) failed: No such file or directory
Oct 7 15:58:57 despair udevd-event[20072]: unlink_secure: chmod(/dev/sde1, 0000) failed: No such file or directory
Oct 7 15:58:57 despair kernel: EXT3-fs error (device sde1): ext3_find_entry: reading directory #1011713 offset 0
Oct 7 15:58:57 despair kernel: EXT3-fs (sde1): I/O error while writing superblock
Oct 7 15:58:57 despair kernel: EXT3-fs error (device sde1): ext3_find_entry: reading directory #1011713 offset 0
Oct 7 15:58:57 despair kernel: EXT3-fs (sde1): I/O error while writing superblock
Oct 7 15:58:57 despair kernel: EXT3-fs error (device sde1): read_inode_bitmap: Cannot read inode bitmap - block_group = 494, inode_bitmap = 16187393
Oct 7 15:58:57 despair kernel: EXT3-fs (sde1): I/O error while writing superblock
Oct 7 15:58:57 despair kernel: EXT3-fs (sde1): error in ext3_new_inode: IO failure
Oct 7 15:58:57 despair rsyncd[20050]: rsync: connection unexpectedly closed (61433 bytes received so far) [sender]
Oct 7 15:58:57 despair rsyncd[20050]: rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]
Oct 7 15:58:57 despair rsyncd[20050]: rsync: writefd_unbuffered failed to write 91 bytes to socket [sender]: Broken pipe (32)
Oct 7 15:58:57 despair kernel: EXT3-fs (sde1): I/O error while writing superblock
Oct 7 15:58:58 despair kernel: journal_bmap: journal block not found at offset 23284 on sde1
Oct 7 15:58:58 despair kernel: Aborting journal on device sde1.
Oct 7 15:58:58 despair kernel: JBD: I/O error detected when updating journal superblock for sde1.
Oct 7 15:58:58 despair kernel: journal commit I/O error
Oct 7 15:58:58 despair rsyncd[20100]: connect from despair (192.168.1.200)
Oct 7 15:58:58 despair rsyncd[20100]: rsync on PREMIER/. from admin@despair (192.168.1.200)
Oct 7 15:58:58 despair kernel: EXT3-fs (sde1): error: ext3_put_super: Couldn't clean up the journal
Oct 7 15:58:58 despair rsyncd[20100]: building file list
Oct 7 15:58:58 despair kernel: EXT3-fs (sde1): error: remounting filesystem read-only
Oct 7 15:58:58 despair kernel: scsi: killing requests for dead queue
Oct 7 15:58:58 despair rsyncd[20100]: rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Connection reset by peer (104)
Oct 7 15:58:58 despair rsyncd[20100]: rsync error: error in rsync protocol data stream (code 12) at io.c(1532) [sender=3.0.9]
Oct 7 15:58:58 despair rsyncd[20106]: connect from despair (192.168.1.200)
Oct 7 15:58:58 despair rsyncd[20106]: rsync on PREMIER/. from admin@despair (192.168.1.200)
Oct 7 15:58:58 despair rsyncd[20106]: building file list
Oct 7 15:58:58 despair rsyncd[20106]: rsync: writefd_unbuffered failed to write 959 bytes to socket [sender]: Connection reset by peer (104)
Oct 7 15:58:58 despair rsyncd[20106]: rsync error: error in rsync protocol data stream (code 12) at io.c(1532) [sender=3.0.9]
Oct 7 15:58:58 despair RAIDiator: Error encountered copying data from remote source path 192.168.1.200::PREMIER ==> /USB_HDD_2/PREMIER due to empty source d
Oct 7 18:43:38 despair syslogd 1.4.1#18: restart.
Oct 7 18:43:38 despair kernel: klogd 1.4.1#18, log source = /proc/kmsg started.
Oct 7 18:43:38 despair kernel: Initializing cgroup subsys cpu
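
For anyone working through a log like the one above, the hang shows up as a long silence between the last backup entry and the syslogd restart. A minimal Python sketch for finding that gap follows. It assumes the classic syslog stamp format seen here (month, day, time, no year); the function names and the fixed placeholder year are illustrative only, not part of any ReadyNAS tooling.

```python
from datetime import datetime

# Assumption: syslog lines carry no year, so a fixed placeholder year is used.
# This is fine for gaps within one year, which is all this sketch handles.
ASSUMED_YEAR = 2000

def parse_stamp(line):
    """Return the timestamp of a syslog line such as 'Oct 7 15:58:53 host ...'."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(
        f"{ASSUMED_YEAR} {month} {day} {clock}", "%Y %b %d %H:%M:%S"
    )

def largest_gap(lines):
    """Return (gap, line_before, line_after) for the longest silence in the log."""
    entries = [(parse_stamp(line), line) for line in lines if line.strip()]
    best = None
    for (t0, before), (t1, after) in zip(entries, entries[1:]):
        gap = t1 - t0
        if best is None or gap > best[0]:
            best = (gap, before, after)
    return best

# Three lines copied from the log in this post.
SAMPLE = [
    "Oct 7 15:58:53 despair RAIDiator: Backup button jobs started.",
    "Oct 7 15:58:58 despair kernel: scsi: killing requests for dead queue",
    "Oct 7 18:43:38 despair syslogd 1.4.1#18: restart.",
]

gap, before, after = largest_gap(SAMPLE)
print(f"Longest silence: {gap}")  # Longest silence: 2:44:40
```

Here the silence runs from 15:58:58 to 18:43:38, which matches the period the unit was hung before the hard restart.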


So here are 2 questions...

1) How do you remove the drive from the Pro 6 by pressing and holding the backup button? I have held it for 5 seconds; usually it then reports the drive as disconnected, but other times it starts the backup again. I am not the one performing this operation, and on this (and other) occasions the person doing it has removed the drive mid-backup.

2) Could taking the drive out mid-operation reasonably cause the ReadyNAS to hang like this? The job it was running was an rsync job with localhost as the source and the USB drive (USB port 1) as the destination.

Look forward to your feedback...

Thanks.
Message 1 of 13
vandermerwe
Master

Re: System Hang Due to Backup Failure

My Ultra 6 Plus behaves exactly like this if the source NAS for an rsync backup becomes unavailable because it powers off.
It would be nice if the unit didn't hang, but I think this is normal behaviour in this particular scenario.
Message 2 of 13
steveoelliott
Luminary

Re: System Hang Due to Backup Failure

Thanks... That's interesting. I'll raise a case anyhow. Obviously the solution is to not have the device disconnected in the first place, but still, it shouldn't happen.

It is a bug and should be addressed!

How do you actually disconnect one of these using the backup button? The old NV+ used to flash the disk lights when it was safe to remove the disk; what does this unit do? I personally have only disconnected from Frontview.
Message 3 of 13
mdgm-ntgr
NETGEAR Employee Retired

Re: System Hang Due to Backup Failure

Probably best to disconnect the USB disk using Frontview.
Message 4 of 13
steveoelliott
Luminary

Re: System Hang Due to Backup Failure

Moving forward, that is the intention, since I cannot find documented steps for removing the disk using the backup button on the Pro 6.

Even so, this behavior is clearly a bug, and another user has the same problem... We have both raised cases for this.

Thanks for your response...
Message 5 of 13
vandermerwe
Master

Re: System Hang Due to Backup Failure

Netgear don't seem to accept that if a backup job encounters a persistent problem accessing a source or destination, then it should recognise this and fail. The backup job should then be terminated.
It appears that, in the case of rsync at least, if a remote source becomes unavailable during a backup job, the backup job hangs and the ReadyNAS then needs a forced shutdown.

In all other cases of backup jobs where a source or destination is unavailable or becomes unavailable, my experience is that the backup job fails.

To me the former problem is a bug.
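
The behaviour being asked for (a job that notices it is stuck and fails rather than hanging) can be sketched outside the NAS as a wrapper with a hard deadline. This is only an illustration in Python and not how RAIDiator or Frontview run backup jobs; `run_with_deadline`, `RSYNC_JOB`, and `SLOW_DEMO` are hypothetical names. rsync's own `--timeout` option sets an I/O timeout in seconds, and the outer deadline covers the case where the process stalls anyway. A process blocked inside the kernel on a dead device may not be killable at all, so this would not necessarily have prevented the hang discussed here.

```python
import subprocess
import sys

def run_with_deadline(cmd, deadline_s):
    """Run cmd and return (ok, reason).

    A job that overruns deadline_s is killed and reported as failed
    instead of being left to hang.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=deadline_s
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {deadline_s}s"
    except FileNotFoundError:
        return False, f"command not found: {cmd[0]}"
    if result.returncode != 0:
        return False, f"exit code {result.returncode}"
    return True, "ok"

# Hypothetical job definition; source and destination mirror the log in this thread.
RSYNC_JOB = [
    "rsync", "-a", "--timeout=60",
    "192.168.1.200::PREMIER", "/USB_HDD_2/PREMIER",
]

# Stand-in for a stalled job, so the wrapper can be tried without rsync.
SLOW_DEMO = [sys.executable, "-c", "import time; time.sleep(5)"]

if __name__ == "__main__":
    print(run_with_deadline(SLOW_DEMO, 0.5))
    # A real run would look like: run_with_deadline(RSYNC_JOB, 3600)
```

The design choice is simply to treat "no result within the deadline" as a failure the scheduler can log and move past.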

Netgear have implied in their reply to my case that it will not be reported as a bug, as its "root cause" is misuse of the device.
Unbelievably arrogant.
Message 6 of 13
steveoelliott
Luminary

Re: System Hang Due to Backup Failure

I had exactly the same kind of reply but I pushed back to my agent and insisted it be escalated. I suggest you also do the same and reference my case. This is not acceptable!
Message 7 of 13
vandermerwe
Master

Re: System Hang Due to Backup Failure

I did reply, asking again for a bug report to be filed.
The agent referred to your case and your username/name. Pretty sure they are not supposed to disclose that sort of information.
Message 8 of 13
steveoelliott
Luminary

Re: System Hang Due to Backup Failure

Yeah, it's not normally the done thing... Anyway, mine is now being escalated to L3. Did the same happen to yours?
Message 9 of 13
nhills
Aspirant

Re: System Hang Due to Backup Failure

Did you ever get a resolution to this problem? I am having the same issue and wanted to find out whether Netgear provided a solution or whether I have to find a way around it.
Message 10 of 13
steveoelliott
Luminary

Re: System Hang Due to Backup Failure

They never actually found the root cause of this. They could never reproduce the problem in the lab.
It would be worth your while raising a case with Netgear and referring to this thread and past cases.
(#22039954) and (22041021)

The end result for me was: don't use the button to remove the USB drive. Hardly a solution, but I use the web interface for removal now.
Message 11 of 13
nhills
Aspirant

Re: System Hang Due to Backup Failure

Thanks for your reply. I will follow up with them on this and see where I get.
Message 12 of 13
steveoelliott
Luminary

Re: System Hang Due to Backup Failure

Please keep us posted...
Message 13 of 13