Readynas 104 drops out when copying to it

Aspirant

Dec 04, 2013

I might as well share my "story" here too. I have the same issues described here with two 104s and a 314. At first I thought I was just pushing too much to the 104s, which is why I purchased the 314. That doesn't appear to be the case, however, since I have the same issue with all 3 devices.

My biggest issues come with copying large files (several GB) over NFS or AFP (everything that I use is either Mac or UNIX, so I do not use SMB at all). I've also caused the 104s and the 314 to hang when performing rsync backups via SSH (although this doesn't occur as frequently and isn't nearly as reproducible).

My setup is as follows:

- 3 ReadyNAS devices: two 104s and one 314
- All ReadyNAS devices have 4 Seagate ST4000DM000 4TB drives
- I use a single X-RAID volume on each device (RAID 5, ~10.5TB formatted space)
- Used disk space is between ~1.7TB and 3.1TB on each ReadyNAS (due to the disconnections, I can't seem to get any data onto them)
- I only have NFS, HTTP, HTTPS, & SSH running on one 104 and the 314. The other 104 has AFP running in addition to the other protocols.
- Each device has a single NFS share (the 104 that has AFP uses both NFS and AFP for this share)
- I use adaptive load balancing NIC bonding on all 3 devices (although, I can and have reproduced the disconnects without NIC bonding enabled)
- All 3 devices connect to a Cisco 3750 gigabit switch (I can also reproduce the issue using a direct connection to a server via a crossover cable)
- My client devices also connect to the Cisco 3750 (48 port) switch. Wireless clients connect via an Apple AP that is connected to the 3750 as well.
- Clients range from Apple laptops (only wireless clients), to Mac Mini Servers, to dedicated UNIX (Linux and Solaris) servers, to VMware ESXi servers (UNIX clients).
- I can reproduce the disconnections from wired and wireless clients although, it's much more prevalent with wired clients.
- ReadyNAS devices are used primarily for backups. I run my backups in a serial fashion (i.e., no 2 devices ever perform a backup task at one time).
- If more than one device runs a backup task at the same time, the ReadyNAS will lock up about 95% of the time.
- If only 1 device is executing a backup at a time, I only see lockups about 80% of the time.
- Backups run either via rsync over SSH (remote jobs) or direct file copies over NFS.
- The rsync jobs transfer only small files over the internet and are far more reliable than the NFS jobs. The NFS jobs transfer mainly large files (up to 800GB/file).

I have been fighting these disconnections since May of this year and have tried a great many configurations. I even rewrote my backup tasks to ensure that only one backup was running at a time. This improved things, but not to an acceptable (i.e., usable) degree. I did set "vm.swappiness = 60" in /etc/sysctl.conf (and rebooted), which did help quite a bit (instead of the device locking up 100% of the time, that number is now closer to 80%).
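One way to enforce a one-backup-at-a-time rule is sketched below. This is only an illustration, not the actual rewritten backup tasks, which are not shown in this post. It assumes every backup job is launched from the same UNIX host and takes an exclusive lock on a shared lock file before it starts. The lock path and the `run_exclusively` helper are hypothetical names.

```python
import fcntl
import os
import subprocess
import tempfile

# Hypothetical lock path; any file that every backup job on this host can open works.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "nas-backup.lock")


def run_exclusively(command, lock_path=LOCK_PATH):
    """Run `command` (a list of arguments) while holding an exclusive lock.

    If another job already holds the lock, this call waits until it is released,
    so jobs started at overlapping times end up running one after another.
    """
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            return subprocess.call(command)  # exit status of the backup command
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A job would then be started as, for example, `run_exclusively(["rsync", "-a", "/data/", "/mnt/nas/data/"])`, where both paths are placeholders. A lock like this only coordinates jobs on one machine. Clients on different hosts would need some other shared arbiter.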

I was able to capture the output of "top" when my 314 died last night. Here is what it looked like:

top - 10:48:04 up 1 day,  8:33,  1 user,  load average: 6.95, 5.90, 3.81
Tasks: 149 total,   4 running, 145 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us, 99.4 sy,  0.0 ni,  0.2 id,  0.2 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem:   2039220 total,  2009300 used,    29920 free,      216 buffers
KiB Swap:  1047932 total,    57252 used,   990680 free,  1742152 cached
Write failed: Broken pipe
curt@tatiana:~$
  491 root      20   0     0    0    0 R 111.9  0.0  38:37.80 kswapd0                                                              
    2 root      20   0     0    0    0 R  98.1  0.0  35:41.68 kthreadd                                                             
26706 root      19  -1  4176  484  484 R  98.1  0.0  10:13.47 sh                                                                   
 2030 root      19  -1  488m 2788 2788 S  94.7  0.1  34:28.26 readynasd
18695 curt      20   0 28484  716  632 R  57.8  0.0  13:27.02 top                                                                  
 1007 root      20   0     0    0    0 S   0.2  0.0   0:02.51 md1_raid6
 1969 root      20   0     0    0    0 S   0.2  0.0   2:44.90 flush-btrfs-2
 2155 root      20   0     0    0    0 S   0.2  0.0  35:57.79 nfsd
    1 root      20   0 45300 1168 1168 S   0.0  0.1   0:17.19 systemd
    3 root      20   0     0    0    0 S   0.0  0.0   0:01.44 ksoftirqd/0
    6 root      rt   0     0    0    0 S   0.0  0.0   0:20.29 migration/0
    7 root      rt   0     0    0    0 S   0.0  0.0   0:01.16 watchdog/0
    8 root      rt   0     0    0    0 S   0.0  0.0   0:12.88 migration/1
   10 root      20   0     0    0    0 S   0.0  0.0   0:00.77 ksoftirqd/1
   12 root      rt   0     0    0    0 S   0.0  0.0   0:00.88 watchdog/1
   13 root      rt   0     0    0    0 S   0.0  0.0   0:20.89 migration/2
   14 root      20   0     0    0    0 S   0.0  0.0  13:31.54 kworker/2:0
   15 root      20   0     0    0    0 S   0.0  0.0   0:01.21 ksoftirqd/2
 2155 root      20   0     0    0    0 D   8.0  0.0  11:27.07 nfsd
 2154 root      20   0     0    0    0 S   7.6  0.0  11:51.29 nfsd
 2152 root      20   0     0    0    0 D   6.6  0.0  11:32.86 nfsd
 2153 root      20   0     0    0    0 S   6.6  0.0  11:29.58 nfsd
 2159 root      20   0     0    0    0 D   6.6  0.0  11:32.95 nfsd
 2158 root      20   0     0    0    0 S   5.3  0.0  11:30.62 nfsd
 2156 root      20   0     0    0    0 S   4.0  0.0  11:21.16 nfsd
 2157 root      20   0     0    0    0 S   3.3  0.0  11:39.94 nfsd

In this case the load average was relatively low; I've seen it as high as 30 in the past. As you'd expect, the load average is generally higher on the 104s than on the 314.
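For reading captures like the one above, a small sketch that splits the `%Cpu(s)` summary line into its fields can help. The `parse_cpu_line` helper is a hypothetical name, and the input is the line copied from the capture. In top's summary, "sy" is time spent in the kernel, "us" is user time, and "wa" is time waiting on I/O.

```python
import re


def parse_cpu_line(line):
    """Turn a top '%Cpu(s):' summary line into a {label: percent} dict."""
    # Each field looks like "99.4 sy": a number followed by a two-letter label.
    return {label: float(value)
            for value, label in re.findall(r"([\d.]+)\s+([a-z]{2})", line)}


cpu = parse_cpu_line(
    "%Cpu(s):  0.0 us, 99.4 sy,  0.0 ni,  0.2 id,  0.2 wa,  0.0 hi,  0.1 si,  0.0 st"
)
print(f"kernel: {cpu['sy']}%  user: {cpu['us']}%  iowait: {cpu['wa']}%")
```

In this capture nearly all CPU time is kernel time rather than user or I/O wait time, which lines up with kswapd0 sitting at the top of the process list.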

I have dug through everything in /var/log and have yet to find anything useful. By the same token, I have not attempted to add any additional logging.
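If someone does want extra logging, one low-effort option is to sample load and memory figures to a file at a fixed interval, so the last readings before a lockup survive even when the SSH session drops. The sketch below assumes a Python 3 interpreter is available on the device (I have not confirmed that) and that /proc/loadavg and /proc/meminfo exist, as on any Linux system. The function names and the log path in the usage note are hypothetical.

```python
import os
import time


def format_sample(loadavg_text, meminfo_text, stamp):
    """Build one log line from the raw contents of /proc/loadavg and /proc/meminfo."""
    load = " ".join(loadavg_text.split()[:3])  # 1, 5 and 15 minute load averages
    mem = {}
    for row in meminfo_text.splitlines():
        key, _, rest = row.partition(":")
        if rest.strip():
            mem[key.strip()] = rest.split()[0]  # value in kB, unit dropped
    return (f"{stamp} load={load} "
            f"MemFree={mem.get('MemFree', '?')} Cached={mem.get('Cached', '?')}")


def log_samples(path, interval_s=10, count=None):
    """Append samples to `path`; runs forever unless `count` is given."""
    written = 0
    with open(path, "a") as out:
        while count is None or written < count:
            with open("/proc/loadavg") as f:
                loadavg_text = f.read()
            with open("/proc/meminfo") as f:
                meminfo_text = f.read()
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            out.write(format_sample(loadavg_text, meminfo_text, stamp) + "\n")
            out.flush()
            os.fsync(out.fileno())  # push each sample to disk before sleeping
            written += 1
            if count is None or written < count:
                time.sleep(interval_s)
```

Calling `log_samples("/root/lockup-samples.log")` in a detached session before starting a backup would leave a trail to read after the reboot. If the volume being written to is itself what hangs, the final samples may still be lost, so a path on the OS partition is probably the safer choice.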

I'm really hoping that 6.1.5 takes care of these disconnects. I purchased the ReadyNAS devices as replacements for Solaris/FreeBSD file servers with the intent of lowering my administrative overhead. Up to this point, these devices have been an even bigger headache than maintaining dedicated file servers. :(
