Forum Discussion
thepartnersgrou (Aspirant)
Sep 02, 2011
ReadyNAS 2100 timeouts case #16618795
Hi there,
We've got a ReadyNAS 2100 that we are using as an AD-integrated CIFS/SMB file server. Recently (in the last one to two weeks) we've had users report that their mapped drives "aren't available" and have witnessed this ourselves as well. It looks like the server is dropping in and out of availability randomly, as sometimes an immediate refresh will find the device, and sometimes it takes up to 5-10 minutes. This affects clients whether they are accessing the NAS via UNC or a mapped drive letter. Also, the same clients are experiencing slow browsing of our other file servers (Windows Server 2003 Std., domain controller/WINS/DNS etc.). Clients consist of a mix of Windows XP SP3 machines and Windows 7 64-bit machines, with a wide variety of network interface card models (mostly Broadcom or Intel integrated NICs).
We have spent a day examining everything we can think of, from event logs on client systems and our other file servers, to looking at the logs on the NAS itself (although we're not experts on quickly finding root causes in samba logs etc.) and haven't found any glaring problems. We've updated NIC drivers on a couple of different machines, with no improvement. We found one NetBIOS master browser election event in our PDC's event log so we downloaded and installed the add-in to disable master browser preference. We've rebooted the NAS multiple times throughout the day. While running RichCopy from a network location to the NAS, we get multiple errors every 5-10 minutes saying both paths cannot be found - it almost seems as if it's killing the network interface on the client machines that it's communicating with.
Our users that are not regularly accessing the NAS are having zero problems. Further, on these systems, if we connect to the NAS via UNC or mapped drive letter and launch an application install, for instance, it will error out saying the drive or device cannot be found. However, if we copy the installation folder to the NAS, the install works perfectly the first time, as long as the copy is able to complete (usually does).
Recent changes include rolling out a new version of Symantec Endpoint Protection a couple of weeks ago companywide, an Exchange 2010 migration (machines having this problem are both 2003 and 2010 clients as we're mid-migration right now), and a Symantec Backup Exec 2010 64-bit install. We also had a DC that we had set up on a remote network that was replicating via a slow site-to-site VPN, but we ran into bandwidth issues and connection reliability, so we ran DCPROMO and demoted it to just a member server so that we didn't have replication errors.
We would love to hear any suggestions because users having intermittent availability to their files causes frustration and we have way too much on our plates right now to deal with chasing a problem that we don't understand the cause of! I'll be working on and off over the long weekend so feel free to toss out ideas, as come Monday afternoon we may have to find a new home for these files if we can't get this resolved and repurpose our ReadyNAS as a utility storage device if it can't be reliable. =\
Thanks for any help you can give!
Aaron & Tim
The Partners Group, LTD.
6 Replies
- thepartnersgrou (Aspirant)
Anyone have any ideas? In trying to migrate data off the NAS, using RichCopy, it took about 20 attempts to copy off a 4GB file (each attempt errored out between 5% and 85% before finally making it in one pass). During this copy process we saw an average transfer rate of 10-12MB/sec to a Windows Server 2003 file server with a 4-disk 10K SAS RAID5 array with all journaling turned on, for whatever that's worth.
Also, I noticed that even with a 500GB iSCSI volume set up on the ReadyNAS 2100, formatted to NTFS, I am seeing errors in the event log (event id 55) saying the file structure is corrupt. Running a chkdsk on the logical drive always fixes errors in the MFT, deletes unused indexes, and creates indexes as necessary. I am willing to bet this is happening due to the drops in availability, but it certainly doesn't make me feel good about that data that's being stored on the NAS...
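For anyone in a similar spot who needs to pull files off a share that keeps dropping, the repeated manual attempts described above can be approximated with a small script. This is only a minimal sketch, not what RichCopy does: `copy_with_retry` and `sha256_of` are hypothetical helper names, the attempt count and delay are arbitrary assumptions, and each retry restarts the copy from the beginning rather than resuming. It compares SHA-256 hashes of source and destination after each copy, which means the source is read over the network a second time.

```python
import hashlib
import shutil
import time
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_with_retry(src: Path, dst: Path, attempts: int = 20, delay: float = 30.0) -> int:
    """Copy src to dst, retrying on I/O errors or hash mismatch.

    Returns the number of the attempt that succeeded.
    Raises RuntimeError if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            shutil.copyfile(src, dst)  # restarts from byte 0 on every attempt
            if sha256_of(src) == sha256_of(dst):
                return attempt
            print(f"attempt {attempt}: hash mismatch, retrying")
        except OSError as err:
            # A share that drops mid-copy surfaces as an OSError
            print(f"attempt {attempt}: {err}")
        if attempt < attempts:
            time.sleep(delay)  # give the share time to come back
    raise RuntimeError(f"could not copy {src} after {attempts} attempts")
```

The hash check is there because of the corruption concern: a copy that completes without an error is not necessarily a good copy if the link dropped partway through.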
Thanks,
Aaron
- dbott67 (Guide)
Hi Aaron,
I've seen intermittent drive problems that can cause issues like you're seeing. You may want to test the drives in the NAS using vendor tools to eliminate them as a possible culprit: http://home.bott.ca/webserver/?p=388
Also, what version of firmware is on your 2100?
- thepartnersgrou (Aspirant)
Thanks for the response. Firmware is RAIDiator 4.2.17. Talked to tech support and they seem to think it's some sort of issue with our HP switch and the LACP configuration. Going to give HP a call and check a few settings with them. I'll let you know what we come up with.
The disks LOOK fine, but may warrant further checking. I assume the basic steps are to pull a drive, toss in a spare box and run diagnostics, put back in NAS and let it rebuild the array. Repeat for other drives after letting each rebuild finish...?
- dbott67 (Guide)
thepartnersgroup wrote: I assume the basic steps are to pull a drive, toss in a spare box and run diagnostics, put back in NAS and let it rebuild the array. Repeat for other drives after letting each rebuild finish...?
I would advise powering down the unit and running the tests while the NAS is off-line, if possible. Hot-pulling a disk may not be wise in this particular case. Supposing it is a flaky disk, if you were to pull a good disk it may lead to a dual disk failure in your array if the flaky disk starts acting up. Additionally, the added stress of the data resyncs may cause the flaky disk to drop out of the array and again lead to a dual disk failure condition resulting in data loss.
thepartnersgroup wrote: Talked to tech support and they seem to think it's some sort of issue with our HP switch and the LACP configuration. Going to give HP a call and check a few settings with them. I'll let you know what we come up with.
This might be the best/easiest thing to check. If you have the capability, try disabling LACP, jumbo frames and any other network settings to try to establish a baseline. In fact, if you have another switch or even the capability to try a direct connection it may help isolate/eliminate the variables.
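To compare one network configuration against another, it helps to have a timestamped record of when the NAS actually stops answering. The following is a minimal sketch of such a probe, assuming the NAS accepts SMB connections on TCP port 445; the function names `probe` and `monitor`, the interval, and the sample count are all hypothetical choices for illustration.

```python
import socket
import time
from datetime import datetime


def probe(host: str, port: int = 445, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False


def monitor(host: str, port: int = 445, interval: float = 5.0, samples: int = 720) -> list:
    """Probe repeatedly and record only state changes (up to down, down to up)."""
    changes = []
    last = None
    for _ in range(samples):
        up = probe(host, port)
        if up != last:
            stamp = datetime.now().isoformat(timespec="seconds")
            changes.append((stamp, up))
            print(f"{stamp} {host}:{port} {'reachable' if up else 'UNREACHABLE'}")
            last = up
        time.sleep(interval)
    return changes
```

Calling `monitor("192.168.1.50")` (a made-up address) from a client for an hour with LACP enabled, and again with a single link or a direct connection, would give two logs to compare. If the dropouts disappear in the second run, the switch or trunk configuration becomes the more likely suspect than the disks.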
Direct connection: http://sphardy.com/web/readynas/how-to- ... -readynas/
- mdgm-ntgr (NETGEAR Employee Retired)
You could of course run the "Disk Test" boot option using the boot menu: http://www.readynas.com/kb/faq/boot/how_do_i_use_the_boot_menu
Whilst not as good as using the drive manufacturer's tools it's the next best thing.
- thepartnersgrou (Aspirant)
For everybody's reference, we ended up having switch problems. I believe the correct setting on the switch for LACP to work properly is to set the switch to be a PASSIVE LACP member. Active will probably work as well but the NAS is also acting as active, so if the switch is set to passive the NAS will always be the active member.
In our case, our HP ProCurve switch lost its mind, affecting non-ReadyNAS traffic as well, and we had to apply a more recent firmware image in order to bring things back to normal. Our inter-switch traffic was set up as a static LACP group (to another ProCurve) but even setting the trunk type to non-LACP or static trunk mode didn't resolve the issues.
Thanks for the responses.
Aaron