crash
7 Topics
RN3312 BTRFS operations are completely hung
We have owned the RN3312 for a bit over 6 months, and all was seemingly fine. However, things went downhill recently, and now pretty much the entire BTRFS partition is completely unusable. Even leaving the NAS offline so it can do whatever internal metadata cleanup it needs is not enough for it to recover in a reasonable time.

What has happened is a combination of Bit Rot Protection / COW + Compression + Snapshots being turned on, on a partition used for file backups and image backups (Veeam) for a single, large fileserver. BTRFS is NOT production ready for such a setup. I firmly believe this option should be removed from the UI, or a huge warning displayed.

Everything was going great until the first snapshots needed to be deleted, where I ran into the problem of btrfs-cleaner taking up 100% CPU. Symptoms: the admin UI would lock up on any file operation in certain directories. Directory accesses would hang forever, even over SMB. Of course all the backups to the NAS were timing out. I eventually was able to delete the snapshots by hard rebooting the system and removing them before btrfs-cleaner got too bad.

But now, I have the problem where btrfs-transacti is taking up 100% CPU. I have left the system sitting offline for a week just spinning at 100% CPU (!), and there is no visible improvement - EVERY BTRFS operation still hangs, no matter what I try. There is little disk activity, it is not thrashing - which makes me think there is something wrong in the internals of BTRFS, or that the CPU is too underpowered to handle the amount of storage metadata operations.

top - 12:30:29 up 1:39, 2 users, load average: 115.21, 112.48, 99.52
Tasks: 334 total, 2 running, 332 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.6 us, 23.0 sy, 0.0 ni, 72.1 id, 1.6 wa, 0.0 hi, 2.7 si, 0.0 st
KiB Mem: 8113792 total, 2673896 used, 5439896 free, 4404 buffers
KiB Swap: 2093052 total, 0 used, 2093052 free. 1980036 cached Mem

 PID USER PR NI   VIRT  RES  SHR S  %CPU %MEM    TIME+ COMMAND
3740 root 20  0      0    0    0 R 100.0  0.0 93:52.62 btrfs-tran+
   1 root 20  0 136632 6868 5144 S   0.0  0.1  0:02.45 systemd
   2 root 20  0      0    0    0 S   0.0  0.0  0:00.00 kthreadd

admin@archive:/data$ iostat
Linux 4.4.68.x86_64.1 (archive) 07/11/2017 _x86_64_ (4 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
          0.55  0.00   25.73    1.56   0.00 72.15

Device:  tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda     3.26     55.82     57.19  337578  345828
sdb     3.25     55.07     57.08  333044  345196
sdc     3.45     74.32     57.02  449448  344808
sdd     3.28     53.95     57.21  326224  345936
sde     3.26     57.73     57.09  349104  345252
sdf     3.28     54.89     56.87  331951  343908
md0     1.68     27.57     39.28  166740  237520
md1     0.02      0.19      0.00    1172       0
md127   9.38    243.50     69.42 1472516  419788 < not a lot of activity...

I have tried starting a balance to fix fragmentation. I believe there are operations blocking it inside the kernel, but even at -dusage=0 I gave up after giving it the weekend to do its thing. Trying to look for evidence of fragmented files is horrendously slow. But it is very bad now:

admin@archive:/data$ ls
*** hangs forever ***

My hope at this point is to try and mount the system read-only and recover data onto a USB drive. The share with data is around 8 TB, which might just fit after a couple of days/weeks? of copying... Then figuring out some way to drop the share? and rebuild it without selecting the 'Bit Rot Protection' or 'Compression' options. Hopefully I don't have to resort to copying the NAS to something else and wiping it - there is about 14 TB of data on it currently, and I don't have that much capacity available anywhere else...

After going through this and after lots of research, I see lots of horror stories showing that BTRFS is extremely fragile and not ready for prime time. I believe it is reckless for Netgear to base a NAS on such an unproven FS. The features are not worth it if they explode in spectacular fashion after a couple of months.
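The "not a lot of activity" remark about the iostat figures can be checked by summing the per-disk rates. The following is a minimal Python sketch, not part of the original post: the function name `total_throughput` and the `SAMPLE` constant are made up for illustration, and the rows are copied from the iostat output quoted above. It assumes the default iostat column order (Device, tps, kB_read/s, kB_wrtn/s, kB_read, kB_wrtn) and counts only the physical sd* disks, skipping the md arrays that sit on top of them.

```python
# Rows copied from the iostat output quoted in the post (illustration only).
SAMPLE = """
sda 3.26 55.82 57.19 337578 345828
sdb 3.25 55.07 57.08 333044 345196
sdc 3.45 74.32 57.02 449448 344808
sdd 3.28 53.95 57.21 326224 345936
sde 3.26 57.73 57.09 349104 345252
sdf 3.28 54.89 56.87 331951 343908
md0 1.68 27.57 39.28 166740 237520
md1 0.02 0.19 0.00 1172 0
md127 9.38 243.50 69.42 1472516 419788
"""


def total_throughput(iostat_rows):
    """Return (kB_read/s, kB_wrtn/s) summed over the physical sd* disks."""
    read = write = 0.0
    for row in iostat_rows.strip().splitlines():
        fields = row.split()
        # Skip headers, blank lines and md* arrays (assumed to be layered
        # on the sd* disks, so counting them would double-count the I/O).
        if len(fields) < 6 or not fields[0].startswith("sd"):
            continue
        read += float(fields[2])   # kB_read/s column
        write += float(fields[3])  # kB_wrtn/s column
    return read, write


read_kbs, write_kbs = total_throughput(SAMPLE)
print(f"read {read_kbs:.2f} kB/s, write {write_kbs:.2f} kB/s")
```

For the quoted rows this comes to roughly 350 kB/s read and 340 kB/s written across all six disks. Note that iostat run without an interval argument reports averages since boot, so an interval run (for example `iostat 5`) would be needed to see the current rate.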
Symptoms include btrfs-transacti and btrfs-endio-wri taking up a lot of CPU time (in spikes, possibly triggered by syncs). You can use filefrag to locate heavily fragmented files (may not work correctly with compression). ...
"a balance on 2TB of data that was heavily snapshotted - it took 3 months"
"when I have to do balances ... I delete all the snapshots and allow a few months for the balance to finish"
https://btrfs.wiki.kernel.org/index.php/Gotchas

We are running version 6.7.4. We currently have 6 x 8 TB in X-RAID (certified drives). I struggle to think what would happen if we filled up all 12 slots... Are there any other operations anyone from support wants to try before I start wiping it? Unfortunately our 90-day free support expired before any of this happened, so I am left venting in public...
4.8K Views, 0 likes, 4 Comments
ReadyNAS 2120 crash every little
Hello, we have about a dozen ReadyNAS units; half are models 2120 and 2120v2. These models regularly give us problems that make them unusable: balance, defrag ... operations that last up to 22 days, with a system load higher than 10, so the NAS are not operational. We need to know what configuration is necessary to avoid these problems (job scheduling, firmware version...). These NAS worked correctly until they reached a particular firmware version: what is the reason they are no longer functional? Greetings.
5.4K Views, 0 likes, 16 Comments
ReadyNAS 3130 crashing and directing to SuperMicro login
My ReadyNAS 3130 is set up as a web server with PHP, MySQL, and MyPhpAdmin add-ins installed. SMB, NFS, FTP, HTTP, HTTPS, and SSH services are enabled. Firmware version is 6.6.0. It will intermittently stop responding. SSH/FTP will not connect, and HTTP will throw a 404. But when I try to go to the ReadyNAS Admin panel, it will show a 404, and after a few refreshes it takes me to a login screen for SuperMicro. I can log into that using ADMIN ADMIN and view the sensor data for the unit (fan speeds, temperatures, etc). This happens sporadically but often. Sometimes it will be down for 5 minutes, sometimes as much as 30 minutes. Sometimes it will stay up for several hours, sometimes it will only stay up for a couple of minutes. If the admin page is open at the time, it goes to a progress bar that says "Connecting to ReadyNAS admin page". I have already done a full factory reset, after which I reinstalled the add-ons (listed above), reloaded the files through FTP and repopulated the SQL database using a backup in the form of an SQL script. It worked fine for several hours and then had the same problems. Sensors are not showing any high temps, and logs shown through the admin interface do not record anything happening at all. Any help on this would be greatly appreciated!
Solved. 3K Views, 0 likes, 5 Comments
AirPrint and ReadyNASOS6
Hey, I've been trying to get my ReadyNAS to operate as a 'print server' for my local network, including AirPrint functionality. I have a Brother wifi printer (so it's not connected via USB) and have successfully installed the required CUPS and foomatic filter packages, and have the printer shared with my network without issue. The next issue was AirPrint. I've read all the posts in this forum where people have successfully got it working with pre-iOS6 clients, but have had no success with the post-6 devices. I believe I have figured out why.... The issue is Avahi, and the advertising of the printer as an AirPrint printer. Previous to iOS6, the Avahi .service file did not make use of the <subtype> directive; but since iOS6, a specific <subtype> setting is required in order for iOS to 'see' the printer. Unfortunately, it is the use of the <subtype> directive in the .service file which is causing Avahi to crash on the ReadyNAS - and thus preventing the NAS from advertising the AirPrint printers to the local network. The Avahi .service file for my Brother wifi printer was created with this script: https://raw.github.com/tjfontaine/airprint-generate/master/airprint-generate.py The script spits out a valid .service file which contains everything which SHOULD be required in order to advertise the printer via mDNS/Bonjour for the iOS devices to pick up. The problem is that Avahi crashes while it is parsing any .service file which contains a <subtype> option. This can easily be reproduced if you upload a .service file created using the above script (or any other example you find with google) and run Avahi from a terminal as root.
The .service file for my Brother wifi printer is:

<?xml version="1.0" ?><!DOCTYPE service-group SYSTEM 'avahi-service.dtd'>
<service-group>
  <name replace-wildcards="yes">AirPrint Brother_DCP-1610W @ %h</name>
  <service>
    <type>_ipp._tcp</type>
    <subtype>_universal._sub._ipp._tcp</subtype>
    <port>631</port>
    <txt-record>txtvers=1</txt-record>
    <txt-record>qtotal=1</txt-record>
    <txt-record>Transparent=T</txt-record>
    <txt-record>URF=none</txt-record>
    <txt-record>rp=printers/Brother_DCP-1610W</txt-record>
    <txt-record>note=Brother DCP-1610W</txt-record>
    <txt-record>product=(GPL Ghostscript)</txt-record>
    <txt-record>printer-state=3</txt-record>
    <txt-record>printer-type=0x80b004</txt-record>
    <txt-record>pdl=application/octet-stream,application/pdf,application/postscript,image/gif,image/jpeg,image/png,image/tiff,image/urf,text/html,text/plain,application/vnd.adobe-reader-postscript,application/vnd.cups-command,application/vnd.cups-pdf</txt-record>
  </service>
</service-group>

(Different printers will obviously have different output from the script... but the key point of note here is the inclusion of the <subtype> directive.)

If the <subtype> directive is commented out, Avahi on the ReadyNAS will parse and load the file without issue (but the printer is not picked up by iOS due to the missing directive):

Found user 'avahi' (UID 84) and group 'avahi' (GID 84).
Successfully dropped root privileges.
avahi-daemon 0.6.31 starting up.
Successfully called chroot().
Successfully dropped remaining capabilities.
Loading service file /services/AirPrint-Brother_DCP-1610W.service.
Loading service file /services/frontview.service.
Loading service file /services/nut.service.
Joining mDNS multicast group on interface bond0.IPv4 with address 192.168.67.248.
New relevant interface bond0.IPv4 for mDNS.
Joining LLMNR multicast group on interface bond0.IPv4 with address 192.168.67.248.
New relevant interface bond0.IPv4 for LLMNR.
Network interface enumeration completed.
Registering new mDNS address record for 192.168.67.248 on bond0.IPv4.
Registering new LLMNR address record for 192.168.67.248 on bond0.IPv4.
Server startup complete. Host name is NAS.local. Local service cookie is 2760298420.
All host RR's have been announced/verified : SERVER RUNNING
Service "NAS" (/services/nut.service) successfully established.
Service "ReadyNAS Administration on NAS" (/services/frontview.service) successfully established.
Service "AirPrint Brother_DCP-1610W @ NAS" (/services/AirPrint-Brother_DCP-1610W.service) successfully established.

However, with the <subtype> option present, Avahi crashes at startup:

Found user 'avahi' (UID 84) and group 'avahi' (GID 84).
Successfully dropped root privileges.
avahi-daemon 0.6.31 starting up.
Successfully called chroot().
Successfully dropped remaining capabilities.
Loading service file /services/AirPrint-Brother_DCP-1610W.service.
Loading service file /services/frontview.service.
Loading service file /services/nut.service.
Joining mDNS multicast group on interface bond0.IPv4 with address 192.168.67.248.
New relevant interface bond0.IPv4 for mDNS.
Joining LLMNR multicast group on interface bond0.IPv4 with address 192.168.67.248.
New relevant interface bond0.IPv4 for LLMNR.
Network interface enumeration completed.
Registering new mDNS address record for 192.168.67.248 on bond0.IPv4.
Registering new LLMNR address record for 192.168.67.248 on bond0.IPv4.
Server startup complete. Host name is NAS.local. Local service cookie is 1803852026.
avahi-daemon: running [NAS.local]: entry.c:1080: avahi_server_add_service_subtype: Assertion `flags & AVAHI_PUBLISH_USE_MULTICAST || flags & AVAHI_PUBLISH_USE_WIDE_AREA' failed.

I believe this is an issue related to the version of Avahi present on the ReadyNAS. Is there any chance of Netgear upgrading the version to one which doesn't contain this bug, or at the very least patching whatever causes Avahi to crash at startup?
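For anyone wanting to check whether one of their own .service files carries the directive that triggers the assertion, the following Python sketch lists any <subtype> entries. It is an illustration under stated assumptions, not part of the original post: the function name `find_subtypes` is made up, `SAMPLE` is a trimmed copy of the service file above, and only the Python standard library is used. It inspects the XML only; it does not talk to Avahi or reproduce the crash.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the service file from the post (illustration only).
SAMPLE = """<?xml version="1.0" ?><!DOCTYPE service-group SYSTEM 'avahi-service.dtd'>
<service-group>
  <name replace-wildcards="yes">AirPrint Brother_DCP-1610W @ %h</name>
  <service>
    <type>_ipp._tcp</type>
    <subtype>_universal._sub._ipp._tcp</subtype>
    <port>631</port>
  </service>
</service-group>"""


def find_subtypes(service_xml):
    """Return the text of every <subtype> element in an Avahi service file."""
    root = ET.fromstring(service_xml)
    # XML comments are dropped by the parser, so a commented-out
    # <subtype> line is correctly reported as absent.
    return [el.text.strip() for el in root.iter("subtype") if el.text]


print(find_subtypes(SAMPLE))
```

An empty list means the file should load on the affected Avahi build according to the behaviour described above, although iOS will then not see the printer.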
I'd appreciate it if anyone else could test their system to see if they get a similar problem with Avahi - just to make sure I'm not doing something completely brain-dead :) If we can get this crashing issue resolved, I'm pretty sure we can get AirPrint working again in ReadyNASOS6. Cheers.
5.9K Views, 0 likes, 6 Comments
ReadyNAS 204 OS 6.4.0 "Please remove inactive volumes", case #25950625
Hi, after some stability issues with my ReadyNAS 204 in the past on different OS 6 releases, it has now finally crashed some weeks after the installation of 6.4.0 with the above error message. I've opened the above case; the response from tech support is that I had a power failure and should order data recovery services. I do have a backup, which is quite a bit outdated. However, I do not understand why I should pay for data recovery services, as (to me) it seems there are many bugs in 6.4.0 which might have caused this issue. At the least I'd expect Netgear to recover my data free of charge; if that fails, they could refer me to my backup responsibility. Is anyone from Netgear able to help me on this? Thx, Max
Solved. 3.3K Views, 0 likes, 3 Comments
Creating / Deleting large file crashes ReadyNAS
In VMWare, I was migrating a large VM to an NFS share on my ReadyNAS RN2120. It got to a certain point and then the ReadyNAS crashed. After rebooting it, I tried to delete the directory which contains a 0.98 TB vmdk file and it crashed again. Every time I try to delete this file, it crashes the ReadyNAS and the unit must be hard booted. I've tried deleting it via Windows, in the ReadyNAS Web UI and via SSH with the same results (crash). There seems to be no way to remove this file without crashing the ReadyNAS. This is a very serious issue and seems to be the same problem this fellow is having: https://community.netgear.com/t5/Using-your-ReadyNAS/Can-t-delete-a-large-file/m-p/932043#M70966 There is obviously some kind of bug in this ReadyNAS OS (firmware version 6.2.4) which is causing this. I really need a solution that doesn't require a complete factory reset / erasure of the unit.
Solved. 5K Views, 0 likes, 8 Comments
RN102 - Please remove inactive volumes in order to use the disk - data loss
ReadyNAS 102, 1 x 1TB disk (in 2nd drive position), owned 10 days, using 6.2.5. I copied maybe 25 data DVDs to the drive (1 volume, across about 4 shares). I lost network connection to the NAS, couldn't get the NAS to respond to the power switch, and eventually unplugged the power from the back. After a laboured restart and reconnecting, I note ALL my disk data has gone. No matter what I do with the power, I shouldn't be able to have this catastrophic failure. If I go to Volumes, it shows the volume with a red dot in the top left corner, with Data and Free both at 0, and also a balloon: "Please remove inactive volumes in order to use the disk. Disk #2." If I go to Shares it says: "No volume or USB Disks." and "It is recommended to create a volume before configuring others ...." If I had 2 drives (copies of each other), would I still have lost all this data? It looks like disk reliability is not my primary concern, but the enclosure. I've never had a PC lose all its drive data. It doesn't look like the drive has crashed, as the s/w is inviting me to use it. It looks like the O/S metadata managing the drive contents has crashed. Is this indicative of issues I may have ongoing?
Solved. 5.2K Views, 0 likes, 4 Comments