Forum Discussion
Sandshark
Dec 24, 2020 - Sensei - Experienced User
Script for making remote NAS diagnostics available locally
StephenB suggested starting this. Obviously, it's for advanced users with at least some Linux experience. I have one main NAS, two local backup NAS, and a remote backup NAS. Since the backup de...
StephenB
Dec 25, 2020 - Guru - Experienced User
I haven't installed HDSentinel, so I'll need to take a look at that.
You can get quite a bit of info with smartctl (including stuff that isn't in the log zip), and I was thinking about using that.
This gives you a lot of info on the installed disks:
for i in a b c d e f g h i j k l m n; do smartctl -a -x -l defects /dev/sd${i} | egrep -v "local build|No such device|smartmontools"; done >>smart.log
Though if you want to be more selective on the smart stats you can also do something like
for i in a b c d e f g h i j k l m n; do smartctl -a /dev/sd${i} | egrep -i "Device Model|Serial Number|Reallocated_sec|ATA Er|offline_uncorrect|current_pending_sector|Power_on"; done >>smart.log
tailoring the egrep string to include the specific parameters you want to track. mdgm suggested this version to me a few years ago now (it is also handy in tech support mode).
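If the installed drives don't always map to the same letters, the hardcoded a..n list can be replaced by globbing /dev directly. A sketch (assumes OS-6 style /dev/sdX naming; the `SMARTCTL` fallback is only there so the sketch runs cleanly on a box without smartctl installed):

```shell
#!/bin/sh
# Enumerate whatever whole-disk sd* devices actually exist instead of
# hardcoding the letters a..n.  Falls back to a no-op command where
# smartctl is not installed, so this sketch runs anywhere.
SMARTCTL=$(command -v smartctl || echo true)
for dev in /dev/sd[a-z]; do
    [ -b "$dev" ] || continue   # skip the literal glob when nothing matches
    $SMARTCTL -a -x -l defects "$dev" | egrep -v "local build|No such device|smartmontools"
done >>/tmp/smart-demo.log
```

The redirection on the `done` still creates the log file even when no disks match, so the output file is always there for the collector to pick up.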
- Sandshark, Dec 25, 2020 - Sensei - Experienced User
StephenB wrote: I haven't installed HDSentinel, so I'll need to take a look at that.
I find it to be quite helpful. While I'm sure you can get the same information from smartctl (and more, though it does not show a completely missing drive the way your command does), it has a very nice presentation. It's designed to be monitored by a PC running a (paid) version, but it works stand-alone just fine. The line with HDSentinel -solid gives a very brief overview, and then HDSentinel -r -html gives a much more detailed report. It's especially useful to me on my main 12-drive NAS and 24-drive (not yet full) external chassis, especially since the external has SAS drives in it. But I still recommend you take a look.
HDSentinel does have a shortcoming in that it only reports use on one partition. That turns out to be the system partition on a ReadyNAS. And while it would be nice to have a report on all partitions, a report on the one the GUI does not report, and which can cause a catastrophe if it fills, is nice.
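As a lightweight cross-check on that system partition, df can report its usage directly. A sketch (the 80% threshold is an arbitrary example, not something HDSentinel uses):

```shell
#!/bin/sh
# Report how full the root (system) partition is and warn past a threshold.
# The GUI does not surface this partition, so a daily check is cheap insurance.
Threshold=80    # arbitrary example value
Used=$(df -P / | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
echo "system partition ${Used}% full"
if [ "$Used" -ge "$Threshold" ]; then
    echo "WARNING: system partition is filling up"
fi
```

The -P flag keeps df's output on one line per filesystem, so the awk field positions are stable.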
- StephenB, Dec 25, 2020 - Guru - Experienced User
I downloaded it, and have it on my main NAS and one of the backup ones. It does look useful.
I've developed a starting point for my own script. That script will use rnutil to make the system log on OS 6 systems, otherwise it will zip /var/logs. It will then run smartctl, and will run HDSentinel if it is present. The logs are stored on a local NAS share, and if the NAS is a backup it will also rsync to the main NAS. It should also be possible to combine this with rsync backup jobs (backing up the main share to the local log share).
I still need to test it on one of my 4.1.16 systems, and also I still need to test retention. Though I'm thinking I won't need retention in the script, I can simply delete old files from time to time on the main NAS, and the rsync backups will propagate the deletions to the backups.
- StephenB, Jan 01, 2021 - Guru - Experienced User
I've been working on this over the holiday break, and I have something reasonable that so far is working ok.
Overall, the goal is to capture daily logs from both the main and various backup NAS, and consolidate those in a Logs share on the main NAS. The overall organization is to create a $(hostname) folder for each NAS in the share. Within that there is a folder for each year (e.g., 2021), and each month (2021-01, 2021-02, etc). When run on the main NAS, the script will write the logs directly to the consolidated log share. On the backup NAS, it writes the logs to a local share (LocalLogs), and then rsyncs that to the main NAS. The idea there is to allow me to back up the consolidated log share w/o having any contention (due to the backup jobs running at the same time as the script).
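The hostname/year/year-month layout can be reproduced with a couple of date calls. A sketch using /tmp in place of the real share (mkdir -p builds the whole chain in one call; the actual script builds it stepwise for portability to the older shell on 4.1.x):

```shell
#!/bin/sh
# Build Logs/<hostname>/<year>/<year-month> under /tmp to illustrate the
# consolidated-log layout (the real script targets the Logs share instead).
LogShare=/tmp/Logs-demo
LogFolder=$LogShare/$(hostname)/$(date +%Y)/$(date +%Y-%m)
mkdir -p "$LogFolder"
echo "$LogFolder"
```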
The script applies retention of 7 days to LocalLogs, but does not apply retention to the consolidated log. The idea there is that I don't want to have to manually log into the backup NAS regularly to clean LocalLogs (they are on a power schedule, so that is inconvenient). But it is ok for me to manually prune the consolidated logs. I might do something along the lines of the snapshot thinning used in Smart Snapshots later on - not sure.
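The 7-day LocalLogs retention boils down to two find passes: one for files past the cutoff, one for directories the cleanup left empty. A self-contained sketch against a throwaway /tmp tree (touch -d is the GNU coreutils way to back-date a file for the demonstration):

```shell
#!/bin/sh
# Apply 7-day retention to a demo tree: remove files older than $Retention
# days, then delete any directories the cleanup left empty.
Retention=7
LogDir=/tmp/retention-demo
rm -rf "$LogDir"
mkdir -p "$LogDir/old" "$LogDir/new"
touch -d '10 days ago' "$LogDir/old/stale.log"   # simulated old log
touch "$LogDir/new/fresh.log"                    # simulated current log
find "$LogDir"/* -mtime +$Retention -exec rm {} \;
find "$LogDir" -type d -empty -delete
ls "$LogDir"
```

Note that -mtime +7 matches files whose age rounds to more than 7 whole days, so a log written yesterday is always safe.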
The system log naming convention is a bit different from the normal ReadyNAS names - I changed it in order to make it sort better (putting the HDSentinel log, the SMART log, and the system log for a given date together).
I designed the script to run both on my legacy 4.1.x NAS and OS-6 - and deliberately used old-school syntax to limit any compatibility issues with the older Linux on 4.1.x. Probably over-did that. OS-6 is detected by looking for rnutil. Main vs backup is detected by looking at the IP address (which is 10.0.0.15 on my main NAS). All my OS-6 NAS have a data volume (even the one running JBOD Flex-RAID), and of course all legacy NAS have a C volume. So I use those volume names.
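Those two environment checks can be seen in isolation below. On any machine other than an OS-6 main NAS at 10.0.0.15 (including an ordinary Linux box), this sketch reports "legacy backup":

```shell
#!/bin/sh
# The two environment checks the script relies on, in isolation:
# OS generation (presence of rnutil) and role (IP address compare).
MainNasIP=10.0.0.15
NasIP=$(hostname -i 2>/dev/null | awk '{print $NF}')
if [ -e /usr/bin/rnutil ]; then OsGen=OS6; else OsGen=legacy; fi
if [ "$NasIP" = "$MainNasIP" ]; then Role=main; else Role=backup; fi
echo "$OsGen $Role"
```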
The script itself is:

#!/bin/sh
#
# set up some useful variables
#
MainNasIP=10.0.0.15
NasIP=`exec hostname -i | awk -F " " '{print $NF}'`
RemoteShareName=Logs
test "$MainNasIP" != "$NasIP" \
    && { ShareName=LocalLogs; Retention=7; } \
    || ShareName=Logs
test -e /usr/bin/rnutil && LogShare=/data/$ShareName || LogShare=/c/$ShareName
LogFolder=$LogShare/$(hostname)
HDSentinel=/apps/HDSentinel/HDSentinel
timestamp="$(date +%Y%m%d_%H%M%S)"
RsyncFilter="--include=$(date +%Y)/ --include=$(date +%Y-%m)/*** --exclude=*"
test -e /usr/bin/rnutil && RshParm=-rsh=rsh
#
# make output folder if not there
#
test -d $LogFolder || mkdir $LogFolder
#
# Save Logs in /Logs/hostname/year/year-month
# build the longer folder name in two steps, so mkdir works
#
LogFolder=$LogFolder/$(date +%Y)
test -d $LogFolder || mkdir $LogFolder
LogFolder=$LogFolder/$(date +%Y-%m)
test -d $LogFolder || mkdir $LogFolder
#
# get system logs with rnutil on OS-6, otherwise zip /var/logs
# rnutil will create an empty file named "1" in its folder, which is
# harmless. But let's delete it anyway
# get smartctl data (somewhat different command for OS-6 than OS-4)
#
test -e /usr/bin/rnutil \
    && { rnutil create_system_log -o $LogFolder/$(hostname)-$timestamp-System.zip; \
         rm ./1; \
         for i in a b c d e f g h i j k l m n; do smartctl -a -x -l defects /dev/sd${i} | egrep -v "local build|No such device|smartmontools"; done >>$LogFolder/$(hostname)-$timestamp-Smart.log; } \
    || { /apps/Scripts/diag >/tmp/diagnostics.log; \
         /apps/Scripts/90_CreateLogs; \
         zip -r -j $LogFolder/$(hostname)-$timestamp-System.zip /ramfs/log_zip/*; \
         test -d /ramfs/log_zip && rm -rf /ramfs/log_zip; \
         test -e /tmp/diagnostics.log && rm /tmp/diagnostics.log; \
         for i in a b c d e f g h i j k l m n; do smartctl -a -x /dev/hd${i} | egrep -v "local build|No such device|smartmontools"; done >>$LogFolder/$(hostname)-$timestamp-Smart.log; }
#
# log HDSentinel info if present
#
test -e $HDSentinel && $HDSentinel -r $LogFolder/$(hostname)-$timestamp-HDSentinel
#
# Apply retention limits if variable set
#
test "$Retention" != "" && find $LogShare/$(hostname)/* -mtime +$Retention -exec rm {} \;
test "$Retention" != "" && find $LogShare/$(hostname) -type d -empty -delete
#
# rsync logs to the main NAS if this is a backup NAS
# this requires that rsync be enabled as read-write on the destination share.
# retention is not being applied to the destination share
#
test "$MainNasIP" != "$NasIP" && rsync $RshParm -amv $RsyncFilter $LogShare/$(hostname)/* $MainNasIP::$RemoteShareName/$(hostname)
exit 0

On OS-6 I chose to run this as a service, and not in a cron job. To do this, you need to put a service and a timer specification into /etc/systemd/system. The files I am using are below.
update_logs.service:
[Unit]
Description=Capture Logs Service
After=network-online.target multi-user.target

[Service]
Type=oneshot
RemainAfterExit=no
ExecStart=/apps/Scripts/update_logs

[Install]
WantedBy=multi-user.target
update_logs.timer:
[Unit]
Description=Capture Logs Service

[Timer]
OnCalendar=*-*-* 00:04:00
Persistent=true
Unit=update_logs.service

[Install]
WantedBy=multi-user.target
The services are set up by entering
systemctl enable update_logs
systemctl start update_logs
systemctl enable update_logs.timer
systemctl start update_logs.timer
The Persistent setting on the timer is supposed to detect that the service wasn't run because the NAS was off, and run it at the next boot. I haven't tested that.
Note that the exit 0 at the end of the script is intentional. If the final test is false, then the script returns an error status. Also, if the rsync fails because the main NAS is down, then the script would also return an error. There are apparently scenarios when systemd will stop running services that repeatedly fail. I don't know for sure if that can happen with a one-shot service, but it seemed best to avoid it.
I'll describe how I am building the system log for the legacy NAS in the next post.