A good way to check whether a network share such as NFS or CIFS is still available is to monitor an existing file on the share itself.
Doing this with SMB/CIFS is a little easier than with NFS when the NFS share is hard mounted.
With a hard mount, the check waits forever if the NFS server is unavailable and never returns an error. Your checks can also pile up in the process list.
Here is an example that makes it work by using /usr/bin/timeout:
timeout will run stat to check for the file, but kill the stat command if it does not return within a specified time period.
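To see how the exit status separates the three cases, here is a minimal sketch. It assumes GNU coreutils timeout, which reports status 124 when it had to stop the command. The probe function is a hypothetical helper, and sleep only stands in for a stat call that hangs on a dead share.

```shell
#!/bin/bash
# probe is a hypothetical helper (not part of the original setup): it runs
# a command under a 1-second limit and prints the exit status that
# timeout reports.
probe() {
    timeout 1 "$@" > /dev/null 2>&1
    echo "$?"
}

probe sleep 5                 # stand-in for a hanging stat: timeout reports 124
probe stat -t /               # path exists: 0
probe stat -t /no/such/file   # path missing: stat's own non-zero status
```

A status of 124 therefore means "the share did not answer in time", while any other non-zero status means stat answered but could not find the file.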
For demonstration purposes I use monit, but this can be done with any other monitoring solution such as OpenNMS, for example by executing the check through net-snmp's extend feature.
apt-get install monit
Since monit 5.7, "check program" supports arguments.
/etc/monit/conf.d/cifs_nfs
check program CIFS with path "/usr/bin/timeout 1 /usr/bin/stat -t /media/cifs/test.txt"
    if status != 0 then alert

check program NFS with path "/usr/bin/timeout 1 /usr/bin/stat -t /media/nfs/test.txt"
    if status != 0 then alert
With older versions of monit you have to use a wrapper script for the check.
mkdir /etc/monit/check_scripts/
/etc/monit/conf.d/cifs_nfs
check program CIFS with path "/etc/monit/check_scripts/check_stale_cifs.sh"
    if status != 0 then alert

check program NFS with path "/etc/monit/check_scripts/check_stale_nfs.sh"
    if status != 0 then alert
/etc/monit/check_scripts/check_stale_cifs.sh
#!/bin/bash

CHECK_FILE="/media/cifs/test.txt"
TIMEOUT=1
BIN_TIMEOUT=/usr/bin/timeout
BIN_STAT=/usr/bin/stat

"$BIN_TIMEOUT" "$TIMEOUT" "$BIN_STAT" -t "$CHECK_FILE" > /dev/null 2> /dev/null
RETVAL=$?

[ $RETVAL -eq 0 ] && echo "Ok. Found $CHECK_FILE" && exit $RETVAL
[ $RETVAL -eq 124 ] && echo "Timed out checking for $CHECK_FILE" >&2 && exit $RETVAL
[ $RETVAL -ne 0 ] && echo "Could not find $CHECK_FILE" >&2 && exit $RETVAL
/etc/monit/check_scripts/check_stale_nfs.sh
#!/bin/bash

CHECK_FILE="/media/nfs/test.txt"
TIMEOUT=1
BIN_TIMEOUT=/usr/bin/timeout
BIN_STAT=/usr/bin/stat

"$BIN_TIMEOUT" "$TIMEOUT" "$BIN_STAT" -t "$CHECK_FILE" > /dev/null 2> /dev/null
RETVAL=$?

[ $RETVAL -eq 0 ] && echo "Ok. Found $CHECK_FILE" && exit $RETVAL
[ $RETVAL -eq 124 ] && echo "Timed out checking for $CHECK_FILE" >&2 && exit $RETVAL
[ $RETVAL -ne 0 ] && echo "Could not find $CHECK_FILE" >&2 && exit $RETVAL
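The two wrapper scripts differ only in the path they check. If your monitoring tool can pass arguments to the check, a single script could cover both shares. The following is only a sketch under that assumption: check_share and its optional time limit argument are hypothetical names, not part of the setup above, and it relies on timeout and stat being found in PATH.

```shell
#!/bin/bash
# check_share is a hypothetical function: the file to check is the first
# argument, the time limit in seconds is the optional second (default 1).
check_share() {
    local check_file="$1"
    local limit="${2:-1}"
    local retval

    timeout "$limit" stat -t "$check_file" > /dev/null 2>&1
    retval=$?

    case "$retval" in
        0)   echo "Ok. Found $check_file" ;;
        124) echo "Timed out checking for $check_file" >&2 ;;
        *)   echo "Could not find $check_file" >&2 ;;
    esac
    return "$retval"
}

# Run the check only when a path was passed on the command line.
if [ "$#" -ge 1 ]; then
    check_share "$@"
    exit $?
fi
```

Saved under a name of your choice, it would be called with the path as its argument, for example /media/nfs/test.txt, and it returns the same exit status as the scripts above: 0 when the file is found, 124 on a timeout, and stat's own status otherwise.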