Linuxos4all: 2011

Monday, November 7, 2011

VBScript to find Fileage and send an e-mail

'On Error Resume Next

Const ForReading = 1
today = now()

Set objMessage = CreateObject("CDO.Message")
objMessage.From = "babu.dhinakaran@jda.com"
objMessage.To = "babu.dhinakaran@jda.com"

Set objFSO = CreateObject("Scripting.FileSystemObject")

Set objTextFile = objFSO.OpenTextFile("c:\filelist.txt", ForReading)

Do Until objTextFile.AtEndOfStream

folderspec = objTextFile.Readline
'Wscript.Echo "File in the Folder:" & folderspec
objMessage.Subject = folderspec
Agecount = 0

Set folder = objFSO.GetFolder(folderspec)
Set fc = folder.Files

For Each f1 in fc
file = f1.name
filespec = folderspec & file

Set file = objFSO.GetFile(filespec)

ShowDateCreated = file.DateCreated
'Wscript.Echo "FileName: " & filespec
'Wscript.Echo "ShowDateCreated:" & ShowDateCreated
difference = DateDiff("n",ShowDateCreated,today)

If difference > 30 Then
Agecount = Agecount + 1
End If
'Wscript.Echo "File Age in minutes is:" & difference
Next

'Wscript.Echo "Total file older than 30 minutes at: " & folderspec & " are:" & Agecount

'mailbody = "Total file older than 30 minutes at: " & folderspec & " are:" & Agecount
'objMessage.TextBody = mailbody

'Wscript.Echo "-----------------------------------------------"

objMessage.Configuration.Fields.Item _
("http://schemas.microsoft.com/cdo/configuration/sendusing") = 2
objMessage.Configuration.Fields.Item _
("http://schemas.microsoft.com/cdo/configuration/smtpserver") = "indiamailrelay.jda.corp.local"
objMessage.Configuration.Fields.Item _
("http://schemas.microsoft.com/cdo/configuration/smtpserverport") = 25
objMessage.Configuration.Fields.Update

If Agecount > 0 Then

mailbody = "Total file older than 30 minutes at: " & folderspec & " are:" & Agecount
objMessage.TextBody = mailbody
objMessage.Send
'Wscript.Echo "Total file older than 30 minutes at: " & folderspec & " are:" & Agecount

End If

Loop

------------------------------------------------------------------------------------

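The script above checks each listed folder for files created more than 30 minutes ago and sends an e-mail if it finds any.

It reads the folder paths to check from a text file (c:\filelist.txt), one path per line,
e.g. c:\temp\
Note: Give the full path with the trailing backslash, as shown above. The script builds each file path by appending the file name directly to the folder path, so c:\temp (without the trailing backslash) will generate an error.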

Friday, September 16, 2011

diskinfo from the Domain using LDAP to query AD for Servers

on error resume next

' Determine DNS domain name from RootDSE object.
Set objRootDSE = GetObject("LDAP://RootDSE")
strDNSDomain = objRootDSE.Get("defaultNamingContext")
'wscript.echo "defaultNamingContext (Domain Name System)" & strDNSDomain
' Use ADO to search Active Directory for all computers.
Set adoCommand = CreateObject("ADODB.Command")
Set adoConnection = CreateObject("ADODB.Connection")
adoConnection.Provider = "ADsDSOObject"
adoConnection.Open "ADs Provider"
adoCommand.ActiveConnection = adoConnection

' Search entire domain.
strBase = "<LDAP://" & strDNSDomain & ">"

' Filter on computer objects with server operating system.
strFilter = "(&(objectCategory=computer)(operatingSystem=*server*))"

' Comma delimited list of attribute values to retrieve.
strAttributes = "cn"

' Construct the LDAP syntax query.
strQuery = strBase & ";" & strFilter & ";" & strAttributes & ";subtree"

adoCommand.CommandText = strQuery
'adoCommand.Properties("Page Size") = 100
'adoCommand.Properties("Timeout") = 30
'adoCommand.Properties("Cache Results") = False

Set adoRecordset = adoCommand.Execute
strComputerDN = Array()
'arrSortOut = Array()
' cnsd = Array()
Dim cnsdstring
counter = 0
'step = 0
count = 0

' Enumerate computer objects with server operating systems.
Do Until adoRecordset.EOF
ReDim Preserve strComputerDN(counter)
strComputerDN(counter) = adoRecordset.Fields("cn").value
'Wscript.Echo strComputerDN(counter)
adoRecordset.MoveNext
counter = counter + 1
Loop

'wscript.echo "defaultNamingContext" & strDNSDomain
'wscript.echo counter

'Clean up.
adoRecordset.Close
adoConnection.Close

Wscript.Echo "Disk Info"

for each strComputer in strComputerDN
'wscript.echo strComputer
cnsdstring = ""
Set objWMIService = GetObject("winmgmts:" _
& "{impersonationLevel=impersonate}!\\" _
& strComputer & "\root\cimv2")
Set colDisks = objWMIService.ExecQuery _
("Select * from Win32_LogicalDisk where DriveType=3")
For Each objDisk in colDisks
ReDim Preserve cnsd(count)
cnsd(count) = objDisk.DeviceID
cnsdstring = cnsdstring &"," &cnsd(count)
count=count+1
'wscript.echo cnsd(count)
'wscript.echo objDisk.DeviceID
next
wscript.echo "" &strComputer &","&cnsdstring

count = 0

Next

Tuesday, September 6, 2011

Linux using DBD::Sybase to access MSSQL

Recently, I made yet another attempt to get Perl to access Microsoft SQL Server using DBD. Usually, when I want to connect to a Microsoft SQL Server, it is from Perl on Windows. So I take the easy route and use DBD::ODBC and use an ODBC connection. This time though, I wanted to connect to Microsoft SQL Server 2000 from a Linux box. Having no ODBC to fall back on, I looked for native DBD driver of some sort.
It took me several hours of struggling to make it work. I almost gave up several times, so I am writing this outline to help anyone else trying to accomplish the same task.
In the end, we will use the DBD::Sybase perl module from CPAN to access the Microsoft SQL Server. Before we can do that however, we must first compile the freetds library.
Note: From now on I will refer to Microsoft SQL Server as SQL Server. Please do not confuse this with a generic sql server. We can all now pause to gripe about the lack of imagination in product naming at Microsoft.
Compiling Freetds
Download and compile freetds from http://www.freetds.org/. Once you unzip and untar it, enter the directory and run:
./configure --prefix=/usr/local/freetds --with-tdsver=7.0
make
make install
Configuring Freetds
Now we have freetds compiled, but we still have to configure it. This is the part that threw me off and is so different from other DBD drivers. The DBD::Sybase driver will ultimately be affected by the contents of the /usr/local/freetds/etc/freetds.conf file. If that file is not configured correctly, your DBD::Sybase connection will fail.
Okay, now that we have established there is a relationship between the freetds.conf file and the DBD::Sybase module, let's edit the freetds.conf file.
The strategic modifications I made to the freetds.conf file were:
1) uncomment the following lines and modify if necessary:
try server login = yes
try domain login = no
Note: this forces the module to attempt a database login instead of a domain login. I could not get domain login to work, though I will admit I did not try very hard.
2) uncomment the following line and modify if necessary:
tds version = 7.0
This supposedly sets the default tds version to establish a connection with. I have only SQL Server 2000 servers, and they won't talk at any lower version. So I set it to 7.0. If for some reason you had older SQL Servers, you might leave it at the default 4.2.
3) create a server entry for my server sql1:
[sql1]
host = sql1
port = 1433
tds version = 8.0

Note: My server here is sql1. Ping sql1 worked, so I am sure I can resolve it using DNS. You can also specify an IP address instead of the host name. The sql1 in the brackets is just a descriptor. It could be 'superduperserver' and it would still work as long as my 'host =' is set correctly. I tried 'tds version 7.0' for my SQL Server 2000 and it worked. Version 5.0 though resulted in an error. You might want to verify your SQL Server is listening on port 1433 with a 'netstat -a -n' run from the command line on the SQL Server.
At this point you can verify your configuration.
/usr/local/freetds/bin/tsql -S sql1 -U sqluser
You will then be prompted for a password and if everything is well, you will see a '1)' waiting for you to enter a command. If you can't get the 1) using tsql, I doubt your DBD::Sybase perl code is going to work. Please note that sqluser is not an Active Directory/Windows Domain user, but an SQL Server user.
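If tsql cannot connect, it can help to confirm what freetds.conf actually contains for your server entry. The following is a minimal sketch and not part of the original write-up: show_freetds_section is a hypothetical helper name, and the default path simply assumes the --prefix used when compiling freetds above.

```shell
#!/bin/bash
# Print the settings found in one [section] of a freetds.conf-style file.
# Usage: show_freetds_section SECTION [FILE]
# Assumption: the default FILE matches --prefix=/usr/local/freetds.
show_freetds_section() {
    local section="$1"
    local conf="${2:-/usr/local/freetds/etc/freetds.conf}"
    awk -v want="[$section]" '
        /^[[:space:]]*\[/ {                  # a section header line
            line = $0
            gsub(/[[:space:]]/, "", line)    # ignore stray whitespace
            inside = (line == want)
            next
        }
        # a non-comment "key = value" line inside the wanted section
        inside && /=/ && !/^[[:space:]]*[;#]/ {
            sub(/^[[:space:]]+/, "")         # trim leading indentation
            print
        }
    ' "$conf"
}
```

Running show_freetds_section sql1 should list the host, port and tds version lines of the sql1 entry, which are the values tsql -S sql1 will use.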
Compiling DBD::Sybase
Now that we have the freetds library prerequisite for DBD::Sybase installed and configured, we can compile the DBD::Sybase perl module. Obtain it from http://www.cpan.org/ if you haven't already.
Once you have untarred it and are in the directory, run:
export SYBASE=/usr/local/freetds
perl Makefile.PL
make
make install
Note: The export line is to let the compilation process know where to find the freetds libraries.
Using DBD::Sybase
You are now ready to test your DBD::Sybase module.
#!/usr/bin/perl
use DBI;

$dsn = 'DBI:Sybase:server=sql1';
my $dbh = DBI->connect($dsn, "sqluser", 'password');
die "unable to connect to server $DBI::errstr" unless $dbh;

$dbh->do("use mydatabase");

$query = "SELECT * FROM MYTABLE";
$sth = $dbh->prepare($query) or die "prepare failed\n";
$sth->execute() or die "unable to execute query $query error $DBI::errstr";

$rows = $sth->rows;
print "$rows rows returned by query\n";

while ( @first = $sth->fetchrow_array ) {
    foreach $field (@first) {
        print "field: $field\n";
    }
}

Content extracted from: http://www.perlmonks.org/?node_id=392385

Monday, August 8, 2011

How to Reset Nagiosadmin Password?

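Log in as root and run the command below:

htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

If you are a sudo user, run the command below instead:

sudo htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

Note: the -c option creates the password file and overwrites it if it already exists, so any other users in htpasswd.users are lost. Leave out -c if the file holds other users you want to keep.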
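Because htpasswd -c rewrites the whole file, one cautious approach is to pass -c only when the file does not exist yet. The following is a minimal sketch under that assumption; htpasswd_create_flag and reset_htpasswd_user are hypothetical helper names, and the default path is the one used in the commands above.

```shell
#!/bin/bash
# Default path taken from the post; override with the first argument.
USERS_FILE_DEFAULT=/usr/local/nagios/etc/htpasswd.users

# Echo "-c" only when the password file is missing, because
# htpasswd -c creates the file and truncates an existing one.
htpasswd_create_flag() {
    if [ -f "$1" ]; then
        echo ""
    else
        echo "-c"
    fi
}

# Reset (or create) the password of one user, keeping other entries.
reset_htpasswd_user() {
    local file="${1:-$USERS_FILE_DEFAULT}"
    local user="${2:-nagiosadmin}"
    local flag
    flag=$(htpasswd_create_flag "$file")
    # $flag is left unquoted on purpose: an empty value adds no argument.
    htpasswd $flag "$file" "$user"
}
```

Calling reset_htpasswd_user with no arguments prompts for a new nagiosadmin password; run it as root or through sudo as in the original commands.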

Friday, April 8, 2011

how to pass arguments for check_db in nagios

File Download	Description
check_db	Shell Script wrapper
DbCheck.class	Compiled DbCheck Class
DbCheck.java	DbCheck java source code
JSAP-2.0a.jar	JSAP java command line parser

Remote Oracle database check using JDBC drivers. Supports custom SQL queries and regular expression matching. Provides similar functionality to the SiteScope DB monitor.

This plugin can check almost every aspect of oracle database, written in java for portability.

Compiled with JDK 1.5.0_06

Uses JSAP command line parser http://www.martiansoftware.com/jsap/

To compile: copy DbCheck.java and JSAP-2.0a.jar into a directory, then expand JSAP-2.0a.jar in the same directory using the command jar xvf JSAP-2.0a.jar

Use: javac -cp .:com/martiansoftware/jsap/JSAP.* DbCheck.java

This should generate the DbCheck.class file.

Setup Nagios Environment:

Copy DbCheck.class, check_db to /usr/local/nagios/libexec
create /usr/local/nagios/libexec/lib directory
Copy JSAP-2.0a.jar, classes12.jar (Take from $ORACLE_HOME/jdbc/lib), sql files to /usr/local/nagios/libexec/lib directory

Nagios plugin to check output of a sql query, matched with a regular expression. Currently supports numeric data checks only.
--help

Print help message

-H

Hostname/ip address of database server.

-p

Listener port, default is 1521. (default: 1521)

-s

Oracle database SID.

-l

Oracle database user to connect.

-x

Oracle database user password.

-f

Full path of sql file to be executed.

-r

PERL style Regular Expression to match in SQL output. Check value must
be enclosed between (), should match only one field and must be enclosed
between single quotes. Example 'SYSTEM.*,.*,(.*)'

-w

Warning value for matched expression. Normally warning < critical value.
If your check want to alert when result < warning , set critical less
than warning

-c

Critical value for matched expression. Should be more than the warning
value. If not, check behaviour is reversed; see warning help text above.

-L

Label for matched regex value i.e FieldName.
Usage: DbCheck --help -H -p -s -l -x -f -r -w -c -L
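The -w and -c help text describes two directions: normally warning is below critical and large values are bad, while setting critical below warning makes small values bad. The sketch below only illustrates that rule in shell; it is not the plugin's Java code. threshold_state is a hypothetical name, and treating the thresholds as inclusive integers is an assumption.

```shell
#!/bin/bash
# Map a numeric value to a Nagios state: 0 OK, 1 WARNING, 2 CRITICAL.
# Usage: threshold_state VALUE WARNING CRITICAL
# Assumption: integer values, thresholds are inclusive.
threshold_state() {
    local value="$1" warn="$2" crit="$3"
    if [ "$warn" -le "$crit" ]; then
        # Normal direction: larger values are worse.
        if   [ "$value" -ge "$crit" ]; then echo 2
        elif [ "$value" -ge "$warn" ]; then echo 1
        else echo 0
        fi
    else
        # Reversed direction (critical < warning): smaller values are worse.
        if   [ "$value" -le "$crit" ]; then echo 2
        elif [ "$value" -le "$warn" ]; then echo 1
        else echo 0
        fi
    fi
}
```

For a "percent used" style query you would call it as threshold_state 85 80 90, and for a "free space" style query, where low numbers are the problem, as threshold_state 15 20 10.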

CHECK_NRPE: Received 0 bytes from daemon

Verify the check_nrpe error message

Just for testing purposes, let us assume that you are executing the following check_nrpe command, which displays the “CHECK_NRPE: Received 0 bytes from daemon.” error message.

$ /usr/local/nagios/libexec/check_nrpe -H 192.168.1.20 -c check_disk -a 60 80 /dev/sdb1
CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.

If you view the /var/log/messages on the remote host, (in the above example, that is 192.168.1.20), you’ll see the nrpe error “Error: Request contained command arguments!” as shown below, indicating that check_nrpe is not enabled to take the command arguments.

$ tail -f /var/log/messages
Dec 5 11:11:52 dev-db xinetd[2536]: START: nrpe pid=24187 from=192.168.101.108
Dec 5 11:11:52 dev-db nrpe[24187]: Error: Request contained command arguments!
Dec 5 11:11:52 dev-db nrpe[24187]: Client request was invalid, bailing out...
Dec 5 11:11:52 dev-db xinetd[2536]: EXIT: nrpe status=0 pid=24187 duration=0(sec)

Enable check_nrpe command arguments
To enable command arguments in NRPE, you should do the following two things

1. Configure NRPE with --enable-command-args

Typically when you install NRPE on the remote host, you’ll do ./configure without any arguments. To enable support for command arguments in the NRPE daemon, you should install it with --enable-command-args as shown below.

[remotehost]# tar xvfz nrpe-2.12.tar.gz
[remotehost]# cd nrpe-2.12
[remotehost]# ./configure --enable-command-args
[remotehost]# make all
[remotehost]# make install-plugin
[remotehost]# make install-daemon
[remotehost]# make install-daemon-config
[remotehost]# make install-xinetd

2. Modify nrpe.cfg and set dont_blame_nrpe

Modify the /usr/local/nagios/etc/nrpe.cfg on the remote server and set the dont_blame_nrpe directive to 1 as shown below.
$ vi /usr/local/nagios/etc/nrpe.cfg
dont_blame_nrpe=1

Execute check_nrpe with command arguments

After the above two changes, if you execute the check_nrpe for this particular remote host, you’ll not see the error message anymore as shown below.

How to pass arguments to CHECK_NRPE?

$ /usr/local/nagios/libexec/check_nrpe -H 192.168.1.20 -c check_disk -a 60 80 /dev/sdb1
DISK OK - free space: / 111199 MB (92% inode=99%);| /=9319MB;101662;114370;0;127078

Security Warning

Enabling NRPE command line arguments is a security risk. If you don’t know what you are doing, don’t enable this.

Probably by now you’ve already figured out that you can’t blame NRPE if something goes wrong. After all you did set dont_blame_nrpe to 1.
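Before blaming the network for the "Received 0 bytes" message, it is quick to check on the remote host whether the directive is really active in the file. A minimal sketch follows; nrpe_args_enabled is a hypothetical helper name, the default path is the one used above, and tolerating spaces around the equals sign is only a leniency assumption of this sketch.

```shell
#!/bin/bash
# Succeed (exit status 0) when the given nrpe.cfg has an uncommented
# dont_blame_nrpe=1 line; fail otherwise.
# Usage: nrpe_args_enabled [CFGFILE]
nrpe_args_enabled() {
    local cfg="${1:-/usr/local/nagios/etc/nrpe.cfg}"
    grep -Eq '^[[:space:]]*dont_blame_nrpe[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$cfg"
}
```

For example: nrpe_args_enabled && echo "arguments enabled" || echo "arguments disabled". Remember that the daemon also has to be built with --enable-command-args, which this check cannot see.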

Monday, April 4, 2011

Traceroute shell script for linux/unix

Before you run the script below, save your server list in /tmp/servlist, one server per line. Put the script in crontab so that it runs on your schedule. It mails a traceroute whenever a node/server is down. Please comment if this script is useful for you.

#!/bin/bash
#Author : Ranjith Kumar R
#Purpose : To send the traceroute whenever server/node is down.
#Version : V1.0
#Date :April 05, 2011

Hostname=`/bin/hostname`
Trace="/bin/traceroute -I"

for i in `cat /tmp/servlist`
do
test=`fping $i | grep unreachable | awk '{print $1}'`
echo $test
if [ -z "$test" ]; then
sleep 1
else
$Trace $test | mail -s "$test is Down!!! Traceroute from $Hostname to $test" youremailid@domain.com &
fi
done
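The script calls fping once per server. As an alternative sketch, fping can be given the whole list at once and its output filtered for the hosts it reports as unreachable. unreachable_hosts is a hypothetical helper name, and the sketch assumes the "HOST is unreachable" output format that the grep in the script above already relies on.

```shell
#!/bin/bash
# Read fping output on stdin and print only the hosts that fping
# reported as unreachable, one per line.
unreachable_hosts() {
    awk '$2 == "is" && $3 == "unreachable" { print $1 }'
}
```

A possible use, with the same server list file as above: fping < /tmp/servlist 2>/dev/null | unreachable_hosts. Each printed host can then be passed to traceroute and mail exactly as in the loop above.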

Tuesday, March 22, 2011

Accessing Remote Server Attributes using WMI using VbScript

https://docs.google.com/leaf?id=0B7kjumexSzS1MTk4ZjcxNjktODg5Ny00ZjFjLTg4MWEtMjJhMmMyNjdlNjI0&hl=en

Sunday, February 27, 2011

How to delete files older than 24 hours in linux

For example, to delete such files from the /tmp/test directory (-mmin +1440 matches files last modified more than 1440 minutes, i.e. 24 hours, ago):

find /tmp/test -mmin +1440 -type f -exec rm -f {} \;
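Since rm -f gives no second chance, it is worth listing what find matches before deleting anything. The following is a small sketch of that habit; list_old_files and delete_old_files are hypothetical function names, and the 1440-minute default mirrors the command above.

```shell
#!/bin/bash
# List regular files under DIR last modified more than MINUTES minutes
# ago. Nothing is deleted.
# Usage: list_old_files DIR [MINUTES]
list_old_files() {
    local dir="$1" minutes="${2:-1440}"
    find "$dir" -type f -mmin +"$minutes" -print
}

# Delete the same set of files once the listing looks right.
# Usage: delete_old_files DIR [MINUTES]
delete_old_files() {
    local dir="$1" minutes="${2:-1440}"
    find "$dir" -type f -mmin +"$minutes" -exec rm -f {} \;
}
```

Run list_old_files /tmp/test first, check the output, and only then run delete_old_files /tmp/test.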

Test Remove Files

Thursday, February 17, 2011

How to reset mysql root password in linux

Step # 1 : Stop mysql service

# /etc/init.d/mysqld stop

Step # 2: Start the MySQL server without password checking:

# mysqld_safe --skip-grant-tables &

Step # 3: Connect to mysql server using mysql client:

# mysql -u root

Step # 4: Setup new MySQL root user password

mysql> use mysql;
mysql> update user set password=PASSWORD("NEW-ROOT-PASSWORD") where User='root';
mysql> flush privileges;
mysql> quit

Step # 5: Stop MySQL Server:

# /etc/init.d/mysqld stop

Step # 6: Start MySQL server and test it

# /etc/init.d/mysqld start
# mysql -u root -p

Saturday, February 12, 2011

Nagios Log Monitoring(line by line) Plugin for linux

#!/bin/bash
#Purpose : To monitor the log line by line
#Authors : Ranjith Kumar R
#Date : 29th March 2014
#Version : V2.0

PROGNAME=`/bin/basename $0`
PROGPATH=`echo $0 | sed -e 's,[\\/][^\\/][^\\/]*$,,'`
REVISION="V1.0"
ECHO="/bin/echo"
STATE_UNKNOWN=3
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
TAIL="/usr/bin/tail"
MAIL="/bin/mail"
PRINT="/usr/bin/printf"
DATE=`/bin/date`
CONTACTEMAIL="ranjith@test.com"
print_usage() {
echo "Usage: $PROGNAME -F LOGFILEPATH -q query -c critical count of string match"
echo "Usage: $PROGNAME --help"
echo "Usage: $PROGNAME --version"
}
print_help() {
print_revision $PROGNAME $REVISION
echo ""
print_usage
echo ""
echo "Log file pattern detector plugin for Nagios"
echo ""
support
}
# Make sure the correct number of command line
# arguments have been supplied
if [ $# -lt 6 ]; then
print_usage
exit $STATE_UNKNOWN
fi
# Grab the command line arguments
#LOGFILEPATH=$1
#query=$2
exitstatus=$STATE_WARNING #default
while test -n "$1"; do
case "$1" in
--help)
print_help
exit $STATE_OK
;;
-h)
print_help
exit $STATE_OK
;;
--version)
print_revision $PROGNAME $REVISION
exit $STATE_OK
;;
-V)
print_revision $PROGNAME $REVISION
exit $STATE_OK
;;
--filename)
LOGFILEPATH=$2
shift
;;
-F)
LOGFILEPATH=$2
shift
;;
--query)
query=$2
shift
;;
-q)
query=$2
shift
;;
--critical)
critical=$2
shift
;;
-c)
critical=$2
shift
;;
-x)
exitstatus=$2
shift
;;
--exitstatus)
exitstatus=$2
shift
;;
*)
echo "Unknown argument: $1"
print_usage
exit $STATE_UNKNOWN
;;
esac
shift
done
if [ -r $LOGFILEPATH ]; then

echo "$LOGFILEPATH has read permission" > /dev/null

else

echo "Nagios unable to read $LOGFILEPATH file, please check the file permission"

exitstatus=$STATE_CRITICAL
exit $exitstatus

fi

query1=`echo $LOGFILEPATH | awk -F"/" '{print $NF}'`.`echo $query | awk '{print $1}'`

if [ -f "/usr/local/nagios/libexec/lastline.$query1" ]; then
count=0
else
echo 0 > /usr/local/nagios/libexec/lastline.$query1
fi
COUNT=0
LA="/usr/local/nagios/libexec/lastline.$query1"
LASTLINE=`cat /usr/local/nagios/libexec/lastline.$query1`
NEWLINE=`cat $LOGFILEPATH | wc -l`
if [ "$NEWLINE" -lt "$LASTLINE" ];then
# Log is shorter than last time (probably rotated): start again from the first line
echo 0 > /usr/local/nagios/libexec/lastline.$query1
LASTLINE=0
fi
if [ "$NEWLINE" -gt "$LASTLINE" ];then
LINE=$(expr $NEWLINE - $LASTLINE)
echo $NEWLINE > $LA
COUNT=`$TAIL -$LINE $LOGFILEPATH | egrep -c "$query"`
MATCHLINE=`$TAIL -$LINE $LOGFILEPATH | egrep -i "$query"`
if [ "$COUNT" -ge "$critical" ];then
$ECHO -e "CRITICAL Matches per line for $query is $COUNT, please refer the below error log.\n$MATCHLINE\nLast Line is $LASTLINE and New Line is $NEWLINE ";echo '|' "count=$COUNT;;$critical"
$PRINT "%b" "***** CRITICAL *****\n\nNotification Type: CRITICAL\n\nCRITICAL Matches per line for $query is $COUNT, please refer the below error log.\n\n$MATCHLINE\n\nDate&Time: $DATE" | $MAIL -s "** CRITICAL Alert: $query **" $CONTACTEMAIL
exitstatus=$STATE_CRITICAL
exit $exitstatus
fi
fi
if [ "$NEWLINE" -eq "$LASTLINE" ] || [ "$COUNT" -lt "$critical" ];then
$ECHO "OK - $COUNT pattern matches found,Last Line is $LASTLINE and New Line is $NEWLINE";echo '|' "count=$COUNT;;$critical"
exitstatus=$STATE_OK
exit $exitstatus
else
$ECHO "UNKNOWN, Last Line is $LASTLINE and New Line is $NEWLINE";echo '|' "count=$COUNT;;$critical"
exitstatus=$STATE_UNKNOWN
exit $exitstatus
fi
--------------------------------------------------------------

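e.g. ./check_log -F /var/log/messages -q message -c 3
(-F: path of the log file, -q: string to look for, -c: number of matching lines that raises CRITICAL; the value 3 is only an illustration)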

--------------------------------------------------------------------

It raises a CRITICAL alert (and sends a mail) whenever the number of matching new lines reaches the -c value.
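The core idea of the plugin is to remember how many lines the log had on the previous run and to search only the lines added since then. Stripped of the Nagios output and mail handling, that technique can be sketched as below. count_new_matches is a hypothetical function name, and the state file location is passed in by the caller rather than fixed under /usr/local/nagios/libexec.

```shell
#!/bin/bash
# Count lines matching PATTERN among the lines appended to LOGFILE since
# the previous call. The last seen line count is kept in STATEFILE.
# Usage: count_new_matches LOGFILE STATEFILE PATTERN
count_new_matches() {
    local logfile="$1" statefile="$2" pattern="$3"
    local last=0 total new
    if [ -f "$statefile" ]; then
        last=$(cat "$statefile")
    fi
    total=$(wc -l < "$logfile")
    # A file shorter than last time suggests rotation: rescan from the top.
    if [ "$total" -lt "$last" ]; then
        last=0
    fi
    new=$((total - last))
    echo "$total" > "$statefile"
    if [ "$new" -gt 0 ]; then
        # grep -c exits non-zero on zero matches; that is not an error here.
        tail -n "$new" "$logfile" | grep -E -c -- "$pattern" || true
    else
        echo 0
    fi
}
```

The returned count can then be compared with the critical threshold to choose the exit status, as the full plugin does.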

Friday, February 4, 2011

Nagios check_webinject/web transaction plugin

check_webinject

check_webinject is a Nagios check plugin based on the Webinject Perl Module available on CPAN which is now part of the Webinject project. We use it heavily at ConSol and did a complete rework including some bugfixes and enhancements for Nagios 3.

Current Version is the 1.55 released on Dec 18, 2010.

How does it work?

The plugin is written in Perl and uses LWP together with Crypt::SSLeay. check_webinject sends requests to any configured webservice. You may then specify verification settings in your test cases.

A sample testcase file is included in the downloadable tarball.

<testcases repeat="1">
<case
   id = "1"
   description1 = "Sample Test Case"
   method = "get"
   url = "{BASEURL}/test.jsp"
   verifypositive = "All tests succeded"
   warning = "5"
   critical = "15"
   label = "testpage"
/>
</testcases>

A sample command line would look like this:

%>./check_webinject -s baseurl=http://yourwebserver.com:8080 testcase.xml 
WebInject OK - All tests passed successfully in 0.027 seconds|time=0.027;0;0;0;0 testpage=0.024;5;15;0;0

Add check_webinject like a normal nagios plugin.

Download

You can download check_webinject here.

The source is also available from GitHub.

Prebuilt version of the check_webinject nagios plugin

Installation

Just unpack the tarball and make sure the required Perl modules exist:

LWP
XML::Simple
HTTP::Request::Common
HTTP::Cookies
Crypt::SSLeay
XML::Parser
Error

Nagios check_logfile plugin for windows and unix/linux

Download

check_logfiles-3.4.3.tar.gz

check_logfiles.zip

On this page you will find examples for configuration files.

Example 1: Error messages from FCAL-Devices

Usage as nagios-plugin to monitor FCAL-devices on a Solaris system. This is a basic example which scans for patterns in /var/adm/messages.

@searches = (
  {
    tag => 'san',
    logfile => '/var/adm/messages',
    rotation => 'SOLARIS',
    criticalpatterns => [
        'Link Down Event received',
        'Loop OFFLINE',
        'fctl:.*disappeared from fabric',
        '.*Lun.*disappeared.*'
    ],
  });

Example 2: Again, but this time as passive service using send_nsca

Using the following configfile you can run check_logfiles as standalone-script. If error messages are found in the messages file, a summary notification is sent to the NSCA server at the end of the check_logfile run.

$scriptpath = '/usr/bin/nagios/libexec:/usr/local/nagios/contrib';
$MACROS = {
    NAGIOS_HOSTNAME => 'orschgeign.muc',
    CL_NSCA_HOST_ADDRESS => 'nagios1.muc',
    CL_NSCA_PORT => 5778
};
$postscript = 'send_nsca';
$postscriptparams = '-H $CL_NSCA_HOST_ADDRESS$ -p $CL_NSCA_PORT$
     -to $CL_NSCA_TO_SEC$ -c $CL_NSCA_CONFIG_FILE$';
$postscriptstdin = '$CL_HOSTNAME$\t$CL_SERVICEDESC$\t
    $CL_SERVICESTATEID$\t$CL_SERVICEOUTPUT$\n';
@searches = (
  {
    tag => 'san',
    logfile => '/var/adm/messages',
    criticalpatterns => [
        'Link Down Event received',
        'Loop OFFLINE',
        'fctl:.*disappeared from fabric',
        '.*Lun.*disappeared.*'
    ],
  },
);

Example 3: Again, but this time with a notification for each single hit

If you want a notification every time a line matching one of your patterns is found, use the following modified configfile. Be careful: If you expect hundreds of these lines, your server will be flooded.

$scriptpath = '/usr/bin/nagios/libexec:/usr/local/nagios/contrib';
$MACROS = {
    NAGIOS_HOSTNAME => 'orschgeign.muc',
    CL_NSCA_HOST_ADDRESS => 'nagios1.muc',
    CL_NSCA_PORT => 5778
};
@searches = (
  {
    tag => 'san',
    logfile => '/var/adm/messages',
    criticalpatterns => [
        'Link Down Event received',
        'Loop OFFLINE',
        'fctl:.*disappeared from fabric',
        '.*Lun.*disappeared.*'
    ],
    options => 'script',
    script => 'send_nsca',
    scriptparams => '-H $CL_NSCA_HOST_ADDRESS$ -p $CL_NSCA_PORT$
     -to $CL_NSCA_TO_SEC$ -c $CL_NSCA_CONFIG_FILE$',
    scriptstdin => '$CL_HOSTNAME$\t$CL_SERVICEDESC$\t
    $CL_SERVICESTATEID$\t$CL_SERVICEOUTPUT$\n',
  },
);

Example 4: Check the correct function of the syslog service

In the following example a message will be sent to the syslog service immediately after check_logfiles starts up. After a delay of 5 seconds (which should be enough for the message to make it into the logfile) the logfile will be scanned for this message. If it cannot be found, this is counted as a critical error.

$scriptpath = '/usr/bin';
$prescript = 'logger';
$prescriptparams = '-t nagios';
$prescriptstdin = 'braver syslog ($CL_DATE_YYYY$-$CL_DATE_MM$-$CL_DATE_DD$ $CL_DATE_HH$:$CL_DATE_MI$:$CL_DATE_SS$)';
$prescriptsleep = 5;
@searches = (
  {
    tag => 'syslogworks',
    logfile => '/var/adm/syslog/syslog.log',
    rotation => 'bmwhpux',
    criticalpatterns => ['!nagios:\s+braver\s+syslog'],
    options => 'count',
  },
);

Example 5: Monitoring HP Service Guard

Here we look for typical error messages of the cluster software. The value HPUX of the rotation-parameter means, that both syslog.log and maybe OLDsyslog.log are scanned.

$seekfilesdir = '/lfs/opt/nagios/var/tmp';
$protocolsdir = '/lfs/opt/nagios/var/tmp';
$scriptpath = '/lfs/opt/nagios/nrpe/locallibexec';
@searches = (
  {
    tag => 'mcsg',
    logfile => '/var/adm/syslog/syslog.log',
    rotation => 'HPUX',
    criticalpatterns => [
        '.*cmcld: Inbound connection from unconfigured address.*',
        '.*cmclconfd.*Unable to activate keep alive option on
     incomming connection.*',
        '.*inetd.*hacl-cfg/udp: Server failing (looping),
     service terminated.*',
        '.*inetd.*hacl-probe/tcp: accept: Bad file number.*',
        '.*cmcld: Inbound.*message from unconfigured address.*',
        '.*cmcld: Unable to connect to quorum server .*
     It may be down.*',
        '.*cmcld: Failed to receive from quorum server.*',
        '.*cmcld: Connection failure to quorum server.*'
    ],
    warningpatterns => [
        'Cluster Files not in Sync',
    ],
    options => 'protocol,count'
  },
);

Example 6: Monitoring the LVM under HP-UX

In this example we look for typical logical volume manager error messages.

@searches = (
 {
  tag => 'lvm',
  logfile => '/var/adm/syslog/syslog.log',
  rotation => 'HPUX',
  criticalpatterns => [
   '.*vmunix: LVM: vg\[[0-9]*\]: pvnum=.*is POWERFAILED',
   '.*vmunix: SCSI: Read error.*dev:.*errno:.*resid:.*',
   '.*vmunix: LVM:.*PVLink.* Failed! The PV is still accessible.*',
   '.*vmunix: LVM: Restored PV.*',
   '.*vmunix: LVM: Performed a switch for Lun ID.*',
   '.*vmunix: LVM:.*PVLink.*Recovered.*',
   '.*vmunix:.*vxfs:.*vx_metaioerr.*file system meta data read error',
  ],
 },
);

Example 7: Simple monitor for a SUN server’s hardware health

If failures or errors exist in the system, prtdiag -l outputs this information to syslogd. If a corresponding error message is found in the messages file, a defect was detected.

#
#  This config file implements a simple method to monitor the
#  hardware health of a solaris machine.
#  From the prtdiag(1M) manpage:
#  -l    Log output. If failures or errors exist in the system,
#        output this information to syslogd(1M) only.
#  This means, if you run prtdiag and you find something
#  prtdiag-related in the messages file, then there must be
#  an error somewhere in the system.
#
$scriptpath = '/usr/platform/sun4u/sbin';
$prescript = 'prtdiag';
$prescriptparams = '-l';
@searches = (
  {
    tag => 'prtdiag',
    logfile => '/var/adm/messages',
    rotation => 'SOLARIS',
    criticalpatterns => 'prtdiag:',
  },
);

Example 8: Monitoring of SUN hardware by sending SNMP-traps

In this example we scan /var/adm/messages for patterns indicating upcoming hardware trouble. In this scenario check_logfiles runs not as a nagios-plugin but as a standalone script, which sends a snmp-trap if matching lines were found. Sending the trap is done by an external script which gets the needed information via environment variables.
Here just one single trap is sent at the end of check_logfile’s runtime. If you want a trap for each single matching line, move the $postscript definition as script definition inside the search.

$MACROS = {
  SNMP_TRAP_SINK_HOST => 'nagios.dierichs.de',
  SNMP_TRAP_SINK_VERSION => 'snmpv1',
  SNMP_TRAP_SINK_COMMUNITY => 'public',
  SNMP_TRAP_SINK_PORT => 162,
  SNMP_TRAP_ENTERPRISE_OID => '1.3.6.1.4.1.20006.1.5.1',
};
$seekfilesdir = '/lfs/opt/nagios/var/tmp';
$protocolsdir = '/lfs/opt/nagios/var/tmp';
$scriptpath = '/lfs/opt/nagios/nrpe/locallibexec';
@searches = (
 {
  tag => 'hwmsgs',
  logfile => '/var/adm/kern.log',
  rotation => 'kern\d{4}-\d{2}-\d{2}',
  criticalpatterns => [
  # bit error cannot be repaired by the scrubber.
  # take cover.
  '.*Sticky Softerror encountered.*',
  ],
  warningpatterns => [
   # memory crumbling
   'NOTICE: Previously reported error on page \w+\.\w+ cleared',
   # lan cable was pulled
   'WARNING: \w+: fault detected external to device; service degraded',
  ],
  options => 'noprotocol',
 },
);
$postscript = 'send_snmptrap.pl';

Jörg Linge was so kind to contribute the following script:

#! /usr/bin/perl
#
#  send_snmptrap.pl
#
use strict;
use Net::SNMP;
my $hostname = $ENV{CHECK_LOGFILES_SNMP_TRAP_SINK_HOST}
    || 'nagios.dierichs.de';
my $version = $ENV{CHECK_LOGFILES_SNMP_TRAP_SINK_VERSION}
    || 'snmpv1';
my $community = $ENV{CHECK_LOGFILES_SNMP_TRAP_SINK_COMMUNITY}
    || 'public';
my $port = $ENV{CHECK_LOGFILES_SNMP_TRAP_SINK_PORT}
    || 162;
my $oid = $ENV{CHECK_LOGFILES_SNMP_TRAP_ENTERPRISE_OID}
    || '1.3.6.1.4.1.20006.1.5.1';
 
my ($session, $error) = Net::SNMP->session(
    -hostname     => $hostname,
    -version      => $version,
    -community    => $community,
    -port         => $port      # Need to use port 162
);
if (!defined($session)) {
   printf("ERROR: %s.\n", $error);
   exit 1;
}
my @varbind = ($oid, OCTET_STRING, $ENV{CHECK_LOGFILES_SERVICEOUTPUT});
my $result = $session->trap(
    -enterprise   => $oid,
    -specifictrap => $ENV{CHECK_LOGFILES_SERVICESTATEID},
    -varbindlist  => \@varbind);
$session->close;
exit 0;

Example 9: Monitoring SUN hardware using NSCA

Instead of SNMP-traps one could also report the errors to a nagios server using send_nsca. Here also check_logfiles runs as standalone script.

$scriptpath = '/usr/local/nagios/bin';
$MACROS = {
    NAGIOS_HOSTNAME => 'orschgeign.muc',
    CL_NSCA_HOST_ADDRESS => 'nagios1.muc',
    CL_NSCA_PORT => 5778,
    CL_NSCA_CONFIG_FILE => '/usr/local/etc/send_nsca.cfg',
};
@searches = (
 {
  tag => 'hwmsgs',
  logfile => '/var/adm/kern.log',
  rotation => 'kern\d{4}-\d{2}-\d{2}',
  criticalpatterns => [
  # bit error cannot be repaired by the scrubber.
  # take cover.
  '.*Sticky Softerror encountered.*',
  ],
  warningpatterns => [
   # memory degrading
   'NOTICE: Previously reported error on page \w+\.\w+ cleared',
   # lan cable was pulled
   'WARNING: \w+: fault detected external to device; service degraded',
  ],
  options => 'noprotocol',
 },
);
$postscript = 'send_nsca';
$postscriptparams = '-H $CL_NSCA_HOST_ADDRESS$ -p $CL_NSCA_PORT$
     -to $CL_NSCA_TO_SEC$ -c $CL_NSCA_CONFIG_FILE$';
$postscriptstdin = '$CL_HOSTNAME$\t$CL_SERVICEDESC$\t
    $CL_SERVICESTATEID$\t$CL_SERVICEOUTPUT$\n';

Example 10: Scan Linux logfiles as an unprivileged user

At the startup of check_logfiles the file attributes of the logfile are modified such that the nagios user can read them.
For this you need an entry in /etc/sudoers:
qqnagio ALL = (root) NOPASSWD: /usr/bin/setfacl
Should the sudo-command fail, then its exitcode of 1 together with the supersmartprescript-option forces check_logfiles to abort with a warning.
If you find the following line in /etc/sudoers
Defaults requiretty
it must be commented out.

$scriptpath = '/usr/bin';
$prescript = 'sudo';
$prescriptparams = 'setfacl -m u:$CL_USERNAME$:r-- /var/log/messages*';
$options = 'supersmartprescript';
@searches = ({
  tag => 'reiserfs',
  logfile => '/var/log/messages',
  rotation => 'SUSE',
  criticalpatterns => [
      'vs-5150: search_by_key:',
      'is_tree_node: node level \d+ does not match to the expected one',
      'vs-500: unknown uniqueness -1',
      'vs-5657: reiserfs_do_truncate: i/o failure',
      'green-16006: Invalid item type observed, run fsck ASAP'],
  ...
});
....

Example 11: Monitoring Apache under Windows for intrusion attempts

Because of the ‘\’ Windows path names have to be set in single quotes.

$MACROS = {
  APACHEDIR => 'C:\Programme\Apache Software Foundation\Apache2.2'
};
@searches = ({
  tag => 'apachebreakin',
  logfile => '$APACHEDIR$\logs\access.log',
  criticalpatterns => [
      'GET.*cmd\.exe.*',
      'SEARCH /\\x90\\x02\\xb1\\x02\\xb1' ]
});
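
The quoting rule can be checked in plain Perl, independent of check_logfiles. The following is only a minimal sketch; the shortened path and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# In single quotes a backslash stays literal (only \\ and \' are special).
my $single = 'C:\Programme\Apache2.2\logs\access.log';

# In double quotes every backslash has to be doubled, otherwise Perl
# would treat sequences such as \l or \a as escapes.
my $double = "C:\\Programme\\Apache2.2\\logs\\access.log";

print "both strings hold the same path\n" if $single eq $double;
```

Single quotes therefore keep the configuration readable and avoid accidental escape sequences.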

Example 12: Revoke hits with the help of a script

Scripts of type supersmart can help you to take a more accurate look at matching lines and, if necessary, modify them.

@searches =(
  {
    tag => 'heiss',
    logfile => '/var/log/messages',
    criticalpatterns => '.*Thermometer: \d+ Degrees.*',
    options => 'supersmartscript',
    script => sub {
      # extract the temperature from the matching line
      my ($degrees) = $ENV{CHECK_LOGFILES_SERVICEOUTPUT} =~ /: (\d+) Degrees/;
      $degrees = 0 unless defined $degrees;
      my $month = $ENV{CHECK_LOGFILES_DATE_MM};
      if ($degrees > 86) {
        if (($month >= 6) && ($month <= 8)) {
          print "OK - after all, it's summer\n"; # dummy msg
          return 0; # this match never happened.
        } elsif (($month >= 11) || ($month <= 2)) { # November to February
          print "CRITICAL - fire!\n";
          return 2;
        } else {
          print "WARNING - a bit warm in here\n";
          return 1;
        }
      } else {
        print "OK - below 86 degrees\n";
        return 0;
      }
    }
  }
);
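
A range of months that wraps around the end of the year, such as November through February, is easy to get wrong in a script like this one. The sketch below uses a hypothetical helper, is_winter, which is not part of check_logfiles, to show the comparison in isolation.

```perl
use strict;
use warnings;

# Hypothetical helper: true for November through February.
# The range wraps around the end of the year, so the two comparisons
# are joined with "||"; joined with "&&" no month could ever match.
sub is_winter {
    my ($month) = @_;
    return ($month >= 11 || $month <= 2) ? 1 : 0;
}

printf "%02d: %s\n", $_, is_winter($_) ? 'winter' : 'not winter' for 1 .. 12;
```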

Example 13: Monitoring of Fibre Channel Links

Using the type “virtual” one can monitor files in the /proc or /sys directory. In the following example the cable is pulled from an Emulex LPe1150 adapter.

nagios@ibmsrv05:/> cat /sys/class/scsi_host/host0/model
ServeRAID 8i
nagios@ibmsrv05:/> cat /sys/class/scsi_host/host1/modeldesc
Emulex LPe1150-F4 4Gb 1port FC: PCIe SFF HBA
nagios@ibmsrv05:/> cat /sys/class/scsi_host/host2/modeldesc
Emulex LPe1150-F4 4Gb 1port FC: PCIe SFF HBA
.
.
.
nagios@ibmsrv05:/> cat /sys/class/scsi_host/host0/state
running
nagios@ibmsrv05:/> cat /sys/class/scsi_host/host1/state
Link Up - Ready:
   Fabric
nagios@ibmsrv05:/> cat /sys/class/scsi_host/host2/state
Link Up - Ready:
   Fabric
.
.
.
@searches = (
  {
    tag => 'host0',
    logfile => '/sys/class/scsi_host/host0/state',
    type => 'virtual',
    criticalpatterns => [
      '^[^running]+'
    ],
    options => 'nologfilenocry,noprotocol',
  },
  {
    tag => 'host1',
    logfile => '/sys/class/scsi_host/host1/state',
    type => 'virtual',
    criticalpatterns => [
      'Link [^Up]+'
    ],
    options => 'nologfilenocry,noprotocol',
  },
  {
    tag => 'host2',
    logfile => '/sys/class/scsi_host/host2/state',
    type => 'virtual',
    criticalpatterns => [
      'Link [^Up]+'
    ],
    options => 'nologfilenocry,noprotocol',
  },
);
.
.
.
nagios@ibmsrv05:/> check_logfiles -f linux_fs_check_fcal.cfg
OK - no errors or warnings |host0=1;0;0;0 host1=2;0;0;0 host2=2;0;0;0
.
.
.
nagios@ibmsrv05:/> cat /sys/class/scsi_host/host2/state
Link Down
.
.
.
nagios@ibmsrv05:/> check_logfiles -f linux_fs_check_fcal.cfg
CRITICAL - (1 errors) - Link Down  |host0_lines=1
     host0_warnings=0 host0_criticals=0
     host0_unknowns=0 host1_lines=2 host1_warnings=0
     host1_criticals=0 host1_unknowns=0 host2_lines=1
     host2_warnings=0 host2_criticals=1 host2_unknowns=0
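
Note that a pattern like 'Link [^Up]+' uses a character class: it means "Link followed by a character that is neither U nor p", not "Link followed by something other than the word Up". For the states shown here both readings give the same result, but they can differ for other state strings. The following sketch compares the character class with a negative lookahead; the state 'Link Unknown' is invented purely for illustration.

```perl
use strict;
use warnings;

# [^Up]  is a character class: one character that is neither "U" nor "p".
# (?!Up) is a negative lookahead: not followed by the string "Up".
my $class     = qr/Link [^Up]+/;
my $lookahead = qr/Link (?!Up)/;

# 'Link Unknown' is a made-up state to show where the two differ.
for my $state ('Link Up - Ready:', 'Link Down', 'Link Unknown') {
    printf "%-18s class=%s lookahead=%s\n", $state,
        ($state =~ $class     ? 'match' : 'no match'),
        ($state =~ $lookahead ? 'match' : 'no match');
}
```

If every state other than "Link Up" should raise an alarm, the lookahead variant is the safer choice.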

Example 14: Forwarding of the Windows Eventlogs to a Unix-Syslogserver

If you forward the Windows eventlog to a Unix system, the messages file there is composed of events from multiple servers. The syslogclient option then allows a directed search for messages coming from one specific Windows system.

@searches = ({
  tag => 'exchange1.dom',
  logfile => '/var/log/messages',
  rotation => 'SUSE',
  criticalpatterns => [
     'An MTA database server error was encountered',
  ],
  options => 'syslogclient=exchange1.dom'
},
{
  tag => 'exchange2.dom',
  logfile => '/var/log/messages',
  rotation => 'SUSE',
  criticalpatterns => [
     'An MTA database server error was encountered',
  ],
  options => 'syslogclient=$CL_TAG$'
  });
....
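
Conceptually, the option narrows the search to lines written by one client. The sketch below illustrates that idea only and is not how check_logfiles implements it; the log lines, the helper hits_for_client and the assumption that the host name is the fourth whitespace-separated field (traditional syslog layout) are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical syslog lines; real layouts depend on the syslog daemon.
my @lines = (
    'Nov  7 10:00:01 exchange1.dom MTA: An MTA database server error was encountered',
    'Nov  7 10:00:02 exchange2.dom MTA: An MTA database server error was encountered',
    'Nov  7 10:00:03 exchange1.dom IS: store mounted',
);

# Keep the lines whose host field equals $client and which match $pattern.
sub hits_for_client {
    my ($client, $pattern, @log) = @_;
    return grep {
        my @field = split ' ', $_;
        defined $field[3] && $field[3] eq $client && /$pattern/
    } @log;
}

my @hits = hits_for_client('exchange1.dom',
    'An MTA database server error was encountered', @lines);
print scalar(@hits), " hit(s) for exchange1.dom\n";
```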

Example 15: Searching the AIX errpt

AIX writes many messages to the so-called Error Report, which can be read out with the errpt command. With type => 'errpt' you can instruct check_logfiles to scan errpt’s output instead of a real logfile.

@searches = (
 {
   tag => 'minor_errors',
   type => 'errpt',
   criticalpatterns => ['ADAPTER ERROR',
       'The largest dump device is too small.',
       'The copy directory is too small.',
       'Kernel heap use exceeds allocation count',
       'Kernel heap use exceeds percentage thres',
       'LINK ERROR',
       'Permanent fatal error',
       'SCSI BUS OR DEVICE ERROR',
       'SCSI DEVICE OR MEDIA ERROR',
       'Possible malfunction on local adapter',
       'ETHERNET DOWN',
       'UNABLE TO ALLOCATE SPACE IN KERNEL HEAP'
    ],
 }
);

Example 16: Windows EventLog forwarding with templates

If there are messages originating from different syslog clients in a logfile, they can be prefiltered with the name of such a client. To avoid definitions for each single client, you can use templates.

define command {
  command_name  check_client_logs
  command_line     $USER2$/check_logfiles --tag=$HOSTNAME$ \
      --logfile='/var/log/messages' \
      --criticalpattern='$ARG1$' --syslogclient='$CL_TAG$'
}
define service {
  service_description dr_watson
  host_name  pc0815.muc
  check_command check_client_logs!4097.*generated an application error
}

With templates you can formulate multiple searches in one configfile and pick only specific ones according to the type of the host. Without templates you would have to write a definition for each host.

@searches = (
{
  template => 'drwatson',
  logfile => '/var/log/messages',
  criticalpattern => '4097.*generated an application error',
  options => 'syslogclient=$CL_TAG$'
},
{
  template => 'virus',
  logfile => '/var/log/messages',
  criticalpattern => 'a virus was found',
  options => 'syslogclient=$CL_TAG$'
},
{
  template => 'cluster',
  logfile => '/var/log/messages',
  criticalpatterns => ['5029.*The cluster  log is corrupt',
      '5038.*A cluster resource failed', ],
  options => 'syslogclient=$CL_TAG$'
});

For “normal” Windows clients you would run:

check_logfiles --config <configfile> --tag='pc0815' \
    --selectedsearches='drwatson,virus'

And for cluster servers:

check_logfiles --config <configfile> --tag='clustsrv1.muc'
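
The effect of the $CL_TAG$ macro in a template can be pictured as a simple text substitution. The following is a rough sketch under that assumption; expand_tag is a hypothetical helper and not a function of check_logfiles.

```perl
use strict;
use warnings;

# Hypothetical helper: replace $CL_TAG$ in an option string with the
# value passed on the command line via --tag.
sub expand_tag {
    my ($text, $tag) = @_;
    (my $out = $text) =~ s/\$CL_TAG\$/$tag/g;
    return $out;
}

print expand_tag('syslogclient=$CL_TAG$', 'pc0815'), "\n";
print expand_tag('syslogclient=$CL_TAG$', 'clustsrv1.muc'), "\n";
```

One template thus serves any number of hosts, because only the tag changes from call to call.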

Example 17: Oracle Alertlog

Oracle databases write their error messages into an alert log. Paying attention to these messages helps you detect potential problems before they cause a production outage. (Please also refer to type => “oraclealertlog”.)

@searches = ({
  tag => 'oraalerts',
  logfile => '......../alert.log',
  criticalpatterns => [
      'ORA\-0*204[^\d]',        # error in reading control file
      'ORA\-0*206[^\d]',        # error in writing control file
      'ORA\-0*210[^\d]',        # cannot open control file
      'ORA\-0*257[^\d]',        # archiver is stuck
      'ORA\-0*333[^\d]',        # redo log read error
      'ORA\-0*345[^\d]',        # redo log write error
      'ORA\-0*4[4-7][0-9][^\d]',# ORA-0440 - ORA-0485 background process failure
      'ORA\-0*48[0-5][^\d]',
      'ORA\-0*6[0-3][0-9][^\d]',# ORA-0600 - ORA-0639 internal errors
      'ORA\-0*1114[^\d]',        # datafile I/O write error
      'ORA\-0*1115[^\d]',        # datafile I/O read error
      'ORA\-0*1116[^\d]',        # cannot open datafile
      'ORA\-0*1118[^\d]',        # cannot add a data file
      'ORA\-0*1122[^\d]',       # database file 16 failed verification check
      'ORA\-0*1171[^\d]',       # datafile 16 going offline due to error advancing checkpoint
      'ORA\-0*1201[^\d]',       # file 16 header failed to write correctly
      'ORA\-0*1208[^\d]',       # data file is an old version - not accessing current version
      'ORA\-0*1578[^\d]',        # data block corruption
      'ORA\-0*1135[^\d]',        # file accessed for query is offline
      'ORA\-0*1547[^\d]',        # tablespace is full
      'ORA\-0*1555[^\d]',        # snapshot too old
      'ORA\-0*1562[^\d]',        # failed to extend rollback segment
      'ORA\-0*162[89][^\d]',     # ORA-1628 - ORA-1632 maximum extents exceeded
      'ORA\-0*163[0-2][^\d]',
      'ORA\-0*165[0-6][^\d]',    # ORA-1650 - ORA-1656 tablespace is full
      'ORA\-16014[^\d]',      # log cannot be archived, no available destinations
      'ORA\-16038[^\d]',      # log cannot be archived
      'ORA\-19502[^\d]',      # write error on datafile
      'ORA\-27063[^\d]',         # number of bytes read/written is incorrect
      'ORA\-0*4031[^\d]',        # out of shared memory.
      'No space left on device',
      'Archival Error',
  ],
  warningpatterns => [
      'ORA\-0*3113[^\d]',        # end of file on communication channel
      'ORA\-0*6501[^\d]',         # PL/SQL internal error
      'ORA\-0*1140[^\d]',         # follows WARNING: datafile #20 was not in online backup mode
      'Archival stopped, error occurred. Will continue retrying',
  ]
});
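
All patterns in this list follow the same scheme: 0* accepts any number of leading zeros, and the trailing [^\d] keeps a short code from matching inside a longer one. The sketch below tests one of the patterns against a few made-up lines.

```perl
use strict;
use warnings;

# 0* accepts any number of leading zeros; [^\d] demands one non-digit
# character after the code, so longer codes such as 02040 are rejected.
my $re = qr/ORA\-0*204[^\d]/;

# Made-up sample lines for illustration.
for my $line ('ORA-00204: error in reading control file',
              'ORA-204: error in reading control file',
              'ORA-02040: unrelated error',
              'ORA-00204') {
    printf "%-42s %s\n", $line, $line =~ $re ? 'match' : 'no match';
}
```

Because [^\d] needs an actual character, a code that stands at the very end of a line is not matched either, as the last sample shows.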

Example 17a: Oracle RAC Clusterware Alertlog

Daniel Graef sent in this example for the monitoring of an Oracle Clusterware Alertlog. Thanks a lot!

@searches = (
{
  tag => 'racnode01-clusterware',
  logfile => '/oracle/app/crs/product/111_1/log/racnode01/alertracnode01.log',
  criticalpatterns => [
      'CRS\-1006[^\d]', # The OCR location %s is inaccessible. Details in %s.
      'CRS\-1008[^\d]', #  Node %s is not responding to OCR requests. Details in %s.
      'CRS\-1009[^\d]', #  The OCR configuration is invalid. Details in %s.
      'CRS\-1011[^\d]', #  OCR cannot determine that the OCR content contains the latest updates. Details in %s.
      'CRS\-1202[^\d]', #  CRSD aborted on node %s. Error [%s]. Details in %s.
      'CRS\-1203[^\d]', #  Failover failed for the CRS resource %s. Details in %s.
      'CRS\-1205[^\d]', #  Auto-start failed for the CRS resource %s. Details in %s.
      'CRS\-1206[^\d]', #  Resource %s went into an UNKNOWN state. Force stop the resource using the crs_stop -f command and restart %s.
      'CRS\-1207[^\d]', #  There are no more restart attempts left for resource %s. Restart the resource manually using the crs_start command.
      'CRS\-1402[^\d]', #  EVMD aborted on node %s. Error [%s]. Details in %s.
      'CRS\-1602[^\d]', #  CSSD aborted on node %s. Error [%s]. Details in %s.
      'CRS\-1606[^\d]', #  CSSD Insufficient voting files available [%s of %s]. Details in %s.
      'CRS\-1608[^\d]', #  CSSD Evicted by node %s. Details in %s.  [local node evicted, critical for the node itself]
      'CRS\-1609[^\d]', #  CSSD detected a network split. Details in %s.
  ],
  warningpatterns => [
      'CRS\-1010[^\d]', #  The OCR mirror location %s was removed.
      'CRS\-1604[^\d]', #  CSSD voting file is offline: %s. Details in %s.
      'CRS\-1607[^\d]', #  CSSD evicting node %s. Details in %s. [local node evicted another node, warning for cluster state]
      'CRS\-2001[^\d]', #  memory allocation error when initiating the connection failed to allocate memory for the connection with the target process
      'CRS\-2003[^\d]', #  error %d encountered when connecting to %s
      'CRS\-2004[^\d]', #  error %d encountered when sending messages to %s
      'CRS\-2005[^\d]', #  timed out when waiting for response from %d
      'CRS\-2006[^\d]', #  failed to get response from %d
  ],
  options => 'sticky=86400'
});

Example 18: IPMI System Event Log

This example shows how to look for power supply problems by reading the IPMI System Event Log with the ipmitool sel list command.

@searches = (
  {
    tag => 'powercable',
    type => 'ipmitool',
    ipmitool => { # you don't need this if you are root
      path => 'sudo /usr/bin/ipmitool',
    },
    criticalpatterns => [
        'Power Supply.*Failure detected',
        'Power Supply AC lost',
     ],
  });
nagios@ibmsrv05:/> check_logfiles -f ibm_power.cfg
CRITICAL - (6 errors in test.protocol-2008-02-12-14-19-36) -
      190 ; 02/07/2008 ; 14:28:13 ; Power Supply #0x39 ;
     Failure detected ...|
     powercable_lines=17 powercable_warnings=0
     powercable_criticals=6 powercable_unknowns=0

Example 19: Passive Checkresults which cannot be assigned

Passive check results that cannot be assigned to a host or a service (e.g. because of a typo) are silently dropped, apart from a notice in nagios.log. With the following configuration, Nagios is able to send out a notification if this occurs. This was Augustinus’ idea.

$MACROS = {
  NAGIOS_LOGFILES => '/var/nagios'
};
@searches = ({
  tag => 'nagios_unmatched_passive_check_results',
  logfile => '$NAGIOS_LOGFILES$/nagios.log',
  archivedir => '$NAGIOS_LOGFILES$/archives',
  # archives are named nagios-MM-DD-YYYY-HH.log
  rotation => 'nagios-\d{2}-\d{2}-\d{4}-\d{2}\.log',
  criticalpatterns => [
      '^\[\d+\] Warning:  Passive check result was received for service .* on host .* but the service could not be found',
      '^\[\d+\] Warning:  Passive check result was received for service .* on host .* but the host could not be found',
  ],
});
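
A rotation pattern is worth testing against a real archive file name before it goes into production. The sketch below assumes the archive layout nagios-MM-DD-YYYY-HH.log, which is what a stock Nagios log rotation is expected to produce; the file name itself is invented.

```perl
use strict;
use warnings;

# Assumed archive name in the layout nagios-MM-DD-YYYY-HH.log.
my $archive = 'nagios-11-07-2011-00.log';

my $two_digit_year  = qr/nagios-\d{2}-\d{2}-\d{2}-\d{2}\.log/;
my $four_digit_year = qr/nagios-\d{2}-\d{2}-\d{4}-\d{2}\.log/;

printf "two-digit year:  %s\n", $archive =~ $two_digit_year  ? 'match' : 'no match';
printf "four-digit year: %s\n", $archive =~ $four_digit_year ? 'match' : 'no match';
```

Only the variant with a four-digit year finds the rotated file, so matches written shortly before a rotation would otherwise be missed.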