Category Archives: Monitoring

vCenter alarm Polling (v2) – Cross platform version (Windows & Appliance)

As it has been some time since my last post about a solution on how to get vCenter alarms to Zabbix, and VMware also evolved I followed a new approach on that topic as my initial post only supports Windows vCenters. Furthermore the solution is not as stable as I wished that it would be, so my new approach is to query all alarms from a vCenter via it’s SDK.
Initially all the alarms are discovered and created in Zabbix and in a second step the values for the discovered alarms are polled.
Currently the script used the data center object of the vCenter to discover alarms, so it can’t be used on a standalone ESXi-Server. However – if the code is changed to use whatever object is needed to get the alarms directly from the ESXi-server it should also be possible to get alarms directly from a server without the need of a vCenter (but I didn’t implement that till now as there wasn’t the need/time).

vCenter alarms – SDK (tested with ESXi 6.0+ and Zabbix 3.0 on RHEL 7)

To install the vCenter alarms the attached zip needs to be downloaded and the VMware Perl SDK must be installed on the Zabbix Server.
The template needs to be imported into Zabbix and the vCenter username and password need to be set in the username/password macros of the template.

The other two files (vcenterAlarms.pl & vcenterAlarms.wrapper) need to be extracted to the externalscripts folder of the Zabbix Server. The wrapper script is just a shell script that is executed by a Zabbix item to call the per script and send the that to Zabbix via Zabbix Sender.  As the VMware API is quite slow the wrapper also starts itself again with NOHUP because otherwise the timeout defined in the Zabbix Server configuration would cause an exit of the script. For my setup it always took longer than 30 Seconds till tall data where gathered and therefor the Zabbix Server would kill the script in the middle of the execution and no data would be sent to Zabbix. That’s why I added this workaround. Furthermore it also checks if there are less than eleven vcenterAlarms.wrapper processes running, and only starts if there are less, to ensure that Zabbix does not spawn hundreds of NOHUP-processes.

 

 

 

 

Zabbix – Clear hosts from untemplated items

Sometimes you run into the problem, that you have a host which had a template attached but somebody wanted to replace the template or something like that and unfortunately hit just “Unlink” instead of “Unlink and Clear” and all the items are still in the host.

If you have only one host it’s normaly no problem to delete all items per hand, but if you have multiple of those hosts it’s quite some work do remove the old items.

Solution nr. one would be tu use the filters to select all items in a specified host group and delte those items, but the applications, discovery rules and so on will still remain in the hosts and have to be deleted in a 2nd/3rd step.

See the Screenshot below:Zabbix Host configuration - item filters

 

My preferred solution for this problem is a simple regex based find/replace with Notepad++.
Herefor an export of the affected hosts is needed. The xml-file could be opend with NPP and the following regexes are needed for find/replace (CTRL+H) to remove the unwanted items.

Find what: (<discovery_rules>[\s\S]*?<\/discovery_rules>)|(<triggers>[\s\S]*?<\/triggers>)|(<inventory>[\s\S]*?<\/inventory>)|(<items>[\s\S]*?<\/items>)

Replace with: (?1<discovery_rules />)(?2<triggers />)(?3<inventory />)(?4<items />)

In the above example multiple regexes with multiple replace-patterns are used to replace the items, discovery rules, triggers an inventory and reset it.

Powershell/PowerCLI very slow execution Time

Sometimes a PowerCLI-script can take quite some time till everything is executed. For example the PowerShell scripts used by Zabbix to gather the vCenter alarms into Zabbix (BlogPost) need some tuning to run fine.

So why are scripts running slow in some cases?
It seems to occur primary on systems which do not have a connection to the internet. As a matter of fact – most of the systems I’m setting up “lose” internet connection sooner, or later. :/

What exactly causes the problem?
While investigating that problem i found an interesting feature which seems to cause the problem – certificate checks!
There is an IE-setting which is named “Check for publisher’s certificate revocation ” and can be found at: Intenet Options -> Advanced -> Section: Security ->Disable: Check for publisher’s certificate revocation.

Disabling the certificate checks improves the execution time by about 60%.

certificate-check enabled:

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 41
Milliseconds      : 536
Ticks             : 415369143
TotalDays         : 0.000480751322916667
TotalHours        : 0.01153803175
TotalMinutes      : 0.692281905
TotalSeconds      : 41.5369143
TotalMilliseconds : 41536.9143

 

certificate-check disabled:

Days : 0
Hours : 0
Minutes : 0
Seconds : 16
Milliseconds : 262
Ticks : 162628208
TotalDays : 0.000188227092592593
TotalHours : 0.00451745022222222
TotalMinutes : 0.271047013333333
TotalSeconds : 16.2628208
TotalMilliseconds : 16262.8208

 

If the script is run from a normal user account everything should be fine and we have an improved execution time, BUT …

… if the script is run from an Service (and as a matter of fact I’m using the Zabbix agent service to run the script) we got a problem.
With default settings the Zabbix Agent is installed to run as nt authority\system, so if the IE-setting is changed for the current user, its working for this user, but not for the system user. 🙁
So a quick and dirty workaround could be to disable the setting for the system user.
ATTENTION: Running the Zabbix Agent as a system user is OK for a DEV-environment, but should not be used in an production environment. For production a dedicated service user should be used.

I disabled it by becoming a system user with

PSEXEC -i -s -d CMD

and launching the IE from the command prompt. Afterwards I was able to disable the setting via the above method.

Otherwise the Key could also be found in the registry at:

HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\WinTrust\Trust Providers\Software Publishing\State
0x00023e00 / 146944 Check OFF
0x00023c00 / 146432 Check ON

A simple PS-Script to disable the setting would be:

Write-Host "Disable Check for publisher’s certificate Revocation"
set-ItemProperty -path "REGISTRY::\HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\WinTrust\Trust Providers\Software Publishing\" -name State -value 146944
PowerShell to disable Publisher certificate checks

 

Get vCenter alarms into Zabbix via poll-method

Some time ago i wrote a post on how to forward vCenter alarms to Zabbix ( https://blog.fawcs.info/2015/05/getting-vcenter-alarms-to-zabbix/) and I have to admit, that this solutions is kind of a pain in the ass. I’m getting the alarm info from environmental varaibles which are automatically set by the vCenter when an alarm changes its status, but it seems that there is a “littel” problem with “overlapping” alarms. For example if there are occuring multiple alarms within a short period only the first alarm will be forwarded to zabbix, but non of the follwoing alarms. Besides that this is not an ideal solution I personally do not like my former approach because it’s an event driven approach. So if one event goes missing we have an inconsistent system :/

It’s quite some time since I wanted to redesign the solution and now I’m finally having some time ( and the pressure) to do so. 🙂
The new approach is based on using userparameters to execute a powershellscript on the vCenter to discover all active alarms and create items in Zabbix. At the moment I’m creating three item prototyes. One for the Timestamp when the alarm became active, another item for the acknowledged-state of the alarm and the last one for the severity of the alarm.

There are two userparemeters which run two powershell scripts. The first one (vcenter.alarm.polling.discovery.ps1) does the discovery and the second one (vcenter.alarm.polling.itemdata.ps1) is to get the data for the discoverd items.
There are also three triggers (one for each severity gray, yellow, red) which will be active als long as the alarm is not acknowledged.

You can download the scripts, userparameters and the template down below:
vCenterAlarmPolling

 

Additional findings:
Ther can occure problems if there are different addresses used to connect to the vcenter (eg. 127.0.0.1, loclahost, vcenterhostname, …)
It seems that the vCenter creates a sperate datacenter instance for every connection, so if you use the three examples from abovve you will end up creating three instances and mess up the script.

 

If special characters want to be passed to the powershellscript (e.g. special chars in passwords ord login with administrator@vsphere.local) the “UnsafeUserParameters=1” – parameter from the zabbix-agent.conf needs to be set to 1. (default value is 0)

Install Zabbix on Raspberry PI 2

Wouldn’t it be cool to monitor your home? For example all your devices, but also temperature and other sensors an have all that data accessible via a web interfaces?

I think it would so, i thought about setting up Zabbix for home monitoring, but on the RaPi B and B+ it’s not the most performant setup, So i decided to try it again with the PI2.

This post provides a short log on how I set it up.

At first we have to download the source from Zabbix’ SF-page because there is no official package for the ARM-architecture available.

cd /opt
wget http://downloads.sourceforge.net/project/zabbix/ZABBIX%20Latest%20Stable/2.4.6/zabbix-2.4.6.tar.gz?r=http%3A%2F%2Fwww.zabbix.com%2Fdownload.php&ts=1441447329&use_mirror=skylink^C
mv zabbix-2.4.6.tar.gz\?r\=http\:%2F%2Fwww.zabbix.com%2Fdownload.php zabbix-2.4.6.tar.gz
tar xfvz zabbix-2.4.6.tar.gz
cd zabbix-2.4.6/


#With ./configure --help we can see all the availalbe switches which can be used to compile zabbix.
root@raspberrypi /opt/zabbix-2.4.6 # groupadd zabbix
root@raspberrypi /opt/zabbix-2.4.6 # useradd -g zabbix zabbix
root@raspberrypi /opt/zabbix-2.4.6 # ./configure --help


#I used the follwoing switches to compile the zabbix server and agent, use a MySQL-DB, enable jabber-support, lib-xml2 - which is needed for webmonitoring, net-snmp, ssh and curl which is alos needed for webmonitoring. IPMI can be useful if you also hav a realy server with a BMC to monitor. But for most homeusers the IPMI-option is not neede if you only want to monitor your home and thats it. If you have a LDAP/AD-environment where you want to integrate zabbix you also should use the ldap-switch, but I think most home users also do not have a directory service running at home. 😉

#If this command is run there will ocure some erroes in most cases because there are missing dependencies

./configure --enable-server --enable-agent --with-mysql --with-jabber --with-libxml2 --with-net-snmp --with-ssh2 --with-libcurl


apt-get install apache2 php5-mysql mysql-server mysql-common mysql-utilities libiksemel-dev libiksemel-utils libxml2-dev libxml2-utils libxml2 snmp libsnmp-dev libsnmp-perl libssh2-1-dev libssh2-1 libcurl3 libghc-curl-dev libmysql++-dev php5-gd

#now all dependencies should be resolved
./configure --enable-server --enable-agent --with-mysql --with-jabber --with-libxml2 --with-net-snmp --with-ssh2 --with-libcurl

#copy init scripts
cp /opt/zabbix-2.4.6/misc/init.d/debian/* /etc/init.d/
#copy webfrontend
cp -r /opt/zabbix-2.4.6/frontends/php/* /var/www/zabbix/
chown -R www-data:www-data /var/www/zabbix/


#create the database
#at first log in to your mysql-server as a root useradd and runn the following commands
mysql -uroot -p
create database zabbix character set utf8 collate utf8_bin;
CREATE USER 'zabbix'@'localhost' IDENTIFIED BY 'zabbix';
GRANT ALL PRIVILEGES ON zabbix.* TO 'zabbix'@'%' WITH GRANT OPTION;
mysql -uzabbix -pzabbix zabbix < /opt/zabbix-2.4.6/database/mysql/schema.sql
mysql -uzabbix -pzabbix zabbix < /opt/zabbix-2.4.6/database/mysql/images.sql
mysql -uzabbix -pzabbix zabbix < /opt/zabbix-2.4.6/database/mysql/data.sql

#adapt configuration files at /usr/local/etc/ like in the attached examples
#create dircetories for logfiles:
mkdir -p /var/log/zabbix
chown -R zabbix:zabbix /var/log/zabbix/

#create dirs for alert & external scripts 
mkdir -p /var/zabbix/alertscripts
mkdir -p /var/zabbix/externalscripts
chown -R zabbix:zabbix /var/zabbix/


#configure php-settings
vim /etc/php5/apache2filter/php.init
post_max_size = 16M
max_execution_time = 300
max_input_time = 300
#select timezone from http://php.net/manual/en/timezones.php and set:
date.timezone = <TIMEZONE>

#restart/reload webserver to accept changes
service apache2 restart
service zabbix-server restart
service zabbix-agent restart

#open http://<zbxip>/zabbix in browser and finish installation
zabbix installation

 

zabbix-conf

Getting vCenter alarms to Zabbix

VMware is a relay nice product, but there is one little problem. It’s realy hard to monitor VMware products with SNMP or any other “old school” technologies.
The actual problem is to get an alarm in Zabbix if there occures an error on the vCenter. So Zabbix is used as an umbrella monitoring for the whole environment.
All this could also be done with SNMP-Traps what would be a lot easier – at first appereance, but Zabbix is … how do I say … not the best tool to monitor events. It’s designed to monitor statuses.

So it’s designed to continuously monitor as specific value – if this value raises over a defined alert-value an alert is displayed and when it falls below the value the problem disappears.
With events there is the problem that we get only one single value which describes the error. So firstly we have to analyze the received value/message and secondly – how do we know when the problem is okay again? And thats one of the design flaws of Zabbix – you do not have any possibilty to reset such events to “OK” if such an event happend.
So we need to monitor the vCenter alarms, because this alerts are raised if an problem occures and disappear if the problem changes to OK again.

So how do we get all the vCenter alarms to zabbix? I don’t want to copy/create all the alarms by hand because its a dynamic environment and alarms could be added or deleted, so the system has to “import” the alarms “on the fly” from the vCenter.
Since Zabbix 2.0 there exist discovery rules which are kind of helpful to import dynamic values. So I’m using a discovery to peridodically pull the data from the vCenter and create an item for every alarm. All the alarms in the vCenter need to be configured to run a custom alarm when an alarm becomes active which sends the current status to zabbix and voilá – we are done.

Continue reading Getting vCenter alarms to Zabbix

Get raid information from a Fujitsu RX200S8/Rx300S8 via BMC

With the S8 generation of Fujtitus RX300/200 servers, which use the iRMC S4 ,Fujitsu implementent the ability to poll SNMP-data from the BMC (iRMC). To enable the ability to poll data the BMC has to be flashed with an up to date Firmware version and SNMP-polling has to be enableed in the iRMC-web-GUI. I’ve tested it with 7.69F from Dec. 2014 .
The first shippings of the rx300/200 servers came with an older firmeware version which did not implment the needed Fujitsuu MIBs to query HW-status (see http://manuals.ts.fujitsu.com/file/11470/irmc-s4-ug-en.pdf Page: 18 for details about the supported MIBs).
The problem is, that there is no possibility to query the status of the Raid controller and its disks via SNMP (or I haven’t found it till now) but it’s displayed on the iRMC web-GUI. So I wrote a script which extracts the useful informations (controller status, disk status + details and logical drive status) from the web-interface.

Atm. it’s just an alpha release but I’ll modify the script to be used by Zabbix for an auto discovery and push all the data into Zabbix:

Source:

#!/usr/bin/php
<?php
/****************************************
################################################################
#   DESCRIPTION: Script to query RAID-Informations from the Fujitsu iRMC S4 webinterface.
#   COPYRIGHT ©: 2015 by Florian Weixeblaum, GPL freeware
#        AUTHOR: Florian Weixelbaum, weixeflo@fawcs.info
#       LICENSE: GPL freeware
# CREATION DATE: 2015-Mar-19
################################################################
*****************************************/
$username="admin";
$password="admin";
$host="127.0.0.1";

$siteIDs=array('controller'=>87,'disks'=>88,'logicalVolumes'=>89);
//get Raid-Controller-Data
//$returnController=getWebsiteContent($host."/".$siteIDs['controller'],$username,$password,true);
$returnDisks=getWebsiteContent($host."/".$siteIDs['disks'],$username,$password,true);
//$returnVolumes=getWebsiteContent($host."/".$siteIDs['logicalVolumes'],$username,$password,true);
//$id['controller']=parseWebSiteToIDs($returnController);
$id['disks']=parseWebSiteToIDs($returnDisks);
//$id['logicalVolumes']=parseWebSiteToIDs($returnVolumes);
//print_r($id);


$ctrl=$id['disks'][0]['id'][1];
foreach($id['disks'][0]['data'] AS $disk)
{
	//print_r($disk);
	$diskWebContent=getWebsiteContent($host."/pdrive?ctrl=".$ctrl."&pd=".$disk['detailID'][1],$username,$password,true);
	//print_r($diskWebContent);
	$diskTableArray=parseWebSiteToIDs($diskWebContent);
	print_r($diskTableArray);	
}
/**/

/**
retrieves an html-site
	
	@return: String -> whole HTML site
**/
function getWebsiteContent($url,$username,$password,$measuretime=false)
{
	if($measuretime===true) 
	{
		$timer=microtime(true);
	}
	$process = curl_init($url);
	curl_setopt($process, CURLOPT_USERPWD, $username . ":" . $password);  
	curl_setopt($process, CURLOPT_HTTPHEADER, array('Content-Type: application/xml'));
	curl_setopt($process, CURLOPT_HEADER, 1);
	curl_setopt($process, CURLOPT_USERPWD, $username . ":" . $password);
	curl_setopt($process, CURLOPT_TIMEOUT, 30);
	curl_setopt($process, CURLOPT_POST, 1);
	//curl_setopt($process, CURLOPT_POSTFIELDS, $payloadName);
	curl_setopt($process, CURLOPT_RETURNTRANSFER, TRUE);
	$return = curl_exec($process);
	curl_close($process);
	if($measuretime===true) 
	{
		echo "Content for ".$url." retrieved in ".round(microtime(true)-$timer,3)."s".chr(10);
	}
	return $return;
}

/**
retrieves the table-content of an IRMC Site / parses the listed sites for interesting data
possible sites are:
	http://$host/87 -> Controller Infos
	http://$host/88 -> Physical Disks
	http://$host/89 -> Logical Drives
	
	@return -> 	array which contains the ID of the controller, disks, LVs
				the table headings
				the table data
**/


function parseWebSiteToIDs($return)
{
	$ret=array(); 
	$cutFrom=strpos($return,"<!-- Menu end-->");
	$cutTo=strpos($return,'<div id="bottom">');
	$return=str_replace('</td>','</td>'.chr(10),substr($return,$cutFrom,$cutTo-$cutFrom));
	$return=str_replace('</th>','</th>'.chr(10),$return);
	$Loop=$return;
	
	//echo $table;
	$count=0;
	//go through all tatles which are found in the passed web-content-string & extract the tables 
	while(strpos($Loop,'<table')!==false)
	{
		
		//$table[]=substr($return, strpos($return,"<table"),strpos($return,"</table>")-strpos($return,"<table>"));
		$Loop=substr($Loop, strpos($Loop,'<table'),strlen($Loop)-strpos($Loop,'<table'));
		//get the table id
		$id=substr($Loop,strpos($Loop,"summary=\"")+9,strpos($Loop,"\">")-(strpos($Loop,"summary=\"")+9));
		$ret[$count]['id']=explode("_",$id);
		$Loop=substr($Loop, strpos($Loop,'<tr'),strlen($Loop)-strpos($Loop,'<tr'));
		//save just the html-table to the var & add ***###*** as a delimitter after every tablerow
		$currentTable=str_replace("</tr>","</tr>***###***",substr($Loop,0,strpos($Loop,"</table>")));
		$currentTable=substr($currentTable,0,strlen($currentTable)-9);
		//explode all table rows by the added delimiter
		$rows=explode("***###***",$currentTable);
		$count2=0;
		//iterate through the table rows and extract useful information
		foreach($rows AS $row)
		{
			unset($detailID);
			//check if submit button for details is listed in the table - if yes - get the submit-value for correct id -> is appended to the URL when querying the details page
			if(strpos($row,'<input class="submit" type="submit" value="Details"')!==false)
			{
				$detailID=trim(substr($row,strpos($row,'<input class="submit" type="submit" value="Details"')+strlen('<input class="submit" type="submit" value="Details"'),strpos(strtolower($row),'onclick=')-(strpos($row,'<input class="submit" type="submit" value="Details"')+strlen('<input class="submit" type="submit" value="Details"'))));
				//eg: extracted name="pd_10" from row
				$detailID=str_replace("\"","",$detailID);
				$detailID=explode("=",$detailID);
				$detailID=$detailID[1];
			}
			$row=trim(strip_tags($row));
			//if $count==0 -> first table row which stores only the headings/descriptions -> are saved in a seperate node in the array
			if($count2==0)
			{
				$ret[$count]["description"]=explode(chr(10),$row);
			} else {
			//all the other rows contain data -> also store the in the array
				$rowArray=explode(chr(10),$row);
				$ret[$count]["data"][]=$rowArray;
				if(isset($detailID))
				{
					$ret[$count]["data"][$rowArray[0]]["detailID"]=explode("_",$detailID);
				}
			}
			$count2++;
		}	
		$count++;
	}
	return $ret;
}



?>
web-iRMCS RAID-query