VMware is a really nice product, but there is one little problem: it’s really hard to monitor VMware products with SNMP or any other “old school” technology.
The actual problem is getting an alarm in Zabbix when an error occurs on the vCenter, since Zabbix is used as the umbrella monitoring for the whole environment.
All this could also be done with SNMP traps, which looks a lot easier at first glance, but Zabbix is … how do I say … not the best tool to monitor events. It’s designed to monitor statuses.
That means it continuously monitors a specific value: if the value rises above a defined threshold an alert is displayed, and when it falls below the threshold again the problem disappears.
With events, the problem is that we get only one single value which describes the error. So first we have to analyze the received value/message, and second, how do we know when the problem is okay again? And that’s one of the design flaws of Zabbix: you have no way to reset such an event to “OK” once it has happened.
So we need to monitor the vCenter alarms, because these alarms are raised when a problem occurs and disappear when the status changes back to OK.
So how do we get all the vCenter alarms into Zabbix? I don’t want to copy/create all the alarms by hand, because it’s a dynamic environment where alarms can be added or deleted, so the system has to “import” the alarms “on the fly” from the vCenter.
Since Zabbix 2.0 there are discovery rules, which are kind of helpful for importing dynamic values. So I’m using a discovery to periodically pull the data from the vCenter and create an item for every alarm. All the alarms in the vCenter need to be configured to run a custom command when they become active, which sends the current status to Zabbix, and voilà, we are done.
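To make the discovery part a bit more concrete, here is a minimal sketch of the JSON document a Zabbix low-level discovery rule consumes. It is written in Python purely for illustration (the scripts in the package are PowerShell), and the macro name `{#ALARMNAME}` and the alarm names are assumptions, not taken from the attached template.

```python
import json


def build_discovery_payload(alarm_names):
    """Build a low-level discovery document from a list of alarm names."""
    # {#ALARMNAME} is a hypothetical LLD macro name chosen for this sketch;
    # the real template may use a different one.
    rows = [{"{#ALARMNAME}": name} for name in sorted(set(alarm_names))]
    # Zabbix 2.x discovery rules expect the rows in a top-level "data" array.
    return json.dumps({"data": rows})


# Example alarm names are made up for illustration.
print(build_discovery_payload(["Host connection state", "Datastore usage"]))
```

Zabbix then creates one item per row from the item prototypes of the discovery rule, which is what makes adding or deleting alarms in the vCenter work without manual changes.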
So the alerting runs through the following steps:
A problem occurred when sending the data directly after the discovery, because the Zabbix server process syncs the items it processes from the database only every 2 minutes. So if a new alarm is triggered, a new item is created which is not yet cached by the Zabbix server process. But the Zabbix server process has to know about the item before it is able to process data for it. So the alert script tries to send the data multiple times (with a timeout in between), to make sure the data is sent again once the Zabbix server process should know about the item and is able to process the data. If it fails multiple times, the alert handler process exits without sending the data, to avoid being stuck in an endless loop. Otherwise, if multiple processes were stuck in an endless loop, the vCenter would eventually crash, and we do not want that 😉
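The retry behaviour described above can be sketched roughly like this. Again this is Python for illustration only, not the code of the alert handler; the function name, the number of attempts and the delay are assumptions, picked so that the waiting time adds up to about two minutes.

```python
import time


def send_with_retry(send, attempts=9, delay_seconds=15.0, sleep=time.sleep):
    """Call send() until it reports success or the attempts run out.

    Returns True on success and False when giving up, so the caller can
    exit instead of looping forever.
    """
    for attempt in range(1, attempts + 1):
        if send():
            return True
        # Wait before the next try, but not after the last one.
        if attempt < attempts:
            sleep(delay_seconds)
    return False


# Simulated sender: fails twice (item not cached yet), then succeeds.
results = iter([False, False, True])
print(send_with_retry(lambda: next(results), sleep=lambda seconds: None))
```

The hard upper bound on attempts is the important part: a handler that gives up loses one value, while a handler that never gives up piles up processes on the vCenter.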
To send the data to Zabbix, the alert handler makes use of the Zabbix agent configuration file and the zabbix sender utility, so the Zabbix agent and the zabbix sender utility have to be installed on the vCenter.
Another prerequisite is that the vSphere PowerCLI is installed on the system, because several functions of the vSphere PowerCLI are used.
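For the zabbix sender call mentioned above, a hedged sketch of how such a command line can be put together looks like this. The flags `-c` (agent configuration file), `-k` (item key) and `-o` (value) are standard `zabbix_sender` options; the item key, the configuration path and the helper names are made up for this example, and treating exit code 0 as success is an assumption you should check against your Zabbix version.

```python
import subprocess


def build_sender_command(config_path, item_key, value, sender="zabbix_sender"):
    """Return the argument list for one zabbix_sender call."""
    # With -c the server address and host name are read from the agent
    # configuration file, so they do not have to be repeated here.
    return [sender, "-c", config_path, "-k", item_key, "-o", str(value)]


def send_value(config_path, item_key, value):
    """Run zabbix_sender; assumes exit code 0 means the value was accepted."""
    command = build_sender_command(config_path, item_key, value)
    return subprocess.run(command).returncode == 0


# Hypothetical path and item key, printed only (nothing is sent here).
print(build_sender_command(r"C:\zabbix\zabbix_agentd.conf",
                           "vcenter.alarm[Host connection state]", 2))
```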
The package for the vCenter monitoring is attached to this post and consists of the following files:
A readme file, which contains nearly the same content as this blog post.
A batch file, which is just a wrapper that is run when a vCenter alarm is triggered and calls the real alarm handler.
For debugging purposes the output can be redirected to a file -> see the batch file for an example!
alarmHandler.ps1, the real alarm handler, which forwards the data to Zabbix. The alarm handler gathers the data from specific environment variables and first sends a discovery to Zabbix. After the discovery has been sent, the alarm handler tries to forward the data itself to Zabbix. If this is not successful, the alarm handler keeps trying to send the data for about 2 minutes. Specific settings can be changed in the alarmHandler.ps1 script. By default the variables should be set correctly, so there is no need to change them.
updateRunCommandAlarm.ps1, a script that updates all the alarms in the vCenter to use the alarm handler, so there is no need to add it by hand. By default the alarm is set to “repeat”, so alarms are sent periodically.
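The alarm handler described above reads its input from environment variables that the vCenter sets for alarm scripts. A small sketch of that step, in Python for illustration: the `VMWARE_ALARM_*` names below are the ones vCenter is documented to provide, to the best of my knowledge, but which of them alarmHandler.ps1 actually uses is an assumption, so check the script itself.

```python
import os

# Assumed subset of the variables vCenter passes to an alarm script.
ALARM_VARIABLES = (
    "VMWARE_ALARM_NAME",
    "VMWARE_ALARM_TARGET_NAME",
    "VMWARE_ALARM_OLDSTATUS",
    "VMWARE_ALARM_NEWSTATUS",
)


def read_alarm(environ=None):
    """Collect the alarm fields from the environment (empty if missing)."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name, "") for name in ALARM_VARIABLES}


# Simulated environment, as it might look when an alarm turns red.
print(read_alarm({"VMWARE_ALARM_NAME": "Host connection state",
                  "VMWARE_ALARM_OLDSTATUS": "Green",
                  "VMWARE_ALARM_NEWSTATUS": "Red"}))
```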
So let me summarize:
To use the scripts, the following steps have to be taken:
- make sure that the vSphere PowerCLI is installed
- make sure that the Zabbix agent is installed and its configuration file is correct
- copy the attached zip file to the vCenter and extract it to a path of your choice
- adapt the paths in the batch and PowerShell scripts to match your path
- run updateRunCommandAlarm.ps1 to attach the alert handler to all alarms in your vCenter
- import the attached vCenter template or create your own discovery to get the alarms into your Zabbix
Now everything should be set up so that you get all vCenter alarms dynamically into your Zabbix 😉
vCenter Alerting Scripts: vCenterAlerting
vCenter Template: TMPL-NIS-Service-VMware-vCenter-WIN