As it has been some time since my last post about a solution on how to get vCenter alarms to Zabbix, and VMware also evolved I followed a new approach on that topic as my initial post only supports Windows vCenters. Furthermore the solution is not as stable as I wished that it would be, so my new approach is to query all alarms from a vCenter via it’s SDK.
Initially all the alarms are discovered and created in Zabbix and in a second step the values for the discovered alarms are polled.
Currently the script used the data center object of the vCenter to discover alarms, so it can’t be used on a standalone ESXi-Server. However – if the code is changed to use whatever object is needed to get the alarms directly from the ESXi-server it should also be possible to get alarms directly from a server without the need of a vCenter (but I didn’t implement that till now as there wasn’t the need/time).
vCenter alarms – SDK (tested with ESXi 6.0+ and Zabbix 3.0 on RHEL 7)
To install the vCenter alarms the attached zip needs to be downloaded and the VMware Perl SDK must be installed on the Zabbix Server.
The template needs to be imported into Zabbix and the vCenter username and password need to be set in the username/password macros of the template.
The other two files (vcenterAlarms.pl & vcenterAlarms.wrapper) need to be extracted to the externalscripts folder of the Zabbix Server. The wrapper script is just a shell script that is executed by a Zabbix item to call the per script and send the that to Zabbix via Zabbix Sender. As the VMware API is quite slow the wrapper also starts itself again with NOHUP because otherwise the timeout defined in the Zabbix Server configuration would cause an exit of the script. For my setup it always took longer than 30 Seconds till tall data where gathered and therefor the Zabbix Server would kill the script in the middle of the execution and no data would be sent to Zabbix. That’s why I added this workaround. Furthermore it also checks if there are less than eleven vcenterAlarms.wrapper processes running, and only starts if there are less, to ensure that Zabbix does not spawn hundreds of NOHUP-processes.
Sometimes you run into the problem, that you have a host which had a template attached but somebody wanted to replace the template or something like that and unfortunately hit just “Unlink” instead of “Unlink and Clear” and all the items are still in the host.
If you have only one host it’s normaly no problem to delete all items per hand, but if you have multiple of those hosts it’s quite some work do remove the old items.
Solution nr. one would be tu use the filters to select all items in a specified host group and delte those items, but the applications, discovery rules and so on will still remain in the hosts and have to be deleted in a 2nd/3rd step.
See the Screenshot below:
My preferred solution for this problem is a simple regex based find/replace with Notepad++.
Herefor an export of the affected hosts is needed. The xml-file could be opend with NPP and the following regexes are needed for find/replace (CTRL+H) to remove the unwanted items.
In the above example multiple regexes with multiple replace-patterns are used to replace the items, discovery rules, triggers an inventory and reset it.
Lately I was asked to help to upgrade Zabbix from 1.8 to 2.2 in a project. It wasn’t a problem to upgrade the templates – that was easily done with a xml-export/import but the hosts where kind of a challenge because the exported xml-files for the hosts itself pretty differs between 1.8 and 2.2.
Because i already had the PhpZabbixApi (https://github.com/confirm/PhpZabbixApi/blob/master/README.md) installed on the tared system i decided to write a little script which pareses the 1.8-host export and creates the hosts in 2.2. The script inc. the lib is attached at the end of the post.
I tested the script with Zabbix 1.8.6->2.2.10 and everything worked fine. Currently the script is capable of creating the hosts (with Zabbix-agent & SNMP-interface), creating the host groups and adding the hosts to the correct host group and also linking the correct templates to the host. However, the templates need to be already available on the target system to be linked correctly.
After extracting the script on the target Zabbix server the xml-import from the old system needs to be uploaded into the same directory as the script (scp) and the login data for Zabbix need to be adapted in the script. Afterwards the import can be started from a bash via:
Some time ago i wrote a post on how to forward vCenter alarms to Zabbix ( https://blog.fawcs.info/2015/05/getting-vcenter-alarms-to-zabbix/) and I have to admit, that this solutions is kind of a pain in the ass. I’m getting the alarm info from environmental varaibles which are automatically set by the vCenter when an alarm changes its status, but it seems that there is a “littel” problem with “overlapping” alarms. For example if there are occuring multiple alarms within a short period only the first alarm will be forwarded to zabbix, but non of the follwoing alarms. Besides that this is not an ideal solution I personally do not like my former approach because it’s an event driven approach. So if one event goes missing we have an inconsistent system :/
It’s quite some time since I wanted to redesign the solution and now I’m finally having some time ( and the pressure) to do so. 🙂
The new approach is based on using userparameters to execute a powershellscript on the vCenter to discover all active alarms and create items in Zabbix. At the moment I’m creating three item prototyes. One for the Timestamp when the alarm became active, another item for the acknowledged-state of the alarm and the last one for the severity of the alarm.
There are two userparemeters which run two powershell scripts. The first one (vcenter.alarm.polling.discovery.ps1) does the discovery and the second one (vcenter.alarm.polling.itemdata.ps1) is to get the data for the discoverd items.
There are also three triggers (one for each severity gray, yellow, red) which will be active als long as the alarm is not acknowledged.
You can download the scripts, userparameters and the template down below:
Ther can occure problems if there are different addresses used to connect to the vcenter (eg. 127.0.0.1, loclahost, vcenterhostname, …)
It seems that the vCenter creates a sperate datacenter instance for every connection, so if you use the three examples from abovve you will end up creating three instances and mess up the script.
If special characters want to be passed to the powershellscript (e.g. special chars in passwords ord login with firstname.lastname@example.org) the “UnsafeUserParameters=1” – parameter from the zabbix-agent.conf needs to be set to 1. (default value is 0)
Wouldn’t it be cool to monitor your home? For example all your devices, but also temperature and other sensors an have all that data accessible via a web interfaces?
I think it would so, i thought about setting up Zabbix for home monitoring, but on the RaPi B and B+ it’s not the most performant setup, So i decided to try it again with the PI2.
This post provides a short log on how I set it up.
At first we have to download the source from Zabbix’ SF-page because there is no official package for the ARM-architecture available.
VMware is a relay nice product, but there is one little problem. It’s realy hard to monitor VMware products with SNMP or any other “old school” technologies.
The actual problem is to get an alarm in Zabbix if there occures an error on the vCenter. So Zabbix is used as an umbrella monitoring for the whole environment.
All this could also be done with SNMP-Traps what would be a lot easier – at first appereance, but Zabbix is … how do I say … not the best tool to monitor events. It’s designed to monitor statuses.
So it’s designed to continuously monitor as specific value – if this value raises over a defined alert-value an alert is displayed and when it falls below the value the problem disappears.
With events there is the problem that we get only one single value which describes the error. So firstly we have to analyze the received value/message and secondly – how do we know when the problem is okay again? And thats one of the design flaws of Zabbix – you do not have any possibilty to reset such events to “OK” if such an event happend.
So we need to monitor the vCenter alarms, because this alerts are raised if an problem occures and disappear if the problem changes to OK again.
So how do we get all the vCenter alarms to zabbix? I don’t want to copy/create all the alarms by hand because its a dynamic environment and alarms could be added or deleted, so the system has to “import” the alarms “on the fly” from the vCenter.
Since Zabbix 2.0 there exist discovery rules which are kind of helpful to import dynamic values. So I’m using a discovery to peridodically pull the data from the vCenter and create an item for every alarm. All the alarms in the vCenter need to be configured to run a custom alarm when an alarm becomes active which sends the current status to zabbix and voilá – we are done.
Continue reading Getting vCenter alarms to Zabbix