VMware is a relay nice product, but there is one little problem. It’s realy hard to monitor VMware products with SNMP or any other “old school” technologies.
The actual problem is to get an alarm in Zabbix if there occures an error on the vCenter. So Zabbix is used as an umbrella monitoring for the whole environment.
All this could also be done with SNMP-Traps what would be a lot easier – at first appereance, but Zabbix is … how do I say … not the best tool to monitor events. It’s designed to monitor statuses.
So it’s designed to continuously monitor as specific value – if this value raises over a defined alert-value an alert is displayed and when it falls below the value the problem disappears.
With events there is the problem that we get only one single value which describes the error. So firstly we have to analyze the received value/message and secondly – how do we know when the problem is okay again? And thats one of the design flaws of Zabbix – you do not have any possibilty to reset such events to “OK” if such an event happend.
So we need to monitor the vCenter alarms, because this alerts are raised if an problem occures and disappear if the problem changes to OK again.
So how do we get all the vCenter alarms to zabbix? I don’t want to copy/create all the alarms by hand because its a dynamic environment and alarms could be added or deleted, so the system has to “import” the alarms “on the fly” from the vCenter.
Since Zabbix 2.0 there exist discovery rules which are kind of helpful to import dynamic values. So I’m using a discovery to peridodically pull the data from the vCenter and create an item for every alarm. All the alarms in the vCenter need to be configured to run a custom alarm when an alarm becomes active which sends the current status to zabbix and voilá – we are done.