All posts by Fawcs

The author works as an IT systems engineer for an Austrian company and has specialized in Linux (RHEL), deployment and monitoring, but also works with VMware, Windows, Cisco, ...

iPXE Network boot that supports Virtualbox VMs

iPXE is pretty nice when it comes to network booting computers, as it offers lots of scripting functionality at a very early stage of the deployment: it can be configured to load an iPXE script from a web server. The script provided by the web server can itself be created dynamically, with any scripting language of your choice, depending on the parameters that get handed over.

That makes it possible to automatically roll out systems that have been specified in an inventory. If a machine cannot be found in the inventory, you could provide a menu where a user can manually choose what should be done, and lots more.
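As a rough illustration of the idea, the following Python sketch builds such a script on the server side. The inventory dictionary, the function name and the installer URL path are made up for this example; it also assumes the client passes its MAC address as a parameter (for instance by requesting script.ipxe?mac=${mac}).

```python
# Hypothetical inventory: MAC address -> installer script URL.
# A real setup would query a CMDB or database instead.
INVENTORY = {
    "52:54:00:12:34:56": "http://10.16.96.16/rhel9/install.ipxe",
}


def render_ipxe_script(mac: str, inventory: dict = INVENTORY) -> str:
    """Return the iPXE script for the client with the given MAC address."""
    target = inventory.get(mac.lower())
    if target is not None:
        # Known machine: hand over to its installer script without asking.
        return f"#!ipxe\nchain {target}\n"
    # Unknown machine: show a menu and let the user decide.
    return "\n".join([
        "#!ipxe",
        "menu Machine not found in inventory",
        "item shell Drop to iPXE shell",
        "item local Boot from local disk",
        "choose target && goto ${target}",
        ":shell",
        "shell",
        ":local",
        "exit",
        "",
    ])
```

Any web framework or CGI script could return the resulting string as the response body for the script URL.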

This is not a problem for bare-metal machines and most VMs. However, Oracle also seems to have discovered the advantages of iPXE for VirtualBox, and every VirtualBox VM initially loads a built-in iPXE.
As that is an iPXE binary with very limited capabilities, this can cause issues when trying to PXE boot an Oracle VirtualBox VM via iPXE: the DHCP server receives the iPXE identifier from Oracle's built-in iPXE binary and therefore treats the client as if it were already running the iPXE binary that the DHCP server would normally hand out first.

If the iPXE script, which is loaded in the second stage, uses the console command, the deployment will halt, as that functionality is not supported by Oracle's iPXE binary.

To work around this problem, we can modify the user-class identifier sent by our iPXE binary so that it is something other than the default “iPXE” string, and use that to make sure that our DHCP server always provides our iPXE binary when a new client tries to PXE boot.

To change the “iPXE” user-class string to a custom string, check out the iPXE repository and open the file “src/net/udp/dhcp.c”.

The interesting part is somewhere down at line 90 in the file:

If you don’t want to change too much code, just change any character to something else:

e.g.

DHCP_USER_CLASS_ID, DHCP_STRING ( 'i', 'P', 'X', 'E' ),
TO
DHCP_USER_CLASS_ID, DHCP_STRING ( 'x', 'P', 'X', 'E' ),
or
DHCP_USER_CLASS_ID, DHCP_STRING ( 'C', 'S', 'T', 'M' ),

Whatever is defined there will be the new user-class identifier, which can be used to determine whether our custom iPXE was loaded or whether an iPXE binary from another vendor is in use.

Once the change is made, iPXE needs to be recompiled and the resulting binary copied to the TFTP server's directory.

The check for the custom user-class identifier in dnsmasq would look like this (with CSTM as the user-class identifier):

...

# Boot for iPXE. The idea is to send two different
# filenames, the first loads iPXE, and the second tells iPXE what to
# load. The dhcp-userclass line sets the ipxe tag for requests from our custom iPXE.
dhcp-boot=ipxe.efi
dhcp-userclass=set:ipxe,CSTM
dhcp-boot=tag:ipxe,http://10.16.96.16/script.ipxe

...
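To make the effect of these lines easier to follow, here is a small Python sketch of the decision dnsmasq makes for each request. It is only a model for illustration, not dnsmasq code; the function name is made up, and the substring comparison mirrors the matching the dnsmasq documentation describes for dhcp-userclass.

```python
CUSTOM_USER_CLASS = "CSTM"  # must match the string compiled into our iPXE


def boot_file_for(user_class):
    """Model which boot file a client gets, based on its DHCP user class."""
    if user_class is not None and CUSTOM_USER_CLASS in user_class:
        # Our own iPXE build is already running: send the script URL.
        return "http://10.16.96.16/script.ipxe"
    # Plain PXE firmware, or a vendor iPXE that still announces "iPXE":
    # send our own binary first.
    return "ipxe.efi"
```

With this setup the built-in VirtualBox iPXE, which announces itself as "iPXE", is treated like any other PXE client and first chainloads the custom binary.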

A nice example of how dnsmasq can be configured for iPXE can be found in the iPXE forum.

Simple script to test whether the change works properly:

#!ipxe
console --x 1024 --y 768
dhcp
config

VM stuck in invalid state after export

After starting to export a VM and cancelling the export, it can happen that the VM is no longer responsive (no start, unregister, delete, … possible).

The hostd.log (in the /var/log directory on the ESXi) will show an error similar to:

2021-09-07T10:55:19.583Z error hostd[2099544] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/XXXXXXXX-05c01598-574e-88d7f6d5ef52/myvm/myvm.vmx opID=esxui-6c72-aafe user=root] Invalid transition requested (VM_STATE_EXPORTING -> VM_STATE_DELETING): Invalid state
2021-09-07T10:55:19.583Z warning hostd[2099544] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/XXXXXXXX-05c01598-574e-88d7f6d5ef52/myvm/myvm.vmx opID=esxui-6c72-aafe user=root] Method fault exception during VM destroy: Fault cause: vim.fault.InvalidPowerState

It seems the VM is somehow stuck in the exporting state, and therefore no other operation is possible on the VM.

As a workaround, log in to the ESXi host and restart the management agents (https://kb.vmware.com/s/article/1003490).

Bash – Monitor directory size for change

A simple bash-script to easily monitor if a directory has grown or shrunk in size:

oldSize=0
while true; do
    result=$(du -s -- * | grep -E "bitcoin-0.21.1$")
    echo -e "\e[95m$result\e[0m"
    curSize=$(echo "$result" | cut -f1)
    if [ "$curSize" -lt "$oldSize" ]; then
        echo -e "\e[92mShrunk: $curSize\e[0m"
    elif [ "$curSize" -gt "$oldSize" ]; then
        echo -e "\e[91mGrown: $curSize\e[0m"
    fi
    oldSize=$curSize
    sleep 5
done

The script needs to be executed in the parent directory of the monitored directory, and the directory name must be adapted: replace bitcoin-0.21.1$ with whatever you want to grep for.
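The same idea can be sketched in Python for systems without du or bash. This is an illustrative variant rather than a drop-in replacement: the function names are made up, and it sums file sizes in bytes instead of the disk blocks that du reports, so the absolute numbers will differ.

```python
import os
import time


def dir_size(path):
    """Total size in bytes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # File vanished between listing and stat: ignore it.
                pass
    return total


def describe_change(old, new):
    """Classify the change between two size measurements."""
    if new < old:
        return f"Shrunk: {new}"
    if new > old:
        return f"Grown: {new}"
    return f"Unchanged: {new}"


def monitor(path, interval=5.0):
    """Print the size change of path every interval seconds, forever."""
    old = dir_size(path)
    while True:
        time.sleep(interval)
        new = dir_size(path)
        print(describe_change(old, new))
        old = new
```

Calling monitor("bitcoin-0.21.1") from the parent directory would then watch the same directory as the bash loop.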

Start PS-Admin session from an unprivileged user

 runas  /user:administrator 'powershell -command Start-Process powershell -verb runas' 

PowerCLI batch revert to Snapshot

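The following scriptlet can be used to revert a bunch of VMs (here: all VMs in a folder) to the most recently created snapshot of each VM. To revert to the first snapshot ever taken instead, remove -Descending from the Sort-Object call.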

$vms = Get-Folder "MY_VM_FOLDER_NAME" | Get-VM
 
foreach($vm in $vms)
{
    $vm.name
    $snap = Get-Snapshot -VM $vm | Sort-Object -Property Created -Descending | Select -First 1
    Set-VM -VM $vm -SnapShot $snap -Confirm:$false
    echo "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
}

Mikrotik packet loss (Ping <70%)

Today I experienced an interesting story with my Mikrotik router at home. While updating the PiHole instance, the system had quite some problems obtaining both the system updates and the PiHole update packages. A ping to 9.9.9.9 showed that the Raspberry (on which the PiHole is hosted) had somewhere between 70% and 80% packet loss. Pinging the same IP from a Windows machine resulted in 0% packet loss.

All this happened only on wired connections and did not cause any problems when connected via wifi.

As the Raspberry is directly attached to the Mikrotik router, I also tried connecting it via a switch in between, as that is the setup for the Windows machine. Same behavior.
Running pings from three Linux systems and two Windows systems in the network (swapping the connections so that each was tested both directly attached to the Mikrotik and through the switch) came up with an interesting result:
All Windows machines had hardly any packet loss in 10 minutes (<3%), and all the Linux systems had somewhere between 70% and 80% packet loss (tested with a ping).
Any ping that involved the Mikrotik router's L2 functionality seemed to result in packet loss on the Linux machines.
Pinging any other machine on the same subnet worked without problems, but as soon as there was one hop in between, the problem occurred.

Interestingly, the problem vanished as soon as the Torch tool was activated, and no more packet loss occurred on any of the systems.

After some additional troubleshooting time (and disabling nearly all of the Mikrotik configuration, i.e. firewall rules and interfaces), the problem turned out to be with the bridge interface used. It seems that having the IP firewall deactivated for the bridge interface caused the problem. After enabling it, the behavior vanished and none of the systems had packet loss issues any longer.

[admin@MikroTik] /interface bridge settings> /interface bridge settings 
[admin@MikroTik] /interface bridge settings> set use-ip-firewall yes    

Monitoring Tasmota with Zabbix

The repository contains a Python script which can be used to subscribe to an MQTT server (e.g. mosquitto) and forward all MQTT messages to Zabbix. There, the template can be used to monitor the values sent via MQTT. Currently the template is mainly intended as an example and needs to be adapted to your current Tasmota setup/configuration.

https://gitlab.com/fawcs/zabbix_tasmota_mqtt

CVSS-Regex

Just a little regex to validate CVSS vector strings (base metrics only):

^CVSS\:\d\.\d\/AV\:[NALP]\/AC\:[LH]\/PR\:[NLH]\/UI\:[NR]\/S\:[UC]\/C\:[NLH]\/I\:[NLH]\/A\:[NLH]$
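A quick way to try the pattern is a few lines of Python. This is just a sketch: the function name is made up, and the character classes are written without commas, because inside [...] a comma is treated as one more allowed character.

```python
import re

# Base metrics only, in the fixed order AV, AC, PR, UI, S, C, I, A.
CVSS3_VECTOR = re.compile(
    r"CVSS:\d\.\d/AV:[NALP]/AC:[LH]/PR:[NLH]/UI:[NR]"
    r"/S:[UC]/C:[NLH]/I:[NLH]/A:[NLH]"
)


def is_valid_cvss_vector(vector):
    """True if the whole string is a CVSS v3-style base vector."""
    return CVSS3_VECTOR.fullmatch(vector) is not None
```

Note that the pattern accepts any single-digit version pair (not only 3.0 and 3.1) and rejects vectors that carry temporal or environmental metrics.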

Additional JS scripts for CVSS calculation:

https://www.first.org/cvss/v3.1/use-design

Batch import beats dashboards

cd /usr/share/
for BEAT in $(ls | grep -e "beat$"); do
echo -e "\e[92mBEAT: $BEAT\e[0m"
./$BEAT/bin/$BEAT setup --pipelines -c /etc/$BEAT/$BEAT.yml.* -path.home /usr/share/$BEAT/
./$BEAT/bin/$BEAT setup --dashboards -c /etc/$BEAT/$BEAT.yml.* -path.home /usr/share/$BEAT/
done

Zabbix SELinux policy generation

Commands to query the audit log for Zabbix-relevant entries and to create and import a compiled policy file for Zabbix.

They could be adapted to generate policies for any other system.

The suggestion is to set SELinux to permissive (setenforce 0), execute the action, and afterwards create the policy based on the logged events. If the policy does not work on the first try after re-enabling SELinux, it can happen that a call is blocked (which is also logged in the audit log) that was not blocked while SELinux was in permissive mode. In that case it can help to create a new human-readable policy (.te file), compare the first version with the second one, and merge them.

filename=zabbix-server
# Build a human-readable policy (.te) from all Zabbix-related audit events
grep zabbix /var/log/audit/audit.log | audit2allow -m "$filename" > "$filename.te"
# Compile the module, package it and load it
checkmodule -M -m -o "$filename.mod" "$filename.te"
semodule_package -o "$filename.pp" -m "$filename.mod"
semodule -i "$filename.pp"

#restorecon -R -v /run/zabbix/zabbix_server_alerter.sock    #suggested by the policygenerator
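For comparing a first and a second version of the generated .te file, a small helper along the following lines might be handy. It is a rough sketch with made-up function names: it only looks at single-line allow rules, and it does not merge the require block or the permission sets of otherwise identical rules, so the final merge still has to be done by hand.

```python
def allow_rules(te_text):
    """Return the set of allow rules found in the text of a .te file."""
    rules = set()
    for line in te_text.splitlines():
        line = line.strip()
        if line.startswith("allow "):
            # Normalise whitespace so formatting differences do not matter.
            rules.add(" ".join(line.split()))
    return rules


def missing_rules(first_te, second_te):
    """Rules that the second version contains but the first one lacks."""
    return sorted(allow_rules(second_te) - allow_rules(first_te))
```

Reading both files with open(...).read() and printing missing_rules(first, second) lists the rules that still need to be carried over into the first policy.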