How to properly check the availability of a host and some of its items
This article covers:
- Availability monitoring by DNS name
- EXE monitoring
- HDD monitoring
- Forecasting function Timeleft
Do you remember the old Soviet cartoon about the python, the elephant, the monkey and the parrot? The python wanted to know how long he was. The monkey said the python is 4/4 of a python, but the python disagreed. Then they asked the parrot, and the parrot said the python is 38.25 parrots long. The python was happy, and now, whenever I don't know what units to measure something in, I put parrots, or ppg.
Availability often has to be monitored not from the Zabbix server, but from one host to another: host1 -> host2:port
In 2012 I ordered an exe file from a programmer. That exe just tried to reach a port from a given host, for example to check that the SQL server is available from the web server. Nowadays you can simply use the Zabbix item net.tcp.service, just don't forget the units: PPG. It appears that net.tcp.service has been available since version 2. Some people don't know that net.tcp.service can check a remote IP and port too. In most cases this is enough, and what it actually shows is whether the port is open from a particular host. You can also add a PowerShell check to the Zabbix agent and get a similar result. Add this line to the agent configuration:
UserParameter=CheckDnsAndPort,powershell.exe -NoProfile -ExecutionPolicy bypass Test-NetConnection PYTHON-WEB.CFLA.GOV.LV -Port 443 -InformationLevel Quiet
The -InformationLevel Quiet parameter at the end makes the command return a single word, "True" or "False".
It turns out that net.tcp.service can check by DNS name as well. I found this out only because I was preparing a presentation and double-checked everything I had written. I don't know when this feature appeared, but it works, so all my previous work, the sophisticated scripts and the clever PowerShell, goes in the trash. I found no reference to it in the documentation, so maybe the Zabbix guys don't know about it or forgot. See the simple checks manual: https://www.zabbix.com/documentation/4.0/en/manual/config/items/itemtypes/simple_checks
The manual gives the syntax as net.tcp.service[service,<ip>,<port>]: https://www.zabbix.com/documentation/6.0/en/manual/appendix/items/supported_by_platform
Only <ip> is mentioned there. I tried a DNS name out of curiosity and it worked! Since checking by DNS name is not documented, I am not sure whether it will disappear one day the same way it appeared.
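To make clear what such a check answers, here is a minimal Python sketch of "is host:port reachable from this machine, by DNS name". It is not how Zabbix or Test-NetConnection are implemented; the function name tcp_port_open and the 3-second timeout are my own assumptions, and the 1/0 result only mirrors the up/down values that net.tcp.service reports.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> int:
    """Return 1 if a TCP connection to host:port succeeds, otherwise 0.

    tcp_port_open is a hypothetical helper for illustration only.
    """
    try:
        # create_connection resolves the name (DNS or IP) and then connects.
        with socket.create_connection((host, port), timeout=timeout):
            return 1
    except OSError:
        # Covers DNS resolution failures, refused connections and timeouts.
        return 0


# Example call, using the host name from the UserParameter line:
# tcp_port_open("PYTHON-WEB.CFLA.GOV.LV", 443)
```

Run from host1, the result describes the path host1 -> host2:port, which is the point of the whole exercise: the same check run from the Zabbix server could give a different answer.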
Zabbix does what? EXE file monitoring
Last year, after the Zabbix conference, I met an old colleague and told him I was monitoring the RAM usage of an EXE from Zabbix. He said «What?» like Eminem, so I figured out that a lot of admins don't know all the capabilities of Zabbix. 12 years ago it was hard work to enumerate all processes with procmon and add a custom config; now it is just a standard template.
Don't forget to add a custom multiplier to get proper bytes.
True or false: does Microsoft SQL eat all the RAM? As you can see in the picture, there are 32 GB of RAM, but Microsoft SQL does not claim more than 22 GB. It keeps what it has got but does not take more.
HDD monitoring
Some admins think they can install the out-of-the-box template and it will work fine. NO! If the dashboard shows only space usage, we can get the wrong impression about a volume. As you can see, there is an alert, "Disk space is low", at the >90% red line. Zabbix uses a dynamic (delta) view, but if we dive deeper and monitor total space and used space together, we see that disk D (the red line) is filling very slowly, so we can leave it alone. The admin overreacted by buying 300 GB of expensive DATACENTER storage. In Azure it will cost you $1470, in AWS $1008. Check for yourselves: https://azure.microsoft.com/en-us/pricing/details/managed-disks/ and https://aws.amazon.com/ebs/pricing/
In the next section I will explain how I calculated that my admin overreacted and that the disk will be full only in 7 years.
Predictive functions
Zabbix has 2 predictive functions: Timeleft and Forecast. I use only Timeleft.
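The idea behind Timeleft can be sketched in a few lines of Python: fit a line through recent samples and see when it crosses a threshold. This is only an illustration under a plain linear-fit assumption, not Zabbix's own code (Zabbix also offers other fit types); the function name time_left and the sample data are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def time_left(samples: Sequence[Tuple[float, float]],
              threshold: float) -> Optional[float]:
    """Seconds from the last sample until a least-squares line through
    (timestamp, value) samples reaches threshold.

    Returns None when the trend is flat or never reaches the threshold.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples share one timestamp, no trend to fit
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    if slope == 0:
        return None  # flat line never reaches a different value
    intercept = mean_v - slope * mean_t
    t_hit = (threshold - intercept) / slope
    remaining = t_hit - samples[-1][0]
    return remaining if remaining >= 0 else None


# Hypothetical free-space readings: (seconds, GB free), losing 1 GB per second.
readings = [(0, 100.0), (10, 90.0), (20, 80.0)]
print(time_left(readings, threshold=0.0))  # 80.0 seconds until the disk is full
```

With real data the samples would be free-space values over days or weeks, and the answer comes out in parrots of your choice: seconds, days or years.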
You can add an alert and get notified like this.
Conclusion
To monitor server availability better:
- Use a DNS name check from a remote host to a port instead of an IP check from the Zabbix server
- Use Timeleft instead of a "disk almost full" alert
- Use EXE file monitoring instead of service state
There is an article with formulas that only makes my headache worse:
https://www.zabbix.com/documentation/5.0/assets/en/manual/config/triggers/prediction_docs.pdf
You don't have to use the formulas. I made 2 templates you can use to work with Timeleft and the DNS check. Here you will find a Windows template to monitor EXE, DNS and Timeleft, and a Linux template to check Timeleft for the /, /home and /root volumes.
https://github.com/giorsgeks/ZebbixTemplates/tree/main