PVE (Proxmox Virtual Environment): Monitoring Whether Virtual Machines Are Alive

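In some setups a PVE virtual machine shuts itself down, and I never managed to track down the exact cause. Rather than keep chasing it, I work around the problem at the root: monitor each VM's state and, as soon as a VM is found to be down, bring it back up with qm start.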

The monitoring script:

#!/usr/bin/env bash

# Ping a VM once; if it does not answer, stop it and start it again.
check_and_restart() {
    local vm_id="${1}"
    local vm_ip="${2}"
    # Alternative probe for VMs that serve HTTP:
    # curl --connect-timeout 5 -sSL "${vm_ip}" > /dev/null
    if ! ping -c 1 "${vm_ip}" > /dev/null; then
        local now
        now=$(date '+%a %Y-%m-%d %H:%M:%S %Z')
        echo "[${now}] [NO] id = ${vm_id}, ip = ${vm_ip}"
        # Stop first in case the VM is hung rather than powered off, then start it.
        /usr/sbin/qm stop "${vm_id}"
        /usr/sbin/qm start "${vm_id}"
    else
        echo "VM ${vm_id} is running!"
    fi
}

main() {
    local vm_list="${1}"
    local each vm_id vm_ip
    # Left unquoted on purpose so the list splits on whitespace.
    for each in ${vm_list}; do
        vm_id="${each%%:*}"   # part before the first colon
        vm_ip="${each#*:}"    # part after the first colon
        check_and_restart "${vm_id}" "${vm_ip}"
    done
}

# VMs to check, one per line, in the form vm_id:vm_ip
vm_list="
100:192.168.1.2
101:192.168.1.3
103:192.168.1.4
102:192.168.1.5
"

# Print the time of this run
date '+%a %Y-%m-%d %H:%M:%S %Z'
main "${vm_list}"
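
The script above infers liveness from ping, so a VM that is up but does not answer ICMP would be restarted as well. As a minimal sketch of a variant, the function below asks PVE for the VM state with qm status and only starts VMs that PVE reports as stopped. The function name start_if_stopped and the QM override variable are my own assumptions for illustration, not part of the original script.

```shell
#!/usr/bin/env bash

# Path to the qm CLI. The QM variable is an assumption of this sketch;
# it only exists so the function can be tried on a machine without PVE.
QM="${QM:-/usr/sbin/qm}"

# Start the VM only when PVE itself reports it as stopped.
start_if_stopped() {
    local vm_id="${1}"
    local status
    # `qm status <vmid>` prints a line such as "status: running".
    status=$("${QM}" status "${vm_id}" | awk '/^status:/ { print $2 }')
    if [[ "${status}" == "stopped" ]]; then
        echo "VM ${vm_id} is stopped, starting it"
        "${QM}" start "${vm_id}"
    else
        echo "VM ${vm_id} status: ${status:-unknown}"
    fi
}
```

Calling start_if_stopped 100 for each VM id would replace the ping test. Note that this only catches VMs that are actually powered off; a guest that is hung while PVE still reports it as running would need the ping check as well.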

Save the monitoring script as /root/check, then add it to the crontab with crontab -e:

*/10 * * * * bash /root/check >> /root/log.txt

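Note: if some VMs boot slowly, lengthen the check interval by hand. Otherwise a VM that is still booting will fail the check, get force-stopped and started again, and end up in an endless restart loop.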
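
One way to guard against that loop without stretching the cron interval is to remember when each VM was last restarted and leave it alone for a while afterwards. The following is a minimal sketch under stated assumptions: the state directory /var/tmp/vm-check, the 900-second grace period, and the helper names mark_restarted and in_grace_period are all hypothetical and not part of the original script.

```shell
#!/usr/bin/env bash

# Hypothetical settings: where restart timestamps are kept, and how long
# (in seconds) a freshly restarted VM is left alone.
STATE_DIR="${STATE_DIR:-/var/tmp/vm-check}"
GRACE_SECONDS="${GRACE_SECONDS:-900}"

# Record the current time as the last restart of this VM.
mark_restarted() {
    local vm_id="${1}"
    mkdir -p "${STATE_DIR}"
    date +%s > "${STATE_DIR}/${vm_id}.last_restart"
}

# Succeed (return 0) if the VM was restarted less than GRACE_SECONDS ago.
in_grace_period() {
    local vm_id="${1}"
    local stamp_file="${STATE_DIR}/${vm_id}.last_restart"
    [[ -f "${stamp_file}" ]] || return 1
    local last now
    last=$(<"${stamp_file}")
    now=$(date +%s)
    (( now - last < GRACE_SECONDS ))
}
```

In the monitoring script this would mean returning early when in_grace_period "${vm_id}" succeeds, before qm stop is reached, and calling mark_restarted "${vm_id}" right after qm start. The grace period should be longer than the slowest VM's boot time.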

The resulting log looks like this:

Tue 2024-02-06 09:50:01 CST
VM 100 is running!
VM 101 is running!
VM 103 is running!
VM 102 is running!
Tue 2024-02-06 10:00:01 CST
VM 100 is running!
VM 101 is running!
VM 103 is running!
VM 102 is running!
Tue 2024-02-06 10:10:01 CST
VM 100 is running!
VM 101 is running!
VM 103 is running!
VM 102 is running!
» When reposting, please credit the source: 刺客博客