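Preface

Hello everyone, long time no see. Work and life have kept me busy lately, so I have not posted in a while. Today I found a spare moment to share a ready-to-use script, together with an explanation of what each command in it does. Without further ado, let's get to today's topic: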

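Project requirements

The workstations we purchased keep going down because the CPU and graphics card run too hot. My manager therefore asked me to write a script that monitors the CPU temperature in real time and records abnormal readings on the server.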

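Implementation

1 Installing lm_sensors

A common approach today is to use the lm_sensors tool to find the hottest core of a multi-core CPU on the motherboard.
Script dependency: the lm_sensors tool. Install it with the command below; if the installation fails, switch to a different YUM mirror.
Install command: yum install -y lm_sensors
Notes:
1. The argument after sensors in "sensors coretemp-isa-0000" depends on the actual host.
2. lm_sensors currently cannot read hardware temperatures inside a VMware virtual machine.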
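
Before going further, it can help to confirm that the tool is actually available. The snippet below is a minimal sketch; have_cmd is a hypothetical helper name introduced only for this illustration. Running sensors with no arguments lists every detected chip, which is where names such as coretemp-isa-0000 come from.

```shell
#!/bin/bash
# have_cmd NAME (hypothetical helper): succeed if NAME is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd sensors; then
    # No arguments: print every chip that lm_sensors detected.
    sensors || echo "sensors returned no data (expected inside a VMware guest)" >&2
else
    echo "sensors not found; install it with: yum install -y lm_sensors" >&2
fi
```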

2 What each command in the script does

[root@dana ~]# sensors coretemp-isa-0000
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +33.0°C (high = +83.0°C, crit = +93.0°C)
Core 0: +32.0°C (high = +83.0°C, crit = +93.0°C)
Core 1: +29.0°C (high = +83.0°C, crit = +93.0°C)
Core 2: +29.0°C (high = +83.0°C, crit = +93.0°C)
Core 3: +31.0°C (high = +83.0°C, crit = +93.0°C)
Core 4: +30.0°C (high = +83.0°C, crit = +93.0°C)
Core 5: +30.0°C (high = +83.0°C, crit = +93.0°C)
Core 8: +30.0°C (high = +83.0°C, crit = +93.0°C)
Core 9: +29.0°C (high = +83.0°C, crit = +93.0°C)
Core 10: +28.0°C (high = +83.0°C, crit = +93.0°C)
Core 11: +29.0°C (high = +83.0°C, crit = +93.0°C)
Core 12: +31.0°C (high = +83.0°C, crit = +93.0°C)
Core 13: +28.0°C (high = +83.0°C, crit = +93.0°C)
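This command shows all the temperature information for the CPU. The first temperature on each line is the current reading. high = +83.0°C is the threshold above which the CPU is considered too hot, and crit = +93.0°C is the critical threshold, above which the CPU is at risk of damage.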
[root@dana ~]# sensors coretemp-isa-0000 | tail -n +3
Package id 0: +33.0°C (high = +83.0°C, crit = +93.0°C)
Core 0: +31.0°C (high = +83.0°C, crit = +93.0°C)
Core 1: +27.0°C (high = +83.0°C, crit = +93.0°C)
Core 2: +28.0°C (high = +83.0°C, crit = +93.0°C)
Core 3: +30.0°C (high = +83.0°C, crit = +93.0°C)
Core 4: +29.0°C (high = +83.0°C, crit = +93.0°C)
Core 5: +29.0°C (high = +83.0°C, crit = +93.0°C)
Core 8: +29.0°C (high = +83.0°C, crit = +93.0°C)
Core 9: +28.0°C (high = +83.0°C, crit = +93.0°C)
Core 10: +28.0°C (high = +83.0°C, crit = +93.0°C)
Core 11: +28.0°C (high = +83.0°C, crit = +93.0°C)
Core 12: +31.0°C (high = +83.0°C, crit = +93.0°C)
Core 13: +28.0°C (high = +83.0°C, crit = +93.0°C)
In this command, tail -n +3 prints its input from line 3 to the end, which drops the two header lines (the chip name and the adapter).
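
To see the effect without a real sensor, here is a small sketch that feeds sample text through the same tail call. The sample text is copied from the output above; skip_header is a hypothetical name used only here.

```shell
#!/bin/bash
# skip_header (hypothetical helper): drop the first two lines of stdin.
skip_header() {
    tail -n +3
}

sample='coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +33.0°C (high = +83.0°C, crit = +93.0°C)
Core 0: +32.0°C (high = +83.0°C, crit = +93.0°C)'

# Only the Package and Core lines remain.
printf '%s\n' "$sample" | skip_header
```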
[root@dana ~]# sensors coretemp-isa-0000 | tail -n +3 |tr -s " "
Package id 0: +33.0°C (high = +83.0°C, crit = +93.0°C)
Core 0: +32.0°C (high = +83.0°C, crit = +93.0°C)
Core 1: +29.0°C (high = +83.0°C, crit = +93.0°C)
Core 2: +28.0°C (high = +83.0°C, crit = +93.0°C)
Core 3: +32.0°C (high = +83.0°C, crit = +93.0°C)
Core 4: +30.0°C (high = +83.0°C, crit = +93.0°C)
Core 5: +30.0°C (high = +83.0°C, crit = +93.0°C)
Core 8: +30.0°C (high = +83.0°C, crit = +93.0°C)
Core 9: +29.0°C (high = +83.0°C, crit = +93.0°C)
Core 10: +28.0°C (high = +83.0°C, crit = +93.0°C)
Core 11: +29.0°C (high = +83.0°C, crit = +93.0°C)
Core 12: +31.0°C (high = +83.0°C, crit = +93.0°C)
Core 13: +29.0°C (high = +83.0°C, crit = +93.0°C)
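In this command, tr -s " " squeezes each run of consecutive spaces into a single space. It does not delete the spaces altogether; it only makes the spacing uniform so that later steps see a consistent layout.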
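
A quick way to check what tr -s does is to run it on a line with uneven spacing. This is a minimal sketch; squeeze_spaces is a hypothetical name used only for this illustration.

```shell
#!/bin/bash
# squeeze_spaces (hypothetical helper): collapse runs of spaces into one space.
squeeze_spaces() {
    tr -s ' '
}

# The padded sensors-style line comes out with single spaces between words.
printf 'Core 0:        +32.0°C  (high = +83.0°C, crit = +93.0°C)\n' | squeeze_spaces
```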
[root@dana ~]# sensors coretemp-isa-0000 | tail -n +3 |tr -s " " |awk -F '[°C+]' '{print $0}'
Package id 0: +32.0°C (high = +83.0°C, crit = +93.0°C)
Core 0: +31.0°C (high = +83.0°C, crit = +93.0°C)
Core 1: +27.0°C (high = +83.0°C, crit = +93.0°C)
Core 2: +27.0°C (high = +83.0°C, crit = +93.0°C)
Core 3: +29.0°C (high = +83.0°C, crit = +93.0°C)
Core 4: +28.0°C (high = +83.0°C, crit = +93.0°C)
Core 5: +28.0°C (high = +83.0°C, crit = +93.0°C)
Core 8: +28.0°C (high = +83.0°C, crit = +93.0°C)
Core 9: +27.0°C (high = +83.0°C, crit = +93.0°C)
Core 10: +27.0°C (high = +83.0°C, crit = +93.0°C)
Core 11: +28.0°C (high = +83.0°C, crit = +93.0°C)
Core 12: +30.0°C (high = +83.0°C, crit = +93.0°C)
Core 13: +28.0°C (high = +83.0°C, crit = +93.0°C)
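In this command, awk -F '[°C+]' '{print $0}' treats each of the characters °, C and + as a field separator and prints the whole line ($0), so the output looks unchanged.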
[root@dana ~]# sensors coretemp-isa-0000 | tail -n +3 |tr -s " " |awk -F '[°C+]' '{print $1}'
Package id 0:
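In this command, {print $1} prints only the first field of each line. Only the Package line shows any text: every Core line begins with C, which is itself a separator, so its first field is empty.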
[root@dana ~]# sensors coretemp-isa-0000 | tail -n +3 |tr -s " " |awk -F '[°C+]' '{print $3}'

32.0
28.0
28.0
31.0
29.0
29.0
29.0
29.0
28.0
29.0
31.0
28.0
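In this command, {print $3} prints only the third field of each line, which on the Core lines is the current temperature. The first output line is blank because the Package line does not start with C, so its temperature lands in $2 and its $3 is empty.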
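
The field numbering is easier to follow when the first few fields are printed side by side. The sketch below uses the same separator set on two sample lines; show_fields is a hypothetical name. Only the first three fields are shown, because later field numbers can shift on an awk that treats ° as two bytes rather than one character.

```shell
#!/bin/bash
# show_fields (hypothetical helper): print fields 1-3 in brackets,
# splitting on any single one of the characters °, C and +.
show_fields() {
    awk -F '[°C+]' '{ printf "1=[%s] 2=[%s] 3=[%s]\n", $1, $2, $3 }'
}

printf '%s\n' \
    'Package id 0: +33.0°C (high = +83.0°C, crit = +93.0°C)' \
    'Core 0: +32.0°C (high = +83.0°C, crit = +93.0°C)' | show_fields
```

On the Package line the temperature sits in field 2, while on the Core line the leading C produces an empty field 1 and pushes the temperature to field 3.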
[root@dana ~]# sensors coretemp-isa-0000 | tail -n +3 |tr -s " " |awk -F '[°C+]' '{print $6}'

83.0
83.0
83.0
83.0
83.0
83.0
83.0
83.0
83.0
83.0
83.0
83.0
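In this command, {print $6} prints only the sixth field of each line, which on the Core lines is the high threshold. The first output line is blank because the fields of the Package line are shifted by one.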
[root@dana ~]# sensors coretemp-isa-0000 | tail -n +3 |tr -s " " |awk -F '[°C+]' '{print $9}'

93.0
93.0
93.0
93.0
93.0
93.0
93.0
93.0
93.0
93.0
93.0
93.0
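In this command, {print $9} prints only the ninth field of each line, which on the Core lines is the crit threshold. Again, the first output line is blank because the fields of the Package line are shifted by one.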
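
As an alternative sketch (not the method used in this post), the three values on each line can be collected without relying on single-character separators: split on whitespace, keep the fields that start with a plus sign followed by a digit, and strip everything except digits and the decimal point. The name temps_per_line is hypothetical.

```shell
#!/bin/bash
# temps_per_line (hypothetical helper): for each input line print
# "current high crit" as plain numbers separated by single spaces.
temps_per_line() {
    awk '{
        n = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[+][0-9]/) {
                gsub(/[^0-9.]/, "", $i)              # drop +, °, C and punctuation
                printf "%s%s", (n++ ? " " : ""), $i
            }
        }
        print ""
    }'
}

printf '%s\n' \
    'Package id 0: +33.0°C (high = +83.0°C, crit = +93.0°C)' \
    'Core 0: +32.0°C (high = +83.0°C, crit = +93.0°C)' | temps_per_line
```

With this layout the Package and Core lines produce the same three columns, so no field shift has to be accounted for.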

3 The final script (the big finish)

#!/bin/bash
# Purpose: check the core temperatures of each multi-core CPU on the motherboard and log overheating
# Dependency: the lm_sensors tool; install it with the command below (switch YUM mirror if it fails)
# Install command: yum install -y lm_sensors
# Notes:
# 1. The chip names passed to sensors (coretemp-isa-0000/0001) depend on the actual host
# 2. lm_sensors currently cannot read hardware temperatures inside a VMware virtual machine
LOG=/var/log/cpu_message
HIGH=83.0   # "high" threshold reported by sensors
CRIT=93.0   # "crit" threshold reported by sensors
date=$(date "+%Y-%m-%d %H:%M:%S")
echo "$date"

# [ ... ] cannot compare decimals, so let awk do it: succeeds when $1 > $2
gt() { awk -v a="$1" -v b="$2" 'BEGIN { exit !(a + 0 > b + 0) }'; }

check_chip() {
    local chip=$1 temps t
    # $3 is the current temperature on the Core lines; the empty $3 of the
    # Package line disappears when the list is word-split below
    temps=$(sensors "$chip" | tail -n +3 | tr -s " " | awk -F '[°C+]' '{print $3}')
    echo $temps   # unquoted on purpose: prints all readings on one line
    for t in $temps; do
        echo "$t"
        # Test the critical threshold first; otherwise the high branch would always win
        if gt "$t" "$CRIT"; then
            echo "$date WARNING!!!!!! $chip CPU temperature is $t°C, above the critical limit; the CPU may be damaged, investigate immediately" >> "$LOG"
        elif gt "$t" "$HIGH"; then
            echo "$date WARNING!!! $chip CPU temperature is $t°C, please investigate promptly" >> "$LOG"
        else
            echo "CPU is running normally"
        fi
    done
}

check_chip coretemp-isa-0000
check_chip coretemp-isa-0001
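
One detail of the threshold check is worth isolating. The shell's [ ... ] compares integers or strings, not decimals, so readings such as 32.0 need a tool that understands floating point; and the critical limit has to be tested before the high limit, since any value above 93.0 is also above 83.0. The sketch below shows both points using awk. The function name classify_temp and its argument order are hypothetical, chosen only for this illustration.

```shell
#!/bin/bash
# classify_temp TEMP HIGH CRIT (hypothetical helper, not part of lm_sensors)
# Prints "crit", "high" or "ok" based on numeric comparison in awk.
classify_temp() {
    awk -v t="$1" -v high="$2" -v crit="$3" 'BEGIN {
        t += 0; high += 0; crit += 0      # force numeric comparison
        if (t > crit) { print "crit" }
        else if (t > high) { print "high" }
        else { print "ok" }
    }'
}

classify_temp 32.0 83.0 93.0
classify_temp 85.5 83.0 93.0
classify_temp 100.0 83.0 93.0
```

A string comparison would rank 100.0 below 83.0 and 9.5 above it, which is why the numeric comparison matters here.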

Final output
[Figure: screenshot of the script's run result]
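I have tested all the code in this post myself. If you found it helpful, please like, follow, and bookmark to show your support. Feel free to get in touch if you want to discuss anything. Thank you all for your support!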