kipmi0 causing 100% CPU usage

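Recently, while running some tests, I noticed in ganglia that several machines showed a fairly high load even when no jobs were running. Running top on those machines showed the kipmi0 process permanently pinning one core, which is really annoying. I googled it; here is some background on kipmi0: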

The kipmi0 process may show increased CPU utilization in Linux. The utilization may increase up to 100% when the IPMI (Intelligent Platform Management Interface) device, such as a BMC (Baseboard Management Controller) or IMM (Integrated Management Controller) is busy or non-responsive.

Fix

No fix required. You should ignore increased CPU utilization as it has no impact on actual system performance.

If the sight of it bothers you, a simple workaround is to cap how long kipmid busy-waits:

echo 100 > /sys/module/ipmi_si/parameters/kipmid_max_busy_us
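
The one-liner above can be wrapped in a small helper that validates the value and prints the setting before and after the change. This is only a sketch: the function name set_kipmid_max_busy_us is made up for illustration, and the optional second argument exists only so the function can be tried against a scratch file instead of the real sysfs parameter.

```shell
#!/bin/sh
# Sketch only: set_kipmid_max_busy_us is a hypothetical helper name.
# $1 = microseconds, $2 = parameter file (defaults to the sysfs path from the post).
set_kipmid_max_busy_us() {
    usec="$1"
    param="${2:-/sys/module/ipmi_si/parameters/kipmid_max_busy_us}"

    # Accept only a non-negative integer number of microseconds.
    case "$usec" in
        ''|*[!0-9]*)
            echo "expected a non-negative integer, got '$usec'" >&2
            return 2
            ;;
    esac

    # The file is only present when ipmi_si is loaded, and writing needs root.
    [ -w "$param" ] || { echo "cannot write $param" >&2; return 1; }

    echo "previous value: $(cat "$param")"
    echo "$usec" > "$param"
    echo "new value: $(cat "$param")"
}
```

Calling set_kipmid_max_busy_us 100 as root is equivalent to the echo above. A value written through sysfs does not survive a reboot; when ipmi_si is built as a module, a line such as "options ipmi_si kipmid_max_busy_us=100" in a file under /etc/modprobe.d/ should make it persistent (the file name is up to you).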
Update:
I asked our company's ops team. Their answer is below; please evaluate it for your own environment:

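We have run into this problem recently as well, on newer RedHat/CentOS releases (e.g. 6.4/6.5). Red Hat's website also documents it: the interaction between the operating system driver and the BMC goes wrong, which leaves the kipmi0 process using 100% CPU and makes ipmitool operations unresponsive.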

In this case, there is a problem in the interaction between the driver and the hardware/firmware which leads the driver to believe that an operation is still in progress, causing the high CPU load to continue until the system is rebooted.

The kipmi0 process runs at a very low priority; when an application needs CPU resources, kipmi0 gives them up.
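
One way to check the priority claim on a given machine is to list the kipmi kernel threads together with their nice value and CPU share. The sketch below assumes procps ps; filter_kipmi and show_kipmi are hypothetical helper names, not part of any tool.

```shell
#!/bin/sh
# Sketch only: both function names are made up for illustration.

# Reads "PID NI %CPU COMMAND" rows on stdin and keeps the kipmiN threads.
filter_kipmi() {
    awk '$4 ~ /^kipmi[0-9]+$/ { printf "%s nice=%s cpu=%s%%\n", $4, $2, $3 }'
}

# Note: ps reports CPU averaged over the thread's lifetime;
# top shows the instantaneous figure.
show_kipmi() {
    ps -eo pid,ni,pcpu,comm | filter_kipmi
}
```

Running show_kipmi prints one line per kipmi thread; a high positive nice value means the scheduler will prefer almost any other runnable task over it.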

 

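For the current situation, a temporary measure is to hot-plug the driver with the following commands to recover (the interface description that should follow the comma appears to have been left out of the reply and is machine-specific):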

echo "remove," > /sys/module/ipmi_si/parameters/hotmod

echo "add," > /sys/module/ipmi_si/parameters/hotmod
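
The remove/add sequence can be sketched as a function that takes the interface description as an argument, since the reply does not give it. The kernel's IPMI documentation describes hotmod entries of the form add|remove,kcs|bt|smic,mem|i/o,<address>; the spec in the usage comment (kcs,i/o,0xca2) is only an illustrative assumption and has to be replaced with the values for your hardware. ipmi_si_rehotplug and IPMI_REPLUG_DELAY are hypothetical names.

```shell
#!/bin/sh
# Sketch only: ipmi_si_rehotplug and IPMI_REPLUG_DELAY are made-up names.
# $1 = interface spec, e.g. "kcs,i/o,0xca2" (assumption, machine-specific)
# $2 = hotmod file (defaults to the sysfs path; overridable for dry runs)
ipmi_si_rehotplug() {
    spec="$1"
    hotmod="${2:-/sys/module/ipmi_si/parameters/hotmod}"

    [ -n "$spec" ] || { echo "usage: ipmi_si_rehotplug <spec> [hotmod-file]" >&2; return 2; }
    [ -w "$hotmod" ] || { echo "cannot write $hotmod" >&2; return 1; }

    # Detach the interface, pause briefly, then attach it again.
    echo "remove,$spec" > "$hotmod" || return 1
    sleep "${IPMI_REPLUG_DELAY:-1}"
    echo "add,$spec" > "$hotmod"
}

# Usage (as root, with the spec for your own machine):
#   ipmi_si_rehotplug "kcs,i/o,0xca2"
```

Taking the spec as a required argument keeps the function from guessing at hardware details, and the one-second pause is an arbitrary choice rather than a documented requirement.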

 

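Final solution: the firmware provided this time includes a fix for the problem, and upgrading to it resolves the issue. PS: the interface has to be recovered first before the upgrade can be done. https://access.redhat.com/solutions/21322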