Creating a boot-time service that sets up the GPU /dev/nvidia device nodes

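After a GPU machine reboots, the GPU device nodes under /dev are not created automatically. A script therefore has to run at boot to create them, so that Docker can mount them later. The script, gpu-service, is as follows: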

#!/bin/bash

# Load the NVIDIA kernel module; stop if it cannot be loaded.
/sbin/modprobe nvidia || exit 1

# Count the number of NVIDIA controllers found.
NVDEVS=$(lspci | grep -i NVIDIA)
N3D=$(echo "$NVDEVS" | grep -c "3D controller")
NVGA=$(echo "$NVDEVS" | grep -c "VGA compatible controller")

# Create one node per GPU (major 195, minor = GPU index).
# Nodes that already exist are skipped so that mknod does not fail.
N=$((N3D + NVGA - 1))
for i in $(seq 0 "$N"); do
  [ -e "/dev/nvidia$i" ] || mknod -m 666 "/dev/nvidia$i" c 195 "$i"
done

# Control device.
[ -e /dev/nvidiactl ] || mknod -m 666 /dev/nvidiactl c 195 255

# Load the nvidia-uvm module; stop if it cannot be loaded.
/sbin/modprobe nvidia-uvm || exit 1

# Find out the major device number used by the nvidia-uvm driver.
D=$(awk '$2 == "nvidia-uvm" {print $1}' /proc/devices)

[ -e /dev/nvidia-uvm ] || mknod -m 666 /dev/nvidia-uvm c "$D" 0
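
The device-count arithmetic in the script can be checked without touching /dev. The following is a minimal sketch and not part of the original script: count_nvidia_gpus is a hypothetical helper name, and the sample lspci lines are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: count NVIDIA GPUs from lspci-style text on stdin.
# It counts the same two device classes that gpu-service looks for.
count_nvidia_gpus() {
  grep -i 'nvidia' | grep -c -E '3D controller|VGA compatible controller'
}

# Illustrative sample input (not real output from any machine).
sample='00:02.0 VGA compatible controller: Intel Corporation Device 1234
03:00.0 3D controller: NVIDIA Corporation Device 5678
04:00.0 3D controller: NVIDIA Corporation Device 5678
04:00.1 Audio device: NVIDIA Corporation Device 9abc'

count=$(printf '%s\n' "$sample" | count_nvidia_gpus)
echo "GPUs: $count"

# The minor numbers to create run from 0 to count-1.
for i in $(seq 0 $((count - 1))); do
  echo "would create /dev/nvidia$i (major 195, minor $i)"
done
```

On a real machine the helper would read from lspci instead of the sample text, for example `lspci | count_nvidia_gpus`.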

A new systemd service needs to be added, as follows:

touch /etc/systemd/system/gpu.service
chmod 664 /etc/systemd/system/gpu.service

Edit the gpu.service file so that it contains:

[Unit]
Description=auto run gpu construct
[Service]
Type=simple
ExecStart=/usr/sbin/gpu-service
[Install]
WantedBy=multi-user.target
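
Because gpu-service runs once and then exits, a oneshot unit may describe it more accurately than Type=simple. The sketch below writes such a variant to a scratch file for review. The function name write_gpu_unit is hypothetical, and the Before=docker.service ordering assumes that Docker runs as a systemd unit named docker.service, which the original setup does not state.

```shell
#!/bin/bash
# Hypothetical helper: write a oneshot variant of gpu.service to the given path.
write_gpu_unit() {
  local target="$1"
  cat > "$target" <<'EOF'
[Unit]
Description=Create NVIDIA device nodes at boot
# Assumption: Docker is managed by systemd as docker.service.
Before=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/sbin/gpu-service

[Install]
WantedBy=multi-user.target
EOF
}

# Write to a scratch file first so the result can be reviewed before installing.
unit_file="$(mktemp)"
write_gpu_unit "$unit_file"
cat "$unit_file"
```

If the variant looks right, it could be copied over /etc/systemd/system/gpu.service in place of the version above.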

Copy the gpu-service script to /usr/sbin/gpu-service:

mv gpu-service /usr/sbin/gpu-service
chmod 554 /usr/sbin/gpu-service

Use systemctl to make gpu-service start automatically at boot:

systemctl daemon-reload
systemctl enable gpu.service
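
After the next reboot it is worth confirming that the device nodes were actually created. The following is a minimal sketch under stated assumptions: missing_nvidia_nodes is a hypothetical helper, it only checks that each path exists (not that it is a character device), and the GPU count of 1 in the example is an assumption to adjust per machine.

```shell
#!/bin/bash
# Hypothetical helper: print the expected device nodes that are missing.
# $1 = device directory (normally /dev), $2 = number of GPUs.
missing_nvidia_nodes() {
  local devdir="$1" gpus="$2" i name
  for ((i = 0; i < gpus; i++)); do
    [ -e "$devdir/nvidia$i" ] || echo "nvidia$i"
  done
  for name in nvidiactl nvidia-uvm; do
    [ -e "$devdir/$name" ] || echo "$name"
  done
}

# Example: one GPU expected under /dev (assumed count).
missing="$(missing_nvidia_nodes /dev 1)"
if [ -z "$missing" ]; then
  echo "all expected NVIDIA device nodes exist"
else
  echo "missing:" $missing
fi
```

If nodes are missing, `systemctl status gpu.service` shows whether the script ran and with what exit status.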

jiang yu
