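1. Docker storage (image cache) location
By default Docker stores its data under /var/lib/docker/. If the system disk is small, this can easily fill it up. To change the storage location, modify the startup unit file. On CentOS, edit /usr/lib/systemd/system/docker.service and add the -g option to the ExecStart command, as follows: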
ExecStart=/usr/bin/dockerd-current \
          -g /mnt/disk1/docker_home
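An edited unit file only takes effect after systemd reloads it and the daemon restarts. The following is a minimal sketch of that step plus two checks; the function names are hypothetical helpers, not part of Docker, and the commands that touch systemd or Docker need root on the host.

```shell
#!/bin/bash
# Reload systemd and restart Docker so the edited docker.service is used.
reload_docker() {
    systemctl daemon-reload && systemctl restart docker
}

# Print the data directory the running daemon actually uses.
docker_root_dir() {
    docker info --format '{{.DockerRootDir}}'
}

# Print the used-space percentage of the filesystem that holds the path in $1.
usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

After running reload_docker, docker_root_dir should print /mnt/disk1/docker_home, and usage_percent /mnt/disk1 shows how full the new disk is. Newer Docker releases name this option --data-root; the -g form applies to the older dockerd-current shown here.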
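2. Opening the management port
For security reasons, port 2375 is no longer open by default. To use management tools such as docker-java, the port has to be opened. The method is the same as above: edit /usr/lib/systemd/system/docker.service and add the -H options to the daemon command line: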
--userland-proxy-path=/usr/libexec/docker/docker-proxy-current \
-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
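Once the daemon has been restarted, a quick way to confirm that the TCP port answers is to query the remote API's /_ping endpoint. This is a sketch that assumes curl is installed; docker_api_url and docker_api_alive are hypothetical helper names, and the default host and port are only illustrative.

```shell
#!/bin/bash
# Build the base URL of the Docker remote API for a host and port.
docker_api_url() {
    local host="${1:-127.0.0.1}" port="${2:-2375}"
    printf 'http://%s:%s' "$host" "$port"
}

# Return 0 if the daemon answers on the TCP port, non-zero otherwise.
docker_api_alive() {
    curl --silent --fail --max-time 3 "$(docker_api_url "$@")/_ping" > /dev/null
}
```

For example, docker_api_alive 127.0.0.1 2375 && echo "remote API is up". Because this port is unauthenticated, binding -H to a specific internal address instead of 0.0.0.0 is worth considering when only some machines need access.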
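3. Environment variables
Environment variables that are baked in through docker commit or passed in at run time tend to cause problems; they should be set in the Dockerfile instead. TensorFlow, for example, needs HADOOP_HDFS_HOME, LD_LIBRARY_PATH, and CLASSPATH to be configured in order to read data from Hadoop, but passing them with -e did not work.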
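To see which of these variables actually reach a process inside the container, a small check can be run there. This is a sketch only; missing_vars is a hypothetical helper and relies on printenv being available in the image.

```shell
#!/bin/bash
# Print the name of every listed variable that is unset or empty
# in the environment of the current process.
missing_vars() {
    local name
    for name in "$@"; do
        if [ -z "$(printenv "$name")" ]; then
            echo "$name"
        fi
    done
}
```

Inside the container, missing_vars HADOOP_HDFS_HOME LD_LIBRARY_PATH CLASSPATH prints nothing when all three are set.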
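4. NVIDIA driver installation
To use a GPU inside Docker, the same CUDA and cuDNN versions should be installed both on the host and inside the Docker container. In addition, the GPU devices must be mapped into the container when it is started. The devices to map are: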
--device /dev/nvidia0:/dev/nvidia0 --device /dev/nvidiactl:/dev/nvidiactl --device /dev/nvidia-uvm:/dev/nvidia-uvm
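On a machine with more than one GPU, each /dev/nvidiaN node needs its own --device flag. The sketch below generates the flag list for a given GPU count; gpu_device_flags is a hypothetical helper, and the image name in the usage note is made up.

```shell
#!/bin/bash
# Print --device flags for GPUs 0..N-1 (N is $1), followed by the
# control device and the unified-memory device.
gpu_device_flags() {
    local count="$1" i=0
    while [ "$i" -lt "$count" ]; do
        printf -- '--device /dev/nvidia%s:/dev/nvidia%s ' "$i" "$i"
        i=$((i + 1))
    done
    printf -- '--device /dev/nvidiactl:/dev/nvidiactl --device /dev/nvidia-uvm:/dev/nvidia-uvm\n'
}
```

It could be used as docker run -it $(gpu_device_flags 2) my-gpu-image bash, where the unquoted substitution is intentional so that the flags are split into separate arguments.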
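There is one catch: after a reboot, the three device nodes (/dev/nvidia0, /dev/nvidiactl, and /dev/nvidia-uvm) are not created by default. The following script loads the kernel modules and creates the nodes at startup.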
#!/bin/bash
# Load the NVIDIA kernel modules and create their device nodes.

/sbin/modprobe nvidia
if [ "$?" -eq 0 ]; then
    # Count the number of NVIDIA controllers found.
    NVDEVS=$(lspci | grep -i NVIDIA)
    N3D=$(echo "$NVDEVS" | grep "3D controller" | wc -l)
    NVGA=$(echo "$NVDEVS" | grep "VGA compatible controller" | wc -l)
    N=$(expr $N3D + $NVGA - 1)
    # One node per GPU; 195 is the major number of the nvidia driver.
    for i in $(seq 0 $N); do
        mknod -m 666 /dev/nvidia$i c 195 $i
    done
    mknod -m 666 /dev/nvidiactl c 195 255
else
    exit 1
fi

/sbin/modprobe nvidia-uvm
if [ "$?" -eq 0 ]; then
    # Find out the major device number used by the nvidia-uvm driver
    D=$(grep nvidia-uvm /proc/devices | awk '{print $1}')
    mknod -m 666 /dev/nvidia-uvm c $D 0
else
    exit 1
fi
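Before starting a GPU container it can help to confirm that the nodes really exist, since docker run fails when a --device path is missing. A minimal sketch follows; missing_devices is a hypothetical helper.

```shell
#!/bin/bash
# Print every given path that is not an existing character device.
missing_devices() {
    local dev
    for dev in "$@"; do
        [ -c "$dev" ] || echo "$dev"
    done
}
```

For instance, missing_devices /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm prints nothing once the script above has run successfully.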
5. The complete Dockerfile
FROM centos:7.3.1611
RUN yum update -y
RUN yum install -y java-1.8.0-openjdk-devel.x86_64
RUN yum install -y vim
RUN yum install -y wget
RUN yum -y install epel-release
RUN yum install -y python-pip
RUN yum -y install python-devel
RUN pip install --upgrade pip
ADD ./hadoop-2.7.2-1.2.8.tar.gz /usr/local
RUN mkdir /install
COPY ./cuda-repo-rhel7-8-0-local-ga2-8.0.61-1.x86_64-rpm /install
COPY ./cuda-repo-rhel7-8-0-local-cublas-performance-update-8.0.61-1.x86_64-rpm /install
RUN rpm -i /install/cuda-repo-rhel7-8-0-local-ga2-8.0.61-1.x86_64-rpm
RUN yum -y install cuda
RUN rpm -i /install/cuda-repo-rhel7-8-0-local-cublas-performance-update-8.0.61-1.x86_64-rpm
RUN yum -y install cuda-cublas-8-0
ADD ./cudnn-8.0-linux-x64-v6.0.tar.gz /install
RUN cp /install/cuda/include/cudnn.h /usr/local/cuda/include/
RUN cp -d /install/cuda/lib64/libcudnn* /usr/local/cuda/lib64/
RUN chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
ENV JAVA_HOME /etc/alternatives/java_sdk_1.8.0
ENV HADOOP_HOME /usr/local/hadoop-2.7.2-1.2.8
ENV HADOOP_HDFS_HOME $HADOOP_HOME
ENV LD_LIBRARY_PATH /usr/local/cuda/lib64:${JAVA_HOME}/jre/lib/amd64/server:$LD_LIBRARY_PATH
ENV PATH $JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH
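The ADD and COPY lines above expect four archives to sit next to the Dockerfile, and the build stops if any of them is absent. The sketch below checks the build context first; the function names and the image tag tf-gpu-hadoop are made up for illustration.

```shell
#!/bin/bash
# Files that the Dockerfile's ADD/COPY lines expect in the build context.
REQUIRED_FILES="hadoop-2.7.2-1.2.8.tar.gz
cuda-repo-rhel7-8-0-local-ga2-8.0.61-1.x86_64-rpm
cuda-repo-rhel7-8-0-local-cublas-performance-update-8.0.61-1.x86_64-rpm
cudnn-8.0-linux-x64-v6.0.tar.gz"

# Print each required file that is missing from the directory in $1.
missing_context_files() {
    local dir="${1:-.}" f
    for f in $REQUIRED_FILES; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}

# Build the image only when the context is complete.
# The tag tf-gpu-hadoop is a hypothetical name.
build_image() {
    local dir="${1:-.}" missing
    missing=$(missing_context_files "$dir")
    if [ -n "$missing" ]; then
        echo "missing files in $dir:" >&2
        echo "$missing" >&2
        return 1
    fi
    docker build -t tf-gpu-hadoop "$dir"
}
```

Running build_image . from the directory that holds the Dockerfile either lists what is missing or starts the build.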