Recently I discovered that, besides the ufoym/deepo image, the NVIDIA NGC project also provides PyTorch and TensorFlow images; for details, see:

In fact, images such as ufoym/deepo and tensorflow/tensorflow are themselves built on top of the NGC images.

If you don't feel like reading all that, just pull and run directly:

docker run --gpus all -it --rm -v local_dir:container_dir nvcr.io/nvidia/pytorch:xx.xx-py3 command
  • -it runs the container in interactive mode
  • --rm deletes the container when it exits
  • -v mounts a host directory into the container
  • local_dir is the directory or file on your host system (absolute path) that you want to access from inside the container. For example, local_dir in the following mount is /home/jsmith/data/mnist:

-v /home/jsmith/data/mnist:/data/mnist

  If you are inside the container and run, for example, `ls /data/mnist`, you will see the same files as if you had run `ls /home/jsmith/data/mnist` from outside the container.
  • container_dir is the target directory inside the container. For example, /data/mnist is the target directory in the example:
  -v /home/jsmith/data/mnist:/data/mnist
  • xx.xx is the container version, for example 20.01.
  • command is the command you want to run in the image.

  • Note: PyTorch uses shared memory to share data between processes. For example, if you use Torch multiprocessing for multi-threaded data loaders, the default shared memory segment size that the container runs with may not be enough. Therefore, you should increase the shared memory size by issuing either --ipc=host or --shm-size=<requested memory size> in the docker run command.
See /workspace/ inside the container for information on customizing your PyTorch image.
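
Putting the pieces together, a minimal sketch (the 20.01-py3 tag and the mnist mount are just the examples used above; --shm-size anticipates the shared-memory note):

docker run --gpus all -it --rm --shm-size=8g \
  -v /home/jsmith/data/mnist:/data/mnist \
  nvcr.io/nvidia/pytorch:20.01-py3 bash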

Go to the ImageNet website and download the ILSVRC 2012 dataset (you need to register an account with a university email address and agree to the terms of use).



cd /path/to/your/dataset/train
tar -xf ILSVRC2012_img_train.tar && rm -f ILSVRC2012_img_train.tar
# Recursively extracts everything; see the sketch after this block for a way to avoid deleting the source archives
# Reminder: always read what a script does before running it! Otherwise you will have nowhere to cry when it goes wrong
find . -name "*.tar" | while read NAME ; do mkdir -p "${NAME%.tar}"; tar -xvf "${NAME}" -C "${NAME%.tar}"; rm -f "${NAME}"; done
cd ..
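
One possible answer to the comment above, as a sketch: let a shell glob (expanded once, before the loop starts) pick up the class tarballs and simply skip the rm. The trade-off is disk space: keeping the archives roughly doubles the footprint, and the train archive alone is on the order of 140 GB.

cd /path/to/your/dataset/train
# Extract each class tarball into a directory of the same name, keeping the archive.
for NAME in *.tar; do
  mkdir -p "${NAME%.tar}"
  tar -xf "${NAME}" -C "${NAME%.tar}"
done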


cd /path/to/your/dataset/val
tar -xvf ILSVRC2012_img_val.tar
# wget -qO- | bash
# or
# wget -qO- | bash
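
The commented-out one-liners above fetch a small helper script that sorts the 50,000 validation images into per-class directories so the layout matches train/. A hedged sketch of what such a script typically does, assuming a hypothetical mapping file val_labels.txt with lines like "ILSVRC2012_val_00000001.JPEG n01751748":

cd /path/to/your/dataset/val
# Move each validation image into the directory named after its synset ID.
while read -r IMG CLASS; do
  mkdir -p "${CLASS}"
  mv "${IMG}" "${CLASS}/"
done < val_labels.txt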

How do you test your GPU in a real deep learning workload?
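
One quick smoke test, as a sketch (the inline Python is illustrative, not an official benchmark; the 20.01-py3 tag is the example version from above): run a large matrix multiplication on the GPU inside the container.

docker run --gpus all --rm nvcr.io/nvidia/pytorch:20.01-py3 \
  python -c "import torch; x = torch.randn(8192, 8192, device='cuda'); print((x @ x).sum().item())"

A real training run with multi-worker data loading is a stricter test, and that is exactly what surfaced the error below.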


DataLoader worker (pid 8639) is killed by signal: Bus error. It is possible that dataloader's workers are out of shared memory. Please try to raise your shared memory limit.


Okay. I think I solved it. Looks like the shared memory of the docker container wasn't set high enough. Setting a higher amount by adding --shm-size 8G to the docker run command seems to be the trick, as mentioned here. Let me fully test it; if it's solved, I'll close the issue.


If you're using docker-compose, you can set your_service.shm_size if you want your container to use that /dev/shm size when running or when building.


version: '3.5'
services:
  your_service:
    build:
      context: .
      shm_size: '2gb'  # <-- this sets the size when BUILDING
    shm_size: '2gb'    # <-- when RUNNING
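
To apply it (assuming the classic docker-compose CLI; newer Docker installations expose the same commands as `docker compose`):

docker-compose up --build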

Last updated: 2023-01-31