I'm trying to use CuPy in a Docker container. I use two containers: one for CUDA and cuDNN, and the other for CuPy.
I tried this code.
import cupy as cp
cupy_array = cp.array([1, 2, 3])
cupy_result = cupy_array + 5
print("CuPy Result:", cupy_result)
The full error log is:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "cupy/_core/core.pyx", line 1191, in cupy._core.core.ndarray.__add__
  File "cupy/_core/core.pyx", line 1591, in cupy._core.core.ndarray.__array_ufunc__
  File "cupy/_core/_kernel.pyx", line 1292, in cupy._core._kernel.ufunc.__call__
  File "cupy/_core/_kernel.pyx", line 1319, in cupy._core._kernel.ufunc._get_ufunc_kernel
  File "cupy/_core/_kernel.pyx", line 1025, in cupy._core._kernel._get_ufunc_kernel
  File "cupy/_core/_kernel.pyx", line 72, in cupy._core._kernel._get_simple_elementwise_kernel
  File "cupy/_core/core.pyx", line 2141, in cupy._core.core.compile_with_cache
  File "/usr/local/lib/python3.8/dist-packages/cupy/cuda/compiler.py", line 492, in _compile_module_with_cache
    return _compile_with_cache_cuda(
  File "/usr/local/lib/python3.8/dist-packages/cupy/cuda/compiler.py", line 614, in _compile_with_cache_cuda
    mod.load(cubin)
  File "cupy/cuda/function.pyx", line 264, in cupy.cuda.function.Module.load
  File "cupy/cuda/function.pyx", line 266, in cupy.cuda.function.Module.load
  File "cupy_backends/cuda/api/driver.pyx", line 210, in cupy_backends.cuda.api.driver.moduleLoadData
  File "cupy_backends/cuda/api/driver.pyx", line 60, in cupy_backends.cuda.api.driver.check_status
cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_INVALID_SOURCE: device kernel image is invalid
The result of nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4080        Off | 00000000:01:00.0  On |                  N/A |
|  0%   32C    P8               6W / 320W |    483MiB / 16376MiB |      4%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+
The result of nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0
The result of cat /usr/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 4
#define CUDNN_PATCHLEVEL 0
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#endif /* CUDNN_VERSION_H */
The result of pip3 freeze | grep cupy is cupy-cuda116==10.6.0
All of the results above are from the Docker container for CuPy.
I started the Docker container for CUDA and cuDNN with:
sudo docker run --name cuda11.6.1-cudnn8 --gpus all --runtime=nvidia -it \
  --privileged --env="DISPLAY=:0:0" -v=/tmp/.X11-unix:/tmp/.X11-unix:ro \
  -v=/home/youngjoo/Documents/Elevation_ws:/home/youngjoo/Documents/Elevation_ws \
  -v=/dev:/dev -w=/home/youngjoo/Documents/Elevation_ws \
  nvidia/cuda:11.6.1-cudnn8-devel-ubuntu20.04
My OS is Ubuntu 20.04.
Docker version is 24.0.7, build afdd53b.
How can I resolve this?
I deleted all docker containers and restarted but the result was the same.
The error you are encountering, CUDA_ERROR_INVALID_SOURCE: device kernel image is invalid, means the CUDA driver rejected the kernel that CuPy compiled at runtime. This usually points to a mismatch between the CuPy build, the CUDA toolkit, and the GPU. Here are a few steps you can take to resolve the issue:
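Check CuPy Version Compatibility: You have cupy-cuda116==10.6.0 installed, which targets the CUDA 11.6 toolkit, while your driver reports CUDA 12.2. The CuPy wheel has to match the toolkit inside the container. If you move the container to a CUDA 12 toolkit, install the CuPy wheel for CUDA 12.x, which is published as cupy-cuda12x:
pip install cupy-cuda12x
Reinstall CuPy: Uninstall the existing wheel first, then install the new one:
pip uninstall cupy-cuda116
pip install cupy-cuda12x
This ensures a fresh installation.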
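As a rough illustration of matching the wheel to the toolkit, the sketch below reads the release number out of `nvcc --version` output and maps it to a wheel name. The helper names `parse_nvcc_release` and `suggest_cupy_wheel` are hypothetical, and the mapping assumes the wheel naming used by recent CuPy releases (`cupy-cuda11x` for CUDA 11.2 and later, `cupy-cuda12x` for CUDA 12.x); older CuPy versions used per-release names such as `cupy-cuda116`.

```python
import re
from typing import Tuple


def parse_nvcc_release(nvcc_output: str) -> Tuple[int, int]:
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def suggest_cupy_wheel(major: int, minor: int) -> str:
    """Hypothetical helper: pick a wheel name for a recent CuPy release.

    Assumes the current naming scheme; check the CuPy installation guide
    for the CuPy version you actually intend to install.
    """
    if major == 12:
        return "cupy-cuda12x"
    if major == 11 and minor >= 2:
        return "cupy-cuda11x"
    raise ValueError(f"no wheel mapping known for CUDA {major}.{minor}")


# One line of the nvcc output quoted in the question.
NVCC_OUTPUT = "Cuda compilation tools, release 11.6, V11.6.124"

major, minor = parse_nvcc_release(NVCC_OUTPUT)
print(f"toolkit {major}.{minor} -> pip install {suggest_cupy_wheel(major, minor)}")
```

The point of the sketch is that the wheel follows the toolkit version reported by nvcc inside the container, not the "CUDA Version" field printed by nvidia-smi.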
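Check GPU Drivers: Ensure that your GPU drivers are up to date. nvidia-smi reports driver 535.129.03 with CUDA Version 12.2; that figure is the highest CUDA version the driver supports, not the toolkit installed in the container, so the driver itself is new enough for both CUDA 11.x and CUDA 12.x toolkits up to 12.2.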
Check CUDA Toolkit Installation: Make sure that the CUDA toolkit is correctly installed in your Docker container. You can check the version of the CUDA toolkit by running:
nvcc --version
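Ensure that it matches the CUDA version your CuPy wheel targets (cupy-cuda116 expects the CUDA 11.6 toolkit, which is what your container reports). Also confirm that the toolkit supports your GPU: the RTX 4080 is an Ada Lovelace card with compute capability 8.9, which CUDA supports from release 11.8 onward, so an 11.6 toolkit cannot generate code for it. Moving to a base image with CUDA 11.8 or newer, together with a matching CuPy wheel, is the likely fix.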
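The toolkit check can also be done from Python. The sketch below is a minimal example under stated assumptions: `MIN_TOOLKIT` is a small hand-written table (not a CuPy or CUDA API) listing the first toolkit release that can target a few compute capabilities, and `decode_cuda_version`, `toolkit_supports`, and `check_with_cupy` are hypothetical helper names. Only `check_with_cupy` touches CuPy, through `cupy.cuda.Device(0).compute_capability` and `cupy.cuda.runtime.runtimeGetVersion()`, so it needs to run inside the container with a visible GPU.

```python
from typing import Tuple

# First CUDA toolkit release able to target each compute capability.
# Illustrative subset written by hand; extend it from NVIDIA's release notes.
MIN_TOOLKIT = {
    "75": (10, 0),  # Turing
    "80": (11, 0),  # Ampere (A100)
    "86": (11, 1),  # Ampere (RTX 30 series)
    "89": (11, 8),  # Ada Lovelace (RTX 40 series)
}


def decode_cuda_version(version: int) -> Tuple[int, int]:
    """Convert CUDA's integer version (e.g. 11060) into (major, minor)."""
    return version // 1000, (version % 1000) // 10


def toolkit_supports(compute_capability: str, toolkit: Tuple[int, int]) -> bool:
    """Return True if the toolkit is new enough for the given GPU."""
    return toolkit >= MIN_TOOLKIT[compute_capability]


def check_with_cupy() -> bool:
    """Query the live GPU and CUDA runtime through CuPy (requires a GPU)."""
    import cupy as cp

    cc = cp.cuda.Device(0).compute_capability  # a string such as "89"
    toolkit = decode_cuda_version(cp.cuda.runtime.runtimeGetVersion())
    return toolkit_supports(cc, toolkit)


# Offline check using the values from the question: RTX 4080 with CUDA 11.6.
print(toolkit_supports("89", decode_cuda_version(11060)))
# Same GPU with a CUDA 11.8 toolkit.
print(toolkit_supports("89", decode_cuda_version(11080)))
```

Calling `check_with_cupy()` in the CuPy container gives the same answer for the real device; a `False` result means the base image needs a newer toolkit before reinstalling CuPy will help.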
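Check Docker Configuration: Make sure the container is started with GPU access, for example with the --gpus all flag. Your docker run command already includes it.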
After performing these steps, try running your CuPy code again. If the issue persists, there might be an underlying compatibility or configuration problem that requires further investigation.