The example code does not run.

#8
by ljm9667 - opened

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

/pytorch/aten/src/ATen/native/cuda/TensorCompare.cu:110: _assert_async_cuda_kernel: block: [0,0,0], thread: [0,0,0] Assertion probability tensor contains either inf, nan or element < 0 failed.

I get the error above. What could be causing it?

Bllossom org

Please install the latest transformers package and a torch version that matches your machine, then try running it again.

pip install -U transformers
https://pytorch.org/get-started/previous-versions/

Could you share the version requirements?

Bllossom org

There is no separate requirements file; please try the latest versions.
Below is the pip list from our environment, where the example runs correctly:

Package                  Version
------------------------ ------------
accelerate               1.6.0
certifi                  2025.1.31
charset-normalizer       3.4.1
filelock                 3.13.1
fsspec                   2024.6.1
huggingface-hub          0.30.2
idna                     3.10
Jinja2                   3.1.4
MarkupSafe               2.1.5
mpmath                   1.3.0
networkx                 3.3
numpy                    2.1.2
nvidia-cublas-cu12       12.4.5.8
nvidia-cuda-cupti-cu12   12.4.127
nvidia-cuda-nvrtc-cu12   12.4.127
nvidia-cuda-runtime-cu12 12.4.127
nvidia-cudnn-cu12        9.1.0.70
nvidia-cufft-cu12        11.2.1.3
nvidia-curand-cu12       10.3.5.147
nvidia-cusolver-cu12     11.6.1.9
nvidia-cusparse-cu12     12.3.1.170
nvidia-cusparselt-cu12   0.6.2
nvidia-nccl-cu12         2.21.5
nvidia-nvjitlink-cu12    12.4.127
nvidia-nvtx-cu12         12.4.127
packaging                24.2
pillow                   11.0.0
pip                      22.0.2
psutil                   7.0.0
PyYAML                   6.0.2
regex                    2024.11.6
requests                 2.32.3
safetensors              0.5.3
setuptools               59.6.0
sympy                    1.13.1
tokenizers               0.21.1
torch                    2.6.0+cu124
torchaudio               2.6.0+cu124
torchvision              0.21.0+cu124
tqdm                     4.67.1
transformers             4.51.3
triton                   3.2.0
typing_extensions        4.12.2
urllib3                  2.4.0
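
To compare a local environment against the pip list above, something like the following could help. This is only a sketch: `REFERENCE` copies a handful of entries from that list, and `installed_version` and `diff_against_reference` are hypothetical helper names, not part of transformers or torch. It relies only on the standard-library `importlib.metadata` module.

```python
from importlib.metadata import PackageNotFoundError, version

# Subset of the reference versions copied from the pip list above.
REFERENCE = {
    "torch": "2.6.0+cu124",
    "transformers": "4.51.3",
    "tokenizers": "0.21.1",
    "accelerate": "1.6.0",
    "safetensors": "0.5.3",
}


def installed_version(name):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def diff_against_reference(reference, lookup=installed_version):
    """Map each package whose local version differs to (expected, found)."""
    mismatches = {}
    for name, expected in reference.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in diff_against_reference(REFERENCE).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means the listed packages match the reference versions. A mismatch is not necessarily the cause of the error, but it narrows down what differs.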

Name                     Version      Build              Channel
------------------------ ------------ ------------------ -----------
_libgcc_mutex            0.1          conda_forge        conda-forge
_openmp_mutex            4.5          2_gnu              conda-forge
accelerate               1.6.0        pypi_0             pypi
bzip2                    1.0.8        h4bc722e_7         conda-forge
ca-certificates          2025.1.31    hbcca054_0         conda-forge
certifi                  2025.1.31    pypi_0             pypi
charset-normalizer       3.4.1        pypi_0             pypi
filelock                 3.13.1       pypi_0             pypi
fsspec                   2024.6.1     pypi_0             pypi
huggingface-hub          0.30.2       pypi_0             pypi
idna                     3.10         pypi_0             pypi
jinja2                   3.1.4        pypi_0             pypi
ld_impl_linux-64         2.43         h712a8e2_4         conda-forge
libexpat                 2.7.0        h5888daf_0         conda-forge
libffi                   3.4.6        h2dba641_1         conda-forge
libgcc                   14.2.0       h767d61c_2         conda-forge
libgcc-ng                14.2.0       h69a702a_2         conda-forge
libgomp                  14.2.0       h767d61c_2         conda-forge
liblzma                  5.8.1        hb9d3cd8_0         conda-forge
libnsl                   2.0.1        hd590300_0         conda-forge
libsqlite                3.49.1       hee588c1_2         conda-forge
libuuid                  2.38.1       h0b41bf4_0         conda-forge
libxcrypt                4.4.36       hd590300_1         conda-forge
libzlib                  1.3.1        hb9d3cd8_2         conda-forge
markupsafe               2.1.5        pypi_0             pypi
mpmath                   1.3.0        pypi_0             pypi
ncurses                  6.5          h2d0b736_3         conda-forge
networkx                 3.3          pypi_0             pypi
numpy                    2.1.2        pypi_0             pypi
nvidia-cublas-cu12       12.4.5.8     pypi_0             pypi
nvidia-cuda-cupti-cu12   12.4.127     pypi_0             pypi
nvidia-cuda-nvrtc-cu12   12.4.127     pypi_0             pypi
nvidia-cuda-runtime-cu12 12.4.127     pypi_0             pypi
nvidia-cudnn-cu12        9.1.0.70     pypi_0             pypi
nvidia-cufft-cu12        11.2.1.3     pypi_0             pypi
nvidia-curand-cu12       10.3.5.147   pypi_0             pypi
nvidia-cusolver-cu12     11.6.1.9     pypi_0             pypi
nvidia-cusparse-cu12     12.3.1.170   pypi_0             pypi
nvidia-cusparselt-cu12   0.6.2        pypi_0             pypi
nvidia-nccl-cu12         2.21.5       pypi_0             pypi
nvidia-nvjitlink-cu12    12.4.127     pypi_0             pypi
nvidia-nvtx-cu12         12.4.127     pypi_0             pypi
openssl                  3.5.0        h7b32b05_0         conda-forge
packaging                24.2         pypi_0             pypi
pillow                   11.0.0       pypi_0             pypi
pip                      22.0.2       pypi_0             pypi
psutil                   7.0.0        pypi_0             pypi
python                   3.12.10      h9e4cc4f_0_cpython conda-forge
pyyaml                   6.0.2        pypi_0             pypi
readline                 8.2          h8c095d6_2         conda-forge
regex                    2024.11.6    pypi_0             pypi
requests                 2.32.3       pypi_0             pypi
safetensors              0.5.3        pypi_0             pypi
setuptools               59.6.0       pypi_0             pypi
sympy                    1.13.1       pypi_0             pypi
tk                       8.6.13       noxft_h4845f30_101 conda-forge
tokenizers               0.21.1       pypi_0             pypi
torch                    2.6.0+cu124  pypi_0             pypi
torchaudio               2.6.0+cu124  pypi_0             pypi
torchvision              0.21.0+cu124 pypi_0             pypi
tqdm                     4.67.1       pypi_0             pypi
transformers             4.51.3       pypi_0             pypi
triton                   3.2.0        pypi_0             pypi
typing-extensions        4.12.2       pypi_0             pypi
tzdata                   2025b        h78e105d_0         conda-forge
urllib3                  2.4.0        pypi_0             pypi
wheel                    0.45.1       pyhd8ed1ab_1       conda-forge

I installed everything as described above, but I still get the error below.

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

/pytorch/aten/src/ATen/native/cuda/TensorCompare.cu:110: _assert_async_cuda_kernel: block: [0,0,0], thread: [0,0,0] Assertion probability tensor contains either inf, nan or element < 0 failed.
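
The assertion text says the probability tensor used for sampling contained `inf`, `nan`, or a negative element, so one way to narrow this down might be to inspect the probabilities just before sampling. The sketch below is a hypothetical helper (`find_invalid_probs` is not a torch or transformers function) that works on a plain list of floats. With a torch tensor named `probs`, which is an assumed variable here, the values could be passed as `probs.flatten().tolist()`.

```python
import math


def find_invalid_probs(values):
    """Return (index, value) pairs that are NaN, infinite, or negative."""
    bad = []
    for i, v in enumerate(values):
        if math.isnan(v) or math.isinf(v) or v < 0:
            bad.append((i, v))
    return bad


if __name__ == "__main__":
    # Illustrative values only; real values would come from the model output.
    sample = [0.5, float("nan"), float("inf"), -0.1]
    for index, value in find_invalid_probs(sample):
        print(f"invalid probability at index {index}: {value}")
```

As the error message itself suggests, rerunning with `CUDA_LAUNCH_BLOCKING=1` set in the environment should make the stack trace point at the call that actually failed.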
