How can I tell whether my machine can run in CUDA + NVIDIA GPU mode?

I checked Device Manager on my machine and it appears to list an NVIDIA graphics card.
[screenshot: Device Manager showing the NVIDIA GPU]

I'm not sure, though: do I need to install CUDA myself?
And is that required before I can run in this mode?

  • First, confirm that the NVIDIA graphics driver is installed; if not, install it first.
  • Reasonably recent driver versions register the nvidia-smi utility automatically, so you can run it straight from cmd.
  • If the driver is installed but the command is not found, look for C:\Windows\System32\DriverStore\FileRepository\nvdm*\nvidia-smi.exe, where nvdm* varies with the driver version, so match the folder name approximately (see the sketch after this list).
  • Once nvidia-smi runs, it reports the highest CUDA version this machine supports.
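
A minimal Python sketch of that lookup (an illustration only, assuming a standard Windows driver layout; the DriverStore path and the nvdm* pattern may differ on your machine):

import glob
import shutil
import subprocess

def run_nvidia_smi():
    # Newer drivers put nvidia-smi on PATH.
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # Fall back to the DriverStore copy; the nvdm* folder name varies by driver version.
        hits = glob.glob(r"C:\Windows\System32\DriverStore\FileRepository\nvdm*\nvidia-smi.exe")
        exe = hits[0] if hits else None
    if exe is None:
        print("nvidia-smi not found - the NVIDIA driver is probably not installed")
        return
    # The first table line of the output shows the highest supported CUDA version.
    print(subprocess.run([exe], capture_output=True, text=True).stdout)

run_nvidia_smi()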

I get the following output:

Mon Sep 25 12:17:02 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 528.79       Driver Version: 528.79       CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro T2000       WDDM  | 00000000:01:00.0 Off |                  N/A |
| N/A   67C    P8     7W /  20W |      0MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

What do I need to do next?

CUDA Version: 12.0
That means the driver supports CUDA 12 and everything earlier, so you can install the CUDA 11.7 or 11.8 toolkit and then the matching PyTorch build.


Which PyTorch version would you recommend? 2.0.1 is marked as stable.
If I select CUDA 11.8 on the site it generates a pip command; can I just run that directly?

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Make sure CUDA itself is installed and working first.

Go to the NVIDIA website, find the cuda-toolkit version that matches, download and install it,
then install the corresponding PyTorch.
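
Once the toolkit installer has finished, a quick way to confirm it from Python (a minimal sketch; it assumes the installer added nvcc to PATH, which the default Windows install normally does):

import subprocess

# nvcc ships with the CUDA toolkit; its banner reports the installed toolkit version, e.g. 11.8.
print(subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout)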

I tried installing cuda_11.8.0_522.06_windows.exe,
then ran pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 (PyTorch Stable (2.0.1)).

In my test code I call
model.half().cuda()

and it fails with:

D:\rgzn_source_code\ai_testing\venv\Scripts\python.exe "D:/dev_tool/JetBrains/PyCharm Community Edition 2023.2.1/plugins/python-ce/helpers/pycharm/_jb_pytest_runner.py" --target test_hugging_face.py::test_chatglm 
Testing started at 9:49 PM ...
Launching pytest with arguments test_hugging_face.py::test_chatglm --no-header --no-summary -q in D:\rgzn_source_code\ai_testing\tests\chatglm

============================= test session starts =============================
collecting ... collected 1 item

test_hugging_face.py::test_chatglm 

======================== 1 failed, 1 warning in 38.52s ========================

-------------------------------- live log call --------------------------------
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:1048 21:49:34 DEBUG - Starting new HTTPS connection (1): huggingface.co:443
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 21:49:37 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/tokenizer_config.json HTTP/1.1" 200 0
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 21:49:37 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/tokenization_chatglm.py HTTP/1.1" 200 0
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 21:49:41 DEBUG - https://huggingface.co:443 "GET /api/models/THUDM/chatglm2-6b HTTP/1.1" 200 5947
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 21:49:41 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/config.json HTTP/1.1" 200 0
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 21:49:42 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/configuration_chatglm.py HTTP/1.1" 200 0
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 21:49:44 DEBUG - https://huggingface.co:443 "GET /api/models/THUDM/chatglm2-6b HTTP/1.1" 200 5947
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 21:49:46 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/config.json HTTP/1.1" 200 0
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 21:49:46 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/modeling_chatglm.py HTTP/1.1" 200 0
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 21:49:47 DEBUG - https://huggingface.co:443 "GET /api/models/THUDM/chatglm2-6b HTTP/1.1" 200 5947
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 21:49:47 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/configuration_chatglm.py HTTP/1.1" 200 0
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 21:49:48 DEBUG - https://huggingface.co:443 "GET /api/models/THUDM/chatglm2-6b HTTP/1.1" 200 5947
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 21:49:48 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/quantization.py HTTP/1.1" 200 0
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 21:49:51 DEBUG - https://huggingface.co:443 "GET /api/models/THUDM/chatglm2-6b HTTP/1.1" 200 5947
Loading checkpoint shards: 100%|██████████| 7/7 [00:19<00:00,  2.73s/it]
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 21:50:10 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/generation_config.json HTTP/1.1" 404 0
FAILED
chatglm\test_hugging_face.py:8 (test_chatglm)
def test_chatglm():
        tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
        model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
        # debug(model)
        # cuda + nvidia gpu
>       model.half().cuda()

test_hugging_face.py:14: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
..\..\venv\lib\site-packages\torch\nn\modules\module.py:905: in cuda
    return self._apply(lambda t: t.cuda(device))
..\..\venv\lib\site-packages\torch\nn\modules\module.py:797: in _apply
    module._apply(fn)
..\..\venv\lib\site-packages\torch\nn\modules\module.py:797: in _apply
    module._apply(fn)
..\..\venv\lib\site-packages\torch\nn\modules\module.py:797: in _apply
    module._apply(fn)
..\..\venv\lib\site-packages\torch\nn\modules\module.py:820: in _apply
    param_applied = fn(param)
..\..\venv\lib\site-packages\torch\nn\modules\module.py:905: in <lambda>
    return self._apply(lambda t: t.cuda(device))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def _lazy_init():
        global _initialized, _queued_calls
        if is_initialized() or hasattr(_tls, 'is_initializing'):
            return
        with _initialization_lock:
            # We be double-checked locking, boys!  This is OK because
            # the above test was GIL protected anyway.  The inner test
            # is for when a thread blocked on some other thread which was
            # doing the initialization; when they get the lock, they will
            # find there is nothing left to do.
            if is_initialized():
                return
            # It is important to prevent other threads from entering _lazy_init
            # immediately, while we are still guaranteed to have the GIL, because some
            # of the C calls we make below will release the GIL
            if _is_in_bad_fork():
                raise RuntimeError(
                    "Cannot re-initialize CUDA in forked subprocess. To use CUDA with "
                    "multiprocessing, you must use the 'spawn' start method")
            if not hasattr(torch._C, '_cuda_getDeviceCount'):
>               raise AssertionError("Torch not compiled with CUDA enabled")
E               AssertionError: Torch not compiled with CUDA enabled

..\..\venv\lib\site-packages\torch\cuda\__init__.py:239: AssertionError

Process finished with exit code 1

The problem above was solved by uninstalling the latest version and installing 2.0.0 instead:

pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118
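
After reinstalling, a quick check from the same venv (a minimal sketch; nothing here is specific to this project) confirms the wheel is actually a CUDA build before calling .cuda():

import torch

print(torch.__version__)          # should end in +cu118 for the CUDA 11.8 wheel, e.g. 2.0.0+cu118
print(torch.version.cuda)         # CUDA runtime the wheel was built against, e.g. 11.8
print(torch.cuda.is_available())  # True means the driver and the CUDA build line up
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. Quadro T2000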

Running it again now gives an OutOfMemory error. I suppose I can't keep using this mode and can only fall back to the CPU path?

D:\rgzn_source_code\ai_testing\venv\Scripts\python.exe "D:/dev_tool/JetBrains/PyCharm Community Edition 2023.2.1/plugins/python-ce/helpers/pycharm/_jb_pytest_runner.py" --path D:\rgzn_source_code\ai_testing\src\tests\chatglm\test_hugging_face.py 
Testing started at 12:03 AM ...
Launching pytest with arguments D:\rgzn_source_code\ai_testing\src\tests\chatglm\test_hugging_face.py --no-header --no-summary -q in D:\rgzn_source_code\ai_testing\src\tests\chatglm

============================= test session starts =============================
collecting ... collected 1 item

test_hugging_face.py::test_chatglm 

======================== 1 failed, 1 warning in 43.64s ========================

-------------------------------- live log call --------------------------------
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:1048 00:03:34 DEBUG - Starting new HTTPS connection (1): huggingface.co:443
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 00:03:38 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/tokenizer_config.json HTTP/1.1" 200 0
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 00:03:38 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/tokenization_chatglm.py HTTP/1.1" 200 0
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 00:03:42 DEBUG - https://huggingface.co:443 "GET /api/models/THUDM/chatglm2-6b HTTP/1.1" 200 5947
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 00:03:43 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/config.json HTTP/1.1" 200 0
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 00:03:43 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/configuration_chatglm.py HTTP/1.1" 200 0
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 00:03:44 DEBUG - https://huggingface.co:443 "GET /api/models/THUDM/chatglm2-6b HTTP/1.1" 200 5947
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 00:03:46 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/config.json HTTP/1.1" 200 0
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 00:03:46 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/modeling_chatglm.py HTTP/1.1" 200 0
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 00:03:47 DEBUG - https://huggingface.co:443 "GET /api/models/THUDM/chatglm2-6b HTTP/1.1" 200 5947
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 00:03:47 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/quantization.py HTTP/1.1" 200 0
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 00:03:50 DEBUG - https://huggingface.co:443 "GET /api/models/THUDM/chatglm2-6b HTTP/1.1" 200 5947
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 00:03:51 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/configuration_chatglm.py HTTP/1.1" 200 0
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 00:03:51 DEBUG - https://huggingface.co:443 "GET /api/models/THUDM/chatglm2-6b HTTP/1.1" 200 5947
Loading checkpoint shards: 100%|██████████| 7/7 [00:18<00:00,  2.66s/it]
D:\rgzn_source_code\ai_testing\venv\lib\site-packages\urllib3\connectionpool.py:546 00:04:11 DEBUG - https://huggingface.co:443 "HEAD /THUDM/chatglm2-6b/resolve/main/generation_config.json HTTP/1.1" 404 0
FAILED
chatglm\test_hugging_face.py:8 (test_chatglm)
def test_chatglm():
        tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
        model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
        # debug(model)
        # cuda + nvidia gpu
>       model.half().cuda()

test_hugging_face.py:14: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
..\..\..\venv\lib\site-packages\torch\nn\modules\module.py:905: in cuda
    return self._apply(lambda t: t.cuda(device))
..\..\..\venv\lib\site-packages\torch\nn\modules\module.py:797: in _apply
    module._apply(fn)
..\..\..\venv\lib\site-packages\torch\nn\modules\module.py:797: in _apply
    module._apply(fn)
..\..\..\venv\lib\site-packages\torch\nn\modules\module.py:797: in _apply
    module._apply(fn)
..\..\..\venv\lib\site-packages\torch\nn\modules\module.py:797: in _apply
    module._apply(fn)
..\..\..\venv\lib\site-packages\torch\nn\modules\module.py:797: in _apply
    module._apply(fn)
..\..\..\venv\lib\site-packages\torch\nn\modules\module.py:797: in _apply
    module._apply(fn)
..\..\..\venv\lib\site-packages\torch\nn\modules\module.py:820: in _apply
    param_applied = fn(param)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

t = Parameter containing:
tensor([[ 0.0055, -0.0108,  0.0022,  ...,  0.0097,  0.0121, -0.0108],
        [ 0.0172,  0.0019,...
        [ 0.0095, -0.0236,  0.0022,  ...,  0.0035,  0.0276, -0.0058]],
       dtype=torch.float16, requires_grad=True)

>   return self._apply(lambda t: t.cuda(device))
E   torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 108.00 MiB (GPU 0; 4.00 GiB total capacity; 3.44 GiB already allocated; 0 bytes free; 3.44 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

..\..\..\venv\lib\site-packages\torch\nn\modules\module.py:905: OutOfMemoryError

Process finished with exit code 1

  • For installing torch, just take the install command for the matching version from the PyTorch website.
  • If CUDA acceleration cannot fully load the model and you get an OutOfMemoryError, the GPU memory is too small for a model of this size. Try a quantized version of the model first (see the sketch after this list); if that still does not fit, the only remaining option is running the model on CPU + system RAM.
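
A hedged sketch of the quantized and CPU fallback routes (it assumes the THUDM/chatglm2-6b-int4 checkpoint and the quantize() helper shipped in the model's custom code; check the model card before relying on either):

from transformers import AutoModel, AutoTokenizer

# Option 1: load the pre-quantized INT4 checkpoint, which needs far less GPU memory.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b-int4", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm2-6b-int4", trust_remote_code=True).half().cuda()

# Option 2: quantize the full checkpoint at load time (quantize() comes from the repo's custom code).
# model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).quantize(4).cuda()

# Fallback: CPU + system RAM only - keep float32 and never call .cuda().
# model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).float()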