2025-02-19 Setting up a PyTorch environment

This article is categorized as "Garbage". It should NEVER appear in your search engine's results.

1, PyTorch hammers the CPU and does not touch the GPU at all

Calling torch.cuda.is_available() returned False, which means CUDA was not set up correctly.
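A slightly fuller check can help narrow down why the call returns False. The sketch below is only an illustration: the helper name cuda_report is made up for this note, and it relies on the standard torch.version.cuda and torch.cuda.device_count() attributes. If built_with_cuda prints None, the installed PyTorch is most likely a CPU-only build.

```python
import torch


def cuda_report():
    """Collect a few facts that help explain why the GPU is not being used.

    Hypothetical helper written for this note, not part of PyTorch.
    """
    available = torch.cuda.is_available()
    report = {
        "torch_version": torch.__version__,
        # None here means this PyTorch build was compiled without CUDA support.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": available,
        "device_count": torch.cuda.device_count(),
    }
    if available:
        # Only query the device name when a GPU is actually usable.
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


for key, value in cuda_report().items():
    print(f"{key}: {value}")
```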

2, Install the correct environment with conda

$ nvcc --version  # reports version 11.5

Then use the command from 🔗 [PyTorch installation with GPU support on Ubuntu - PyTorch Forums] https://discuss.pytorch.org/t/pytorch-installation-with-gpu-support-on-ubuntu/196350

$ conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

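After the installation, torch.cuda.is_available() returned True. Note that pytorch-cuda=11.8 does not have to match the nvcc 11.5 reported above: the conda package ships its own CUDA runtime, so what matters is that the NVIDIA driver is recent enough.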

3, Tell the model to use the GPU

import torch

device = torch.device("cuda")
model = model.to(device)  # move the model's parameters and buffers to the GPU
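If the same script also has to run on a machine without a working GPU, a common pattern is to pick the device at runtime. This is a minimal sketch, assuming a small nn.Linear as a stand-in for the real model (which is not shown in these notes):

```python
import torch
from torch import nn

# Fall back to the CPU when CUDA is unavailable, so the script runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for the real model.
model = nn.Linear(4, 2).to(device)

# Every parameter reports the device it lives on; checking one is a quick sanity test.
param_device = next(model.parameters()).device
print(param_device)
```

Printing the parameter device is a cheap way to confirm the move actually happened before starting a long training run.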

4, During training, move each X and y batch to the same device as the model (otherwise you get the error Expected all tensors to be on the same device, but found at least two devices...)

# Original code
for X, y in train_data:

# New code
device = torch.device("cuda")
for X, y in train_data:
    X, y = X.to(device), y.to(device)
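For context, here is how that line fits into a complete training loop. It is only a sketch under stated assumptions: the toy dataset, the nn.Linear model, the MSE loss, and the SGD settings are all hypothetical placeholders, since the original training code is not shown here.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical toy data standing in for train_data: 64 samples, 4 features, 1 target.
torch.manual_seed(0)
features = torch.randn(64, 4)
targets = features.sum(dim=1, keepdim=True)
train_data = DataLoader(TensorDataset(features, targets), batch_size=16, shuffle=True)

# The model lives on `device`; every batch has to be moved there as well.
model = nn.Linear(4, 1).to(device)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)


def train_one_epoch():
    """Run one pass over train_data and return the mean loss per sample."""
    model.train()
    total_loss = 0.0
    for X, y in train_data:
        # The DataLoader yields CPU tensors; move them next to the model.
        X, y = X.to(device), y.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * X.size(0)
    return total_loss / len(train_data.dataset)


first_loss = train_one_epoch()
last_loss = first_loss
for _ in range(20):
    last_loss = train_one_epoch()
print(f"first epoch loss: {first_loss:.4f}, last epoch loss: {last_loss:.4f}")
```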

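5, If the GPU-trained model is then used to evaluate test_data, the original CPU code has to change as well, and the changes are fairly large. Feeding the whole of test_data to the GPU model at once can run out of GPU memory; in that case, split the data into batches of batch_size and concatenate the per-batch results afterwards. In practice, for most cases it is enough to evaluate test_data with a CPU model: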

# Note: .to() moves the module in place and returns it, so gpu_model itself is now on the CPU too
cpu_model = gpu_model.to('cpu')
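When the evaluation does need to stay on the GPU, the batch_size approach mentioned in step 5 might look like the sketch below. The function name predict_in_batches, the toy model, and the random test tensor are assumptions made for illustration; the real test code is not shown in these notes.

```python
import torch
from torch import nn


def predict_in_batches(model, inputs, batch_size, device):
    """Run `model` over `inputs` in chunks and return all outputs on the CPU."""
    model.eval()
    outputs = []
    # no_grad() avoids storing activations for backprop, which saves GPU memory.
    with torch.no_grad():
        for start in range(0, inputs.size(0), batch_size):
            batch = inputs[start:start + batch_size].to(device)
            # Move each result back to the CPU so GPU memory is freed per batch.
            outputs.append(model(batch).cpu())
    return torch.cat(outputs, dim=0)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-ins for the trained model and test_data.
torch.manual_seed(0)
model = nn.Linear(4, 2).to(device)
test_inputs = torch.randn(10, 4)

predictions = predict_in_batches(model, test_inputs, batch_size=3, device=device)
print(predictions.shape)
```

Only one batch of inputs and outputs sits on the GPU at any time, so peak memory depends on batch_size rather than on the size of test_data.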


Last modified on 2025-07-01
