Q&A: How to Run the Latest PyTorch with CUDA

Recently I got a graphics card upgrade. To be fair, it was long overdue: I had been running a GTX 1070 from 2016, and my PyTorch runs were taking forever. At one point I made a lasagne from scratch before my NLP transfer learning onto a BERT model finished. It was not awesome. I eventually bit the bullet and splurged.

Of course, this is when the real fun started. I noticed that PyTorch and CUDA were not finding the new card: the card was too new for the PyTorch version I had installed. Thankfully, with ClearML and Docker, the fix was almost painless. I made the following changes to my clearml.conf so it would download the nightly PyTorch build, and updated the default Docker image to CUDA 11.1 to match. After restarting the clearml-agents, all my experiments were running again within minutes. Faster than ever.

agent {
    package_manager {
        extra_index_url: ["https://download.pytorch.org/whl/torch_stable.html"]
    }

    default_docker {
        image: "nvidia/cuda:11.1-cudnn8-runtime-ubuntu20.04"
    }
}
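Once the agents pick up the new config, a quick way to confirm that a task's container actually got a CUDA-enabled PyTorch build is a sanity check along these lines (a minimal sketch, not part of the original setup; the `cuda_status` helper is just an illustrative name):

```python
import torch

def cuda_status() -> str:
    """Summarize whether this PyTorch build can see a CUDA device."""
    if not torch.cuda.is_available():
        # Either the driver is too old for the build, or a CPU-only wheel was installed
        return "CUDA not available (driver / build mismatch?)"
    # torch.version.cuda is the CUDA version PyTorch was compiled against,
    # which should line up with the CUDA runtime in the Docker image
    return f"CUDA {torch.version.cuda} on {torch.cuda.get_device_name(0)}"

print(torch.__version__)  # e.g. a +cu111 wheel pulled from the extra index URL
print(cuda_status())
```

If the second line reports that CUDA is unavailable inside the container, the usual suspects are the host driver version and the wheel that was actually resolved from the index.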