GPU Computation
Introduction
To enable GPU acceleration for your code, two conditions must be met:
You need to run your application on a GPU-enabled size. By default, applications on Nuvolos run on nodes without a GPU card, but you can scale your applications to sizes with GPU support. Note that all GPU-enabled sizes are credit-based.
You need to ensure the application libraries are properly configured to use a GPU. The documentation below primarily addresses this topic for various frameworks, so your application can actually use the available GPU.
Library Versions
The NVIDIA device drivers are automatically loaded in all GPU-enabled sizes. However, depending on the software you use, additional components (e.g., CUDA toolkit) might need to be installed via conda.
If you launch an app in a GPU-enabled size on Nuvolos, the nvidia-smi tool will be available from the command line/terminal. You can use this to check the driver version and monitor memory usage of the card.
$ nvidia-smi
Thu Jun  1 08:39:06 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.08    Driver Version: 510.73.08    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A10-4Q        On  | 00000002:00:00.0 Off |                    0 |
| N/A   N/A  P0     N/A /  N/A  |    333MiB /  4096MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Note that nvidia-smi reports the CUDA Driver API version in its output (11.6). However, most high-level machine learning frameworks use the CUDA Runtime API as well, which is provided by the CUDA Runtime library. Most frameworks can automatically install the required version of the runtime, so if you're starting from scratch, this should be straightforward to set up.
Please find examples below on how to get started with GPU computations on Nuvolos, or consult the relevant machine learning library documentation directly. If you need additional support, please reach out to our team.
GPU Monitoring
We recommend using the nvitop package to interactively monitor GPU usage. You can install it with:
conda install -c conda-forge nvitop
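Once installed, launch the monitor by running nvitop in the same terminal; it shows per-process GPU memory usage and utilization interactively.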
Large Language Models
A few useful guidelines for running LLMs on Nuvolos:
Always assess your VRAM requirements first: as a rough rule of thumb, a model with 7 billion parameters stored in 16-bit precision needs about 7 × 2 = 14 GB of VRAM for the weights alone, before activations. A helpful estimator can be found here: https://huggingface.co/docs/accelerate/main/en/usage_guides/model_size_estimator
Try loading your models with quantized parameters first, which require less VRAM. The HuggingFace transformers library has built-in support for automatic weight quantization: https://huggingface.co/docs/transformers/main_classes/model#transformers.PreTrainedModel.from_pretrained
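As an illustrative sketch of the second point: the model id below is only an example, and the bitsandbytes and accelerate packages are assumed to be installed alongside transformers. Loading a model with 4-bit quantized weights might look like this:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Example model id; substitute the model you actually want to run
model_id = "facebook/opt-1.3b"

# 4-bit weight quantization cuts VRAM usage roughly 4x compared to fp16 weights
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # place layers on the available GPU automatically
)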
Python
Installing the right version of CUDA for Python packages can be an overwhelming task. We recommend starting with a clean image and installing high-level AI/ML Python libraries first. Only install other libraries afterward if possible. This way, PyTorch or TensorFlow can install the exact CUDA libraries they need.
PyTorch
In our experience, installing PyTorch with pip is better than with conda, as it won't try to overwrite system libraries:
pip3 install torch torchvision torchaudio

The above command will install PyTorch with the latest major CUDA Runtime version (12). On Nuvolos, all GPUs currently support version 12 except the A10 card. If you want to run your computation on an A10, please install PyTorch with the older CUDA Runtime version 11:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

You don't need a GPU available in your application to install PyTorch with GPU support. It's sufficient to scale up to a GPU-enabled size after installation is complete. To test if your installation was successful, execute the following code snippet while on a GPU-enabled size:
import torch

# Allocate a random scalar tensor directly on the GPU
dtype = torch.float
device = torch.device("cuda")
a = torch.randn((), device=device, dtype=dtype)

If it completes without an error, your configuration is correct.
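You can also check which CUDA Runtime version your PyTorch build bundles and whether a GPU is visible to it:

import torch

# CUDA Runtime version the installed wheel was built against (e.g. "11.8"); None on CPU-only builds
print(torch.version.cuda)
# True if PyTorch can see a GPU in the current session
print(torch.cuda.is_available())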
NVCC
If you want to compile CUDA executables with nvcc, you'll need to install the following packages as a minimum:
conda install -c nvidia cuda-nvcc cuda-cudart-dev

The cuda-nvcc package provides the compiler binaries, and cuda-cudart-dev provides the header and library files. Both packages are available in CUDA 11 and 12 versions.
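After installation, you can verify that the compiler is available by running nvcc --version in a terminal.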
CUDA Toolkit
If you need the entire CUDA toolkit, you can also install it with conda. Nvidia recently changed how they ship the package with conda:
conda install -c nvidia cuda-toolkit

The package is available in both CUDA 11 and 12 versions.
TensorFlow
To install TensorFlow, we recommend using conda as TensorFlow requires the cudatoolkit package.
conda install -c conda-forge tensorflow-gpu "cudatoolkit<=CUDA_VERSION"

where CUDA_VERSION is the version reported by nvidia-smi. If you don't need the latest version of CUDA, we recommend starting with an older version like 11.6 to achieve compatibility with older GPU cards.
You don't need a GPU available in your running app to install TensorFlow with GPU support. It's sufficient to scale up to a GPU-enabled size after installation is complete. To test if your installation was successful, execute the following code snippet while on a GPU-enabled size:
import tensorflow as tf

# TensorFlow places the tensor on the GPU automatically if one is visible
a = tf.constant([1, 2, 3])
print(a.device)

If you see an output similar to

/job:localhost/replica:0/task:0/device:GPU:0

that ends with GPU:0, your configuration is correct.
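Alternatively, you can list the GPUs TensorFlow detects directly:

import tensorflow as tf

# One entry per visible GPU; an empty list means TensorFlow sees no GPU
print(tf.config.list_physical_devices("GPU"))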
RStudio
With Machine Learning (CUDA-enabled) RStudio images, you can run GPU computations on GPU-accelerated nodes. These images have the CUDA runtime/toolkit installed as well.
XGBoost
We recommend using the pre-built experimental binary to get started with XGBoost and R. In a terminal on a GPU node:
# define version used - update if needed
XGBOOST_VERSION=1.4.1
# download binary
wget https://github.com/dmlc/xgboost/releases/download/v${XGBOOST_VERSION}/xgboost_r_gpu_linux_${XGBOOST_VERSION}.tar.gz
# Install dependencies
R -q -e "install.packages(c('data.table', 'jsonlite'))"
# Install XGBoost
R CMD INSTALL ./xgboost_r_gpu_linux_${XGBOOST_VERSION}.tar.gz

You can test the code via the following example program: https://rdrr.io/cran/xgboost/src/demo/gpu_accelerated.R
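Note that GPU training in this XGBoost version is enabled by setting tree_method = "gpu_hist" in the training parameters, which is what the linked demo does.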
TensorFlow / Keras
You can use TensorFlow with GPU acceleration by following our TensorFlow installation guide and passing version = "gpu" when installing TensorFlow.