GPU Computation
Introduction
To enable GPU acceleration for your code, two conditions must be met:
You need to run your application on a GPU-enabled size. By default, applications on Nuvolos run on nodes without a GPU card, but you can scale your applications to sizes with GPU support. Note that all GPU-enabled sizes are credit-based.
You need to ensure the application libraries are properly configured to use a GPU. The documentation below primarily addresses this topic for various frameworks, so your application can actually use the available GPU.
Library Versions
The NVIDIA device drivers are automatically loaded in all GPU-enabled sizes. However, depending on the software you use, additional components (e.g., CUDA toolkit) might need to be installed via conda.
If you launch an app in a GPU-enabled size on Nuvolos, the nvidia-smi tool will be available from the command line/terminal. You can use this to check the driver version and monitor memory usage of the card.
$ nvidia-smi
Thu Jun  1 08:39:06 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.08    Driver Version: 510.73.08    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A10-4Q        On  | 00000002:00:00.0 Off |                    0 |
| N/A   N/A  P0     N/A /  N/A  |    333MiB /  4096MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Note that nvidia-smi reports the CUDA Driver API version in its output (11.6). However, most high-level machine learning frameworks use the CUDA Runtime API as well, which is provided by the CUDA Runtime library. Most frameworks can automatically install the required version of the runtime, so if you're starting from scratch, this should be straightforward to set up.
Please find examples below on how to get started with GPU computations on Nuvolos, or consult the relevant machine learning library documentation directly. If you need additional support, please reach out to our team.
GPU Monitoring
We recommend using the nvitop package to interactively monitor GPU usage. You can install it with:
conda install -c conda-forge nvitop
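Once installed, launch the monitor by running nvitop in the same terminal; it shows per-process GPU memory usage and utilization interactively.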
Large Language Models
A few useful guidelines for running LLMs on Nuvolos:
Always assess your VRAM requirements first: as a rough rule of thumb, a model with 7 billion parameters stored in 16-bit precision needs about 7 × 2 = 14 GB of VRAM for the weights alone, before activations. A helpful estimator can be found here: https://huggingface.co/docs/accelerate/main/en/usage_guides/model_size_estimator
Try loading your models with quantized parameters first, which require less VRAM. The HuggingFace transformers library has built-in support for automatic weight quantization: https://huggingface.co/docs/transformers/main_classes/model#transformers.PreTrainedModel.from_pretrained
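As an illustrative sketch of the second point: the model id below is only an example, and the bitsandbytes and accelerate packages are assumed to be installed alongside transformers. Loading a model with 4-bit quantized weights might look like this:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Example model id; substitute the model you actually want to run
model_id = "facebook/opt-1.3b"

# 4-bit weight quantization cuts VRAM usage roughly 4x compared to fp16 weights
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # place layers on the available GPU automatically
)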
Python
Installing the right version of CUDA for Python packages can be an overwhelming task. We recommend starting with a clean image and installing high-level AI/ML Python libraries first. Only install other libraries afterward if possible. This way, PyTorch or TensorFlow can install the exact CUDA libraries they need.
PyTorch
In our experience, installing PyTorch with pip is better than with conda, as it won't try to overwrite system libraries:
pip3 install torch torchvision torchaudio

The above command will install PyTorch with the latest major CUDA Runtime version (12). On Nuvolos, all GPUs currently support version 12 except the A10 card. If you want to run your computation on an A10, please install PyTorch with the older CUDA Runtime version 11:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

You don't need a GPU available in your application to install PyTorch with GPU support. It's sufficient to scale up to a GPU-enabled size after installation is complete. To test if your installation was successful, execute the following code snippet while on a GPU-enabled size:
import torch

# Allocate a random scalar tensor directly on the GPU
dtype = torch.float
device = torch.device("cuda")
a = torch.randn((), device=device, dtype=dtype)

If it completes without an error, your configuration is correct.
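You can also check which CUDA Runtime version your PyTorch build bundles and whether a GPU is visible to it:

import torch

# CUDA Runtime version the installed wheel was built against (e.g. "11.8"); None on CPU-only builds
print(torch.version.cuda)
# True if PyTorch can see a GPU in the current session
print(torch.cuda.is_available())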
NVCC
If you want to compile CUDA executables with nvcc, you'll need to install the following packages as a minimum:
conda install -c nvidia cuda-nvcc cuda-cudart-dev

The cuda-nvcc package provides the compiler binaries, and cuda-cudart-dev provides the header and library files. Both packages are available in CUDA 11 and 12 versions.
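After installation, you can verify that the compiler is available by running nvcc --version in a terminal.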
CUDA Toolkit
If you need the entire CUDA toolkit, you can also install it with conda. Nvidia recently changed how they ship the package with conda:
conda install -c nvidia cuda-toolkit

The package is available in both CUDA 11 and 12 versions.
TensorFlow
To install TensorFlow, we recommend using conda as TensorFlow requires the cudatoolkit package.
conda install -c conda-forge tensorflow-gpu "cudatoolkit<=CUDA_VERSION"

where CUDA_VERSION is the version reported by nvidia-smi. If you don't need the latest version of CUDA, we recommend starting with an older version like 11.6 to achieve compatibility with older GPU cards.
You don't need a GPU available in your running app to install TensorFlow with GPU support. It's sufficient to scale up to a GPU-enabled size after installation is complete. To test if your installation was successful, execute the following code snippet while on a GPU-enabled size:
import tensorflow as tf

# TensorFlow places the tensor on the GPU automatically if one is visible
a = tf.constant([1, 2, 3])
print(a.device)

If you see an output similar to

/job:localhost/replica:0/task:0/device:GPU:0

that ends with GPU:0, your configuration is correct.
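Alternatively, you can list the GPUs TensorFlow detects directly:

import tensorflow as tf

# One entry per visible GPU; an empty list means TensorFlow sees no GPU
print(tf.config.list_physical_devices("GPU"))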
RStudio
With Machine Learning (CUDA-enabled) RStudio images, you can run GPU computations on GPU-accelerated nodes. These images have the CUDA runtime/toolkit installed as well.
XGBoost
We recommend using the pre-built experimental binary to get started with XGBoost and R. In a terminal on a GPU node:
# define version used - update if needed
XGBOOST_VERSION=1.4.1
# download binary
wget https://github.com/dmlc/xgboost/releases/download/v${XGBOOST_VERSION}/xgboost_r_gpu_linux_${XGBOOST_VERSION}.tar.gz
# Install dependencies
R -q -e "install.packages(c('data.table', 'jsonlite'))"
# Install XGBoost
R CMD INSTALL ./xgboost_r_gpu_linux_${XGBOOST_VERSION}.tar.gz

You can test the code via the following example program: https://rdrr.io/cran/xgboost/src/demo/gpu_accelerated.R
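Note that GPU training in this XGBoost version is enabled by setting tree_method = "gpu_hist" in the training parameters, which is what the linked demo does.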
TensorFlow / Keras
You can use TensorFlow with GPU acceleration by following our TensorFlow installation guide and passing version = "gpu" when installing TensorFlow.