cuda native installation
overview
a native cuda toolkit installation puts the compiler, libraries, and tools directly on the host, with no container layer in between. it is required for kernel development and for compiling cuda code on the host itself.
current version: cuda 12.9 update 1 (august 2025)
when to use:
- cuda kernel development
- custom cuda libraries
- system-wide cuda tools
- workloads where docker overhead is not acceptable
prerequisites
# verify nvidia driver
nvidia-smi
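# cuda 12.9 ships with driver 575.51.03; older cuda 12.x drivers (525.60.13+)
# still run it through minor-version compatibility, with some features unavailable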
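the driver check above can be scripted. a minimal sketch, assuming GNU `sort -V` is available; `version_ge` and `check_driver` are hypothetical helper names, and the minimum version is passed in by the caller rather than hard-coded:

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when dotted version A >= B (relies on GNU sort -V)
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# check_driver MIN: compare the installed driver against MIN (hypothetical helper)
check_driver() {
    local min="$1" have
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: driver missing or not on PATH"
        return 2
    fi
    # query only the driver version, without the full status table
    have="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)"
    if version_ge "$have" "$min"; then
        echo "driver $have >= $min: ok"
    else
        echo "driver $have < $min: upgrade needed"
        return 1
    fi
}
```

call it as `check_driver <min-version>`, using whichever minimum applies to the toolkit release you plan to install.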
# check gcc
gcc --version
# need a supported gcc (11 through 13, see troubleshooting)
# verify kernel headers
uname -r
ls /usr/src/linux-headers-$(uname -r)
installation methods
method 1: deb packages (recommended)
for ubuntu 22.04:
# download cuda keyring
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
# update and install
sudo apt update
sudo apt install cuda-toolkit-12-9
for ubuntu 24.04:
# download cuda keyring
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
# update and install
sudo apt update
sudo apt install cuda-toolkit-12-9
method 2: runfile installer
offers more control, but updates and removal are manual:
# download runfile (the file name embeds the bundled driver version, 575.51.03 for 12.9.0)
wget https://developer.download.nvidia.com/compute/cuda/12.9.0/local_installers/cuda_12.9.0_575.51.03_linux.run
# make executable
chmod +x cuda_12.9.0_575.51.03_linux.run
# install the toolkit only (skips the bundled driver)
sudo sh cuda_12.9.0_575.51.03_linux.run --toolkit --silent --override
interactive options:
# interactive mode for component selection
sudo sh cuda_12.9.0_575.51.03_linux.run
# select cuda-gdb-src in "CUDA Tools 12.9" for debugging
post-installation setup
environment variables
add to ~/.bashrc:
# cuda paths
export PATH=/usr/local/cuda-12.9/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.9/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
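the `${VAR:+:${VAR}}` form in the exports above appends `:` and the old value only when the variable is already set and non-empty. this avoids a trailing colon, which would add an empty entry that is treated as the current directory. a small sketch of that expansion; `prepend_path` is a hypothetical helper used only for illustration:

```shell
#!/usr/bin/env bash
# prepend_path DIR CURRENT: print DIR, followed by ":CURRENT" only when CURRENT is non-empty
prepend_path() {
    local dir="$1" current="$2"
    printf '%s\n' "${dir}${current:+:${current}}"
}

prepend_path /usr/local/cuda-12.9/lib64 ""          # prints /usr/local/cuda-12.9/lib64
prepend_path /usr/local/cuda-12.9/lib64 /opt/lib    # prints /usr/local/cuda-12.9/lib64:/opt/lib
```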
# optional: default cuda version
export CUDA_HOME=/usr/local/cuda-12.9
apply changes:
source ~/.bashrc
verification
# check nvcc
nvcc --version
# nvcc: NVIDIA (R) Cuda compiler driver
# Cuda compilation tools, release 12.9
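scripts often need the release number rather than the full banner. a minimal sketch that extracts it from `nvcc --version` output; `nvcc_release` is a hypothetical helper, and the pattern assumes the `release X.Y` wording shown above:

```shell
#!/usr/bin/env bash
# nvcc_release: read `nvcc --version` text on stdin and print the release, e.g. 12.9
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n1
}

# only query nvcc when it is actually on PATH
if command -v nvcc >/dev/null 2>&1; then
    echo "toolkit release: $(nvcc --version | nvcc_release)"
fi
```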
# compile test
cat > test.cu << 'EOF'
#include <stdio.h>
__global__ void hello() {
    printf("Hello from GPU!\n");
}
int main() {
    hello<<<1,1>>>();
    cudaDeviceSynchronize();
    return 0;
}
EOF
nvcc test.cu -o test
./test
# Hello from GPU!
component overview
default installation includes:
| component | location | purpose | 
|---|---|---|
| nvcc | /usr/local/cuda/bin/nvcc | cuda compiler | 
| cuda libraries | /usr/local/cuda/lib64/ | runtime libraries | 
| headers | /usr/local/cuda/include/ | development headers | 
| samples | github.com/NVIDIA/cuda-samples (not bundled since cuda 11.6) | example code | 
| nsight systems / nsight compute | /usr/local/cuda/bin/nsys, /usr/local/cuda/bin/ncu | profilers | 
| cuda-gdb | /usr/local/cuda/bin/cuda-gdb | cuda debugger | 
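to confirm that the paths in the table exist on a given machine, a short sketch like the following can help. `check_cuda_components` is a hypothetical helper; it tests only a few of the paths listed above and assumes the `/usr/local/cuda` symlink unless another root is passed:

```shell
#!/usr/bin/env bash
# check_cuda_components [ROOT]: report which expected paths exist under ROOT
# returns 0 when everything is present, non-zero otherwise
check_cuda_components() {
    local root="${1:-/usr/local/cuda}" missing=0 rel
    for rel in bin/nvcc bin/cuda-gdb include lib64; do
        if [ -e "$root/$rel" ]; then
            echo "ok       $root/$rel"
        else
            echo "missing  $root/$rel"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

run `check_cuda_components` for the default symlink, or `check_cuda_components /usr/local/cuda-12.9` for a specific version.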
multiple cuda versions
install multiple versions side-by-side:
# install cuda 11.8
sudo apt install cuda-toolkit-11-8
# install cuda 12.9
sudo apt install cuda-toolkit-12-9
# switch versions
sudo update-alternatives --config cuda
# or manually
export PATH=/usr/local/cuda-11.8/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH
version management script
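#!/bin/bash
# save as ~/bin/cuda-switch and run it with `source` so the exports reach the current shell:
#   source ~/bin/cuda-switch 12.9
CUDA_VERSION=$1
if [ -z "$CUDA_VERSION" ]; then
    echo "Usage: source cuda-switch <version>"
    echo "Available versions:"
    ls -1d /usr/local/cuda-* 2>/dev/null | sed 's|.*/cuda-||'
    # return when sourced, exit when executed directly
    return 1 2>/dev/null || exit 1
fi
if [ ! -d "/usr/local/cuda-$CUDA_VERSION" ]; then
    echo "CUDA $CUDA_VERSION not found under /usr/local"
    return 1 2>/dev/null || exit 1
fi
export CUDA_HOME=/usr/local/cuda-$CUDA_VERSION
export PATH=$CUDA_HOME/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
echo "Switched to CUDA $CUDA_VERSION"
nvcc --version
development tools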
cuda samples
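# samples are no longer bundled with the toolkit (since cuda 11.6); clone them instead
git clone https://github.com/NVIDIA/cuda-samples.git ~/cuda-samples
# check out the tag matching your toolkit if the default branch targets a newer release
# build all samples (recent releases build with cmake)
cd ~/cuda-samples
mkdir build && cd build
cmake ..
make -j$(nproc)
# run deviceQuery
./Samples/1_Utilities/deviceQuery/deviceQuery
profiling tools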
# nsight systems
nsys profile ./myapp
# nsight compute
ncu ./myapp
# legacy profiler
nvprof ./myapp  # deprecated; not supported on gpus with compute capability 8.0+
debugging
# compile with debug info
nvcc -g -G test.cu -o test
# debug with cuda-gdb
cuda-gdb ./test
(cuda-gdb) break main
(cuda-gdb) run
(cuda-gdb) info cuda kernels
cudnn installation
deep learning frameworks typically require cudnn:
# download from nvidia (requires account)
# https://developer.nvidia.com/cudnn
# install deb package
sudo dpkg -i cudnn-linux-x86_64-9.3.0.xxx_cuda12.deb
# or manual installation from the tar archive
# (cudnn 9 archives unpack to a *-archive directory containing include/ and lib/)
tar -xf cudnn-linux-x86_64-9.3.0.xxx_cuda12-archive.tar.xz
sudo cp cudnn-*-archive/include/cudnn*.h /usr/local/cuda/include/
# -P keeps the versioned library symlinks intact
sudo cp -P cudnn-*-archive/lib/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn*
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
uninstallation
deb method
# remove cuda toolkit
sudo apt remove --purge cuda-toolkit-12-9
sudo apt autoremove
# remove repository
sudo rm /etc/apt/sources.list.d/cuda-ubuntu2204-x86_64.list
sudo apt update
runfile method
# use uninstaller
sudo /usr/local/cuda-12.9/bin/cuda-uninstaller
# or manual removal
sudo rm -rf /usr/local/cuda-12.9
troubleshooting
gcc version mismatch
# cuda 12.9 supports gcc 11-13
sudo apt install gcc-12 g++-12
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 100
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-12 100
library not found
# regenerate cache
sudo ldconfig
# check library path
ldconfig -p | grep cuda
# manually add path
echo '/usr/local/cuda/lib64' | sudo tee /etc/ld.so.conf.d/cuda.conf
sudo ldconfig
kernel module issues
# rebuild kernel modules
sudo dkms status
sudo dkms install nvidia/xxx.xx.xx
# check loaded modules
lsmod | grep nvidia
performance tuning
persistence mode
# enable for lower latency
sudo nvidia-smi -pm 1
# set clock speeds
# values are gpu-specific; list valid pairs with: nvidia-smi -q -d SUPPORTED_CLOCKS
sudo nvidia-smi -ac 1215,1410  # memory,graphics clocks in mhz
memory overclocking
# check current clocks
nvidia-smi -q -d CLOCK
# set memory transfer rate offset
sudo nvidia-settings -a '[gpu:0]/GPUMemoryTransferRateOffset[3]=500'
integration testing
pytorch test
# test pytorch cuda
python -c "import torch; print(torch.cuda.is_available())"
# detailed info
python -c "import torch; print(torch.version.cuda)"
tensorflow test
# test tensorflow cuda
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
comparison with docker
| aspect | native | docker | 
|---|---|---|
| performance | baseline | ~same | 
| isolation | none | complete | 
| disk usage | ~4gb | ~2gb/image | 
| multi-version | complex | simple | 
| system impact | high | minimal | 
best practices
- backup before installing
  - driver conflicts possible
  - kernel module issues

- use cuda-toolkit-x-y packages
  - easier updates
  - dependency management

- avoid mixing methods
  - deb or runfile, not both
  - conflicts with paths

- test after updates
  - kernel updates can break modules
  - driver updates affect cuda
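the "test after updates" practice can be partly automated. a minimal sketch, assuming the commands used elsewhere on this page (`nvidia-smi`, `lsmod`, `nvcc`) are the checks you care about; `run_checks` and `cuda_smoke_test` are hypothetical names:

```shell
#!/usr/bin/env bash
# run_checks "label|command"...: run each command, print PASS/FAIL per label,
# and return non-zero if any command failed
run_checks() {
    local failed=0 entry label cmd
    for entry in "$@"; do
        label="${entry%%|*}"   # text before the first '|'
        cmd="${entry#*|}"      # everything after the first '|'
        if bash -c "$cmd" >/dev/null 2>&1; then
            echo "PASS  $label"
        else
            echo "FAIL  $label"
            failed=1
        fi
    done
    return "$failed"
}

# example check list built from commands shown on this page
cuda_smoke_test() {
    run_checks \
        "driver responds|nvidia-smi" \
        "nvidia module loaded|lsmod | grep -q nvidia" \
        "nvcc on PATH|nvcc --version"
}
```

running `cuda_smoke_test` after a kernel or driver update gives a quick pass/fail summary before any heavier workload is started.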
 
tips
- install cuda samples for testing
- use nvidia-smi dmon for monitoring
- use compute-sanitizer for memory debugging (cuda-memcheck was removed in cuda 12.0)
- prefer docker for production
- native for development only