
GPU Passthrough for Proxmox: Complete Setup Guide 2026


GPU passthrough lets a virtual machine talk directly to a physical GPU with near-zero overhead. For home lab users running local LLMs, this means you can run Proxmox as your hypervisor, pass an NVIDIA GPU to a dedicated AI VM, and get the same tokens-per-second you’d see on bare metal — while still running other VMs and containers on the same box.

The setup isn’t hard, but it’s specific. One wrong kernel parameter and the GPU won’t unbind. One missing VFIO module and the VM won’t start. This guide walks through every step, from BIOS settings to running your first Ollama prompt in a GPU-accelerated VM.

If you’re still choosing a GPU, read best GPU for local LLMs first. If you’re choosing hardware to run Proxmox on, see best mini PCs for Proxmox.

Hardware Requirements

GPU passthrough depends on specific hardware features. Not every motherboard, CPU, or GPU combination works cleanly. Here’s what you need.

CPU: IOMMU Support Is Non-Negotiable

Your CPU must support IOMMU — the technology that lets the hypervisor map physical devices to virtual machines safely.

  • Intel: VT-d (Virtualization Technology for Directed I/O). Present on most Core i5/i7/i9 from 6th gen onward, and all Xeon processors.
  • AMD: AMD-Vi. Present on all Ryzen and EPYC processors.

Check your CPU’s spec sheet on Intel Ark or AMD’s product page. If IOMMU isn’t listed, passthrough won’t work — no software workaround exists.
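You can also sanity-check from any running Linux install before rebooting into the BIOS. The helper below is a minimal sketch: it only classifies a CPU flags string (`vmx` for Intel, `svm` for AMD), which confirms hardware virtualization support but not IOMMU itself, so the spec sheet remains the authority for VT-d/AMD-Vi.

```shell
# classify_flags: report which virtualization technology a CPU flags
# string implies. Pure string matching, so it runs anywhere; on a real
# host, feed it: classify_flags "$(grep -m1 '^flags' /proc/cpuinfo)"
classify_flags() {
  case " $1 " in
    *" vmx "*) echo "Intel VT-x present (check Ark for VT-d)" ;;
    *" svm "*) echo "AMD-V present (AMD-Vi on all Ryzen/EPYC)" ;;
    *)         echo "no virtualization flags found" ;;
  esac
}

classify_flags "fpu vme de pse svm sse2"
```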

Motherboard: IOMMU Grouping Matters

The motherboard’s chipset determines how devices are grouped in IOMMU. Each IOMMU group is a unit — you must pass through an entire group to a VM. If your GPU shares a group with your SATA controller, you’d have to pass both.

Good IOMMU grouping means each PCIe slot gets its own group. Server and workstation boards generally do this well. Consumer boards vary. AMD X570/X670/B650 boards tend to have better grouping than equivalent Intel consumer boards.

If your board has bad grouping, the ACS override patch can split groups apart, but it reduces security isolation. Try to avoid needing it.
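To inspect your board's grouping before committing to a slot layout, walk the sysfs tree. The sketch below reads the standard /sys/kernel/iommu_groups layout; the root directory is a parameter only so it can be tried against a copy of the tree.

```shell
# list_iommu_groups: print each IOMMU group and the PCI addresses in it.
# On a Proxmox host, pipe each address through `lspci -nns` for names.
list_iommu_groups() {
  local root="${1:-/sys/kernel/iommu_groups}"
  local g d
  for g in "$root"/*/; do
    [ -d "$g" ] || continue
    echo "Group $(basename "$g"):"
    for d in "$g"devices/*; do
      [ -e "$d" ] && echo "  $(basename "$d")"
    done
  done
}

list_iommu_groups
```

Ideally your GPU's group contains only the GPU and its own audio function; anything else in that group goes along for the ride.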

GPU: Cards That Pass Through Cleanly

Not every GPU passes through cleanly. These are proven choices for Proxmox passthrough in home labs:

NVIDIA RTX 3090 (Used) — ~$1,730. 24 GB VRAM handles models up to 32B parameters at Q4 quantization. Well-documented passthrough behavior. The go-to card for local LLM workloads with maximum VRAM. See how much VRAM you need for LLMs to size your GPU correctly.

NVIDIA RTX 4060 Ti 16GB — ~$450. 16 GB VRAM in a 165W TDP. Fits in compact builds where power and thermals are constrained. Good for 8B–13B models.

NVIDIA Tesla P40 — ~$400 used. 24 GB VRAM, no video output (compute-only), runs cool in server chassis. Requires a blower or custom fan mount in desktop cases. Strong value if you only need headless inference.

For a full comparison, see best GPU for home server inference.

Host Hardware: Proven Platforms

Minisforum MS-01 — ~$600 (barebones). Intel Core i9-13900H, a PCIe x16 slot (x8 electrical) that fits a low-profile GPU, dual 10GbE. One of the few mini PCs that supports full GPU passthrough. Compact and power-efficient. See best mini PCs for local AI for more options.

Custom tower build. An LGA 1700 or AM5 board with a decent IOMMU layout, 64 GB RAM, and a proper x16 PCIe slot. More flexible than a mini PC, better airflow for high-TDP GPUs like the RTX 3090.

Step-by-Step Setup

This guide assumes Proxmox VE 8.x on a system with an NVIDIA GPU. The process is nearly identical for AMD GPUs — just swap the driver names.

1. Enable IOMMU in BIOS

Reboot into your BIOS/UEFI settings and enable:

  • Intel: VT-d (usually under CPU or Advanced settings)
  • AMD: AMD-Vi / IOMMU (usually under Advanced or NBIO settings)

Also ensure SR-IOV is enabled if available, and that Above 4G Decoding is turned on — some GPUs need it for proper BAR mapping.

2. Configure Kernel Parameters

Proxmox uses GRUB by default on BIOS systems and systemd-boot on UEFI systems. You need to add IOMMU parameters to whichever bootloader you’re using.

For GRUB (most common):

Edit /etc/default/grub and modify the GRUB_CMDLINE_LINUX_DEFAULT line:

# Intel CPU:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"

# AMD CPU:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"

Then update GRUB:

update-grub

For systemd-boot:

Edit the appropriate entry in /etc/kernel/cmdline:

# Intel CPU:
root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on iommu=pt

# AMD CPU:
root=ZFS=rpool/ROOT/pve-1 boot=zfs amd_iommu=on iommu=pt

Then refresh boot entries:

proxmox-boot-tool refresh

The iommu=pt flag puts the IOMMU in passthrough mode for devices that stay on the host: their DMA skips IOMMU translation, which avoids overhead, while devices bound to VFIO still get full isolation and remapping.
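After rebooting, confirm the parameters actually reached the kernel before debugging anything else. The sketch below checks a command-line string, so it can be tried anywhere; on the host you would run it against the contents of /proc/cmdline.

```shell
# check_cmdline: succeed only if a kernel command line enables the IOMMU.
# On the host: check_cmdline "$(cat /proc/cmdline)"
check_cmdline() {
  case " $1 " in
    *" intel_iommu=on "*|*" amd_iommu=on "*)
      echo "IOMMU enabled on the kernel command line" ;;
    *)
      echo "IOMMU parameter missing: recheck bootloader config" >&2
      return 1 ;;
  esac
}

check_cmdline "quiet intel_iommu=on iommu=pt"
```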

3. Load VFIO Modules

Add the VFIO kernel modules so they load at boot. Edit /etc/modules and add:

vfio
vfio_iommu_type1
vfio_pci
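If you script your setup, append those modules idempotently so rerunning the script never duplicates lines. This sketch writes to ./modules so it is safe to try anywhere; on the host the target would be /etc/modules.

```shell
# Append the VFIO modules to a modules file, skipping lines already there.
# Writes to ./modules in this sketch; the real target is /etc/modules.
MODFILE="${MODFILE:-./modules}"
touch "$MODFILE"
for m in vfio vfio_iommu_type1 vfio_pci; do
  grep -qx "$m" "$MODFILE" || echo "$m" >> "$MODFILE"
done
```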

4. Blacklist Host GPU Drivers

Prevent the Proxmox host from claiming the GPU. Create /etc/modprobe.d/blacklist-gpu.conf:

blacklist nouveau
blacklist nvidia
blacklist nvidiafb
blacklist nvidia_drm

For AMD GPUs, blacklist radeon and amdgpu instead.

5. Bind the GPU to VFIO

Find your GPU’s PCI IDs:

lspci -nn | grep -i nvidia

You’ll see output like:

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation ... [10de:2204]
01:00.1 Audio device [0403]: NVIDIA Corporation ... [10de:1aef]

Note the vendor:device IDs in brackets. Create /etc/modprobe.d/vfio.conf:

options vfio-pci ids=10de:2204,10de:1aef disable_vga=1

Include both the GPU and its audio device — they share an IOMMU group and must be passed through together.
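The ids= line is easy to typo. The sketch below builds it from lspci output instead; it runs against a captured sample mirroring the output above so it can be tried anywhere, and on the host you would replace the sample with the live command shown in the comment.

```shell
# Build the vfio-pci options line from lspci -nn output. On the host,
# replace the sample with:  sample=$(lspci -nn | grep -i nvidia)
sample='01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA102 [10de:2204]
01:00.1 Audio device [0403]: NVIDIA Corporation GA102 HD Audio [10de:1aef]'

# vendor:device IDs are the only bracketed tokens containing a colon
ids=$(printf '%s\n' "$sample" \
  | grep -o '\[[0-9a-f]\{4\}:[0-9a-f]\{4\}\]' \
  | tr -d '[]' | paste -sd, -)
echo "options vfio-pci ids=$ids disable_vga=1"
```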

Update the initramfs and reboot:

update-initramfs -u -k all
reboot

6. Verify IOMMU and VFIO

After reboot, verify everything is working:

# Check IOMMU is enabled
dmesg | grep -e DMAR -e IOMMU

# Check VFIO has claimed the GPU
lspci -nnk -s 01:00 | grep "Kernel driver in use"
# Should show: Kernel driver in use: vfio-pci

If the kernel driver shows nvidia or nouveau instead of vfio-pci, the blacklisting or VFIO binding didn’t take effect. Double-check your config files and run update-initramfs -u -k all again.
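If you prefer to check without lspci, the driver symlink in sysfs tells the same story. A small sketch; the layout it reads (each PCI device directory holding a `driver` symlink) is standard Linux sysfs.

```shell
# driver_in_use: print the kernel driver bound to a PCI device, or "none".
# On the host: driver_in_use /sys/bus/pci/devices/0000:01:00.0
driver_in_use() {
  if [ -L "$1/driver" ]; then
    basename "$(readlink -f "$1/driver")"
  else
    echo "none"
  fi
}
```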

7. Create a VM with GPU Passthrough

In the Proxmox web UI:

  1. Create a new VM with your preferred OS (Ubuntu Server 22.04/24.04 for LLM workloads, or Windows 11 for gaming/rendering).
  2. Set Machine type to q35 and BIOS to OVMF (UEFI).
  3. Add an EFI disk.
  4. Under Hardware, click Add > PCI Device.
  5. Select your GPU from the list. Enable All Functions to include the audio device. Enable PCI-Express mode.
  6. If passing through to a Windows VM, also check Primary GPU if you want display output.

Important VM settings (edit /etc/pve/qemu-server/<vmid>.conf directly):

cpu: host
machine: q35

For Windows VMs, add these lines to hide the hypervisor from NVIDIA drivers:

args: -cpu host,kvm=off,hv_vendor_id=proxmox
cpu: host,hidden=1

8. Install GPU Drivers in the VM

Linux VM:

# Ubuntu/Debian
sudo apt update
sudo apt install -y nvidia-driver-550 nvidia-cuda-toolkit

# Verify
nvidia-smi

Windows VM:

Download the latest NVIDIA driver from nvidia.com and install normally. If you see Code 43 in Device Manager, revisit the hypervisor-hiding flags from the previous step.

Running Ollama in a GPU-Passthrough VM

Once your Linux VM has working NVIDIA drivers (verify with nvidia-smi), setting up Ollama for local LLM inference is straightforward:

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model and run it
ollama pull llama3.1:8b
ollama run llama3.1:8b "Explain GPU passthrough in one sentence"

Ollama will automatically detect and use the passed-through GPU. You can verify with:

ollama ps
# Should show the model loaded on GPU with VRAM usage

For llama.cpp users, the same principle applies — compile with CUDA support and point it at the GPU. Performance in a VFIO-passthrough VM is within 1–3% of bare metal because the VM has direct hardware access through the IOMMU, bypassing the hypervisor for data-plane operations.

This setup gives you the best of both worlds: Proxmox manages your containers and VMs (Pi-hole, Home Assistant, NAS services), while a dedicated VM gets full GPU access for AI workloads. If the LLM VM crashes, nothing else goes down.
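To expose that Ollama instance to other VMs on the network, bind it to all interfaces. OLLAMA_HOST is Ollama's documented environment variable for the bind address; the sketch writes the systemd drop-in to a local directory so it is safe to run as-is, while on the VM the path would be /etc/systemd/system/ollama.service.d/override.conf, followed by `systemctl daemon-reload && systemctl restart ollama`.

```shell
# Write a systemd drop-in that makes Ollama listen on all interfaces.
# This sketch writes locally; the real path is
# /etc/systemd/system/ollama.service.d/override.conf
DROPIN_DIR="${DROPIN_DIR:-./ollama.service.d}"
mkdir -p "$DROPIN_DIR"
cat > "$DROPIN_DIR/override.conf" <<'EOF'
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
EOF
```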

Troubleshooting Common Issues

IOMMU Groups Not Separating

If your GPU shares an IOMMU group with other devices, you have two options:

  1. Move the GPU to a different PCIe slot. Different slots often map to different IOMMU groups.
  2. Apply the ACS override patch. This kernel patch forces each device into its own group. It weakens device isolation, but for a home lab this is acceptable. Add pcie_acs_override=downstream,multifunction to your kernel parameters.

Check your groups with:

find /sys/kernel/iommu_groups/ -type l | sort -V

Code 43 in Windows VMs

NVIDIA consumer GeForce drivers detect hypervisors and refuse to load, showing Code 43 in Device Manager. The fix is hiding the hypervisor:

args: -cpu host,kvm=off,hv_vendor_id=proxmox
cpu: host,hidden=1

Also ensure the VM machine type is q35 and BIOS is OVMF. Legacy i440fx and SeaBIOS configurations trigger Code 43 more often.

GPU Reset Bug

Some GPUs (notably AMD Polaris/Navi and older NVIDIA cards) don’t reset properly when a VM shuts down. The GPU gets stuck in a bad state and can’t be re-assigned until you reboot the host.

Workarounds:

  • AMD: The vendor-reset kernel module (gnif/vendor-reset on GitHub) handles most Polaris, Vega, and Navi cards.
  • NVIDIA: Modern cards generally reset cleanly. For older cards that get stuck, removing the device from the PCI tree and rescanning (echo 1 > /sys/bus/pci/devices/<addr>/remove, then echo 1 > /sys/bus/pci/rescan) sometimes recovers it; otherwise a host reboot is the only reliable fix.
  • Prevention: Script the VM to always stop cleanly before shutdown and avoid force-killing GPU VMs.

Audio Passthrough

The GPU’s HDMI/DP audio device passes through automatically when you select “All Functions” during PCI device assignment. If audio is choppy or missing in the VM, ensure both PCI function 0 (GPU) and function 1 (audio) are assigned to the same VM.

For USB audio devices or separate sound cards, pass them through as additional PCI or USB devices.

Common Mistakes to Avoid

Not including the audio function. The GPU and its audio controller share an IOMMU group. If you pass through the GPU but not the audio device, the VM won’t start. Always pass through all functions.

Using i440fx instead of q35. The q35 machine type supports PCIe passthrough natively. The older i440fx emulates PCI, not PCIe, which causes compatibility problems with modern GPUs.

Forgetting to update initramfs. Every change to modprobe or VFIO configuration requires update-initramfs -u -k all to take effect. Without this, your changes sit in config files but don’t apply at boot.

Skipping UEFI/OVMF. GPU passthrough requires UEFI boot in the VM. SeaBIOS (legacy BIOS) doesn’t properly initialize modern GPUs. Always select OVMF when creating the VM.

Assigning too little RAM. GPU workloads, especially LLMs, need system RAM in addition to VRAM. The model weights load into VRAM, but the inference engine, OS, and surrounding services need system RAM. Allocate at least 16 GB to an LLM VM — 32 GB if you’re running models larger than 13B.
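As a rough sizing aid: at Q4 quantization, model weights take about half a gigabyte per billion parameters, plus a few gigabytes of overhead for the KV cache and runtime. That ratio is a rule of thumb, not a benchmark; the helper below just encodes the arithmetic.

```shell
# est_vram_gb: rough VRAM need in GB for a Q4-quantized model.
# Rule of thumb only: ~0.5 GB per billion parameters + ~3 GB overhead.
est_vram_gb() {
  echo $(( $1 / 2 + 3 ))
}

est_vram_gb 8    # 8B model  -> 7
est_vram_gb 32   # 32B model -> 19, within the RTX 3090's 24 GB
```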

What’s Next

With GPU passthrough running, you have a proper AI inference setup inside your Proxmox home lab. From here, you can:

  • Benchmark your GPU’s inference performance — see best GPU for local LLMs for reference numbers
  • Set up a second GPU for multi-GPU inference or pass one GPU to each of two VMs
  • Expose your Ollama instance as an API endpoint for other VMs and containers on your network
  • Compare your setup against purpose-built AI mini PCs — see best mini PCs for local AI

Frequently Asked Questions

Can I pass through my only GPU in Proxmox?
Yes, but you'll lose video output on the host. Proxmox runs fine headless — manage it through the web UI from another device. The GPU gets fully dedicated to the VM.
Does GPU passthrough work with AMD CPUs?
Yes. AMD CPUs with AMD-Vi (IOMMU) support passthrough just as well as Intel VT-d. AMD platforms often have better IOMMU grouping out of the box, making passthrough easier to configure.
Can I run Ollama with a passed-through GPU?
Yes. Once the GPU is passed through to a Linux VM and NVIDIA drivers are installed, Ollama detects the GPU automatically. Performance is within 1-3% of bare metal because VFIO gives the VM direct hardware access.
Why does my Windows VM show Code 43 for the NVIDIA GPU?
NVIDIA consumer drivers detect hypervisors and throw Code 43. Fix it by adding hidden state flags to your VM config: 'args: -cpu host,kvm=off' and setting the vendor_id in the CPU section. This hides the virtualization from the driver.
Do I need a separate GPU for the Proxmox host?
No. Proxmox is managed entirely through its web interface at https://your-ip:8006. Most home lab users run Proxmox headless and pass through their only GPU to a VM.
