GPU Passthrough for Proxmox: Complete Setup Guide 2026
GPU passthrough lets a virtual machine talk directly to a physical GPU with near-zero overhead. For home lab users running local LLMs, this means you can run Proxmox as your hypervisor, pass an NVIDIA GPU to a dedicated AI VM, and get the same tokens-per-second you’d see on bare metal — while still running other VMs and containers on the same box.
The setup isn’t hard, but it’s specific. One wrong kernel parameter and the GPU won’t unbind. One missing VFIO module and the VM won’t start. This guide walks through every step, from BIOS settings to running your first Ollama prompt in a GPU-accelerated VM.
If you’re still choosing a GPU, read best GPU for local LLMs first. If you’re choosing hardware to run Proxmox on, see best mini PCs for Proxmox.
Hardware Requirements
GPU passthrough depends on specific hardware features. Not every motherboard, CPU, or GPU combination works cleanly. Here’s what you need.
CPU: IOMMU Support Is Non-Negotiable
Your CPU must support IOMMU — the technology that lets the hypervisor map physical devices to virtual machines safely.
- Intel: VT-d (Virtualization Technology for Directed I/O). Present on most Core i5/i7/i9 from 6th gen onward, and all Xeon processors.
- AMD: AMD-Vi. Present on all Ryzen and EPYC processors.
Check your CPU’s spec sheet on Intel Ark or AMD’s product page. If IOMMU isn’t listed, passthrough won’t work — no software workaround exists.
Motherboard: IOMMU Grouping Matters
The motherboard’s chipset determines how devices fall into IOMMU groups. Each IOMMU group is a unit — you must pass through an entire group to a VM. If your GPU shares a group with your SATA controller, you’d have to pass through both.
Good IOMMU grouping means each PCIe slot gets its own group. Server and workstation boards generally do this well. Consumer boards vary. AMD X570/X670/B650 boards tend to have better grouping than equivalent Intel consumer boards.
If your board has bad grouping, the ACS override patch can split groups apart, but it reduces security isolation. Try to avoid needing it.
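To see exactly how your board groups devices, a short script (run on the Proxmox host) walks every group. The optional first argument overriding the sysfs path is just there so you can dry-run it against a test directory:

```shell
# Print every IOMMU group and the PCI addresses it contains.
# $1 can override the sysfs location (useful for testing); defaults to the real path.
base="${1:-/sys/kernel/iommu_groups}"
for g in "$base"/*; do
  [ -d "$g/devices" ] || continue
  echo "IOMMU group ${g##*/}:"
  for d in "$g"/devices/*; do
    [ -e "$d" ] || continue
    # Bare PCI address; pipe through 'lspci -nns <addr>' for a readable name.
    echo "  ${d##*/}"
  done
done
```

A GPU that shares a group with, say, a SATA controller will show both addresses under the same group number.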
Recommended GPUs
Not every GPU passes through cleanly. These are proven choices for Proxmox passthrough in home labs:
NVIDIA RTX 3090 (Used) — ~$1,730. 24 GB VRAM handles models up to 32B parameters at Q4 quantization. Well-documented passthrough behavior. The go-to card for local LLM workloads with maximum VRAM. See how much VRAM you need for LLMs to size your GPU correctly.
NVIDIA RTX 4060 Ti 16GB — ~$450. 16 GB VRAM in a 165W TDP. Fits in compact builds where power and thermals are constrained. Good for 8B–13B models.
NVIDIA Tesla P40 — ~$400 used. 24 GB VRAM, no video output (compute-only), runs cool in server chassis. Requires a blower or custom fan mount in desktop cases. Strong value if you only need headless inference.
For a full comparison, see best GPU for home server inference.
Recommended Host Hardware
Minisforum MS-01 — ~$600 (barebones). Intel Core i9-13900H, a low-profile PCIe x16 slot (x8 electrical), dual 10GbE. One of the few mini PCs that supports full GPU passthrough. Compact and power-efficient. See best mini PCs for local AI for more options.
Custom tower build. An LGA 1700 or AM5 board with a decent IOMMU layout, 64 GB RAM, and a proper x16 PCIe slot. More flexible than a mini PC, better airflow for high-TDP GPUs like the RTX 3090.
Step-by-Step Setup
This guide assumes Proxmox VE 8.x on a system with an NVIDIA GPU. The process is nearly identical for AMD GPUs — just swap the driver names.
1. Enable IOMMU in BIOS
Reboot into your BIOS/UEFI settings and enable:
- Intel: VT-d (usually under CPU or Advanced settings)
- AMD: AMD-Vi / IOMMU (usually under Advanced or NBIO settings)
Also ensure SR-IOV is enabled if available, and that Above 4G Decoding is turned on — some GPUs need it for proper BAR mapping.
2. Configure Kernel Parameters
Proxmox uses GRUB on most installs, and systemd-boot on UEFI systems installed with a ZFS root. You need to add IOMMU parameters to whichever bootloader you’re using.
For GRUB (most common):
Edit /etc/default/grub and modify the GRUB_CMDLINE_LINUX_DEFAULT line:
# Intel CPU:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
# AMD CPU:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"
Then update GRUB:
update-grub
For systemd-boot:
Edit /etc/kernel/cmdline, which holds a single line of kernel parameters, and append the IOMMU flags to the end:
# Intel CPU:
root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on iommu=pt
# AMD CPU:
root=ZFS=rpool/ROOT/pve-1 boot=zfs amd_iommu=on iommu=pt
Then refresh boot entries:
proxmox-boot-tool refresh
The iommu=pt flag enables passthrough mode, which improves performance for devices that aren’t passed through by skipping IOMMU translation for them.
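After the reboot that comes later in the process, it’s worth confirming the flags actually reached the running kernel:

```shell
# The live kernel command line shows exactly what the bootloader passed in.
cat /proc/cmdline    # should include intel_iommu=on (or amd_iommu=on) and iommu=pt
```

If the flags are missing here, the bootloader config you edited isn’t the one your system boots from.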
3. Load VFIO Modules
Add the VFIO kernel modules so they load at boot. Edit /etc/modules and add:
vfio
vfio_iommu_type1
vfio_pci
4. Blacklist Host GPU Drivers
Prevent the Proxmox host from claiming the GPU. Create /etc/modprobe.d/blacklist-gpu.conf:
blacklist nouveau
blacklist nvidia
blacklist nvidiafb
blacklist nvidia_drm
For AMD GPUs, blacklist radeon and amdgpu instead.
5. Bind the GPU to VFIO
Find your GPU’s PCI IDs:
lspci -nn | grep -i nvidia
You’ll see output like:
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation ... [10de:2204]
01:00.1 Audio device [0403]: NVIDIA Corporation ... [10de:1aef]
Note the vendor:device IDs in brackets. Create /etc/modprobe.d/vfio.conf:
options vfio-pci ids=10de:2204,10de:1aef disable_vga=1
Include both the GPU and its audio device — they share an IOMMU group and must be passed through together.
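One common race: the host’s nouveau or NVIDIA driver can still claim the card before vfio-pci loads, even with the ids= option set. A softdep declaration in the same /etc/modprobe.d/vfio.conf forces the ordering (driver names here assume an NVIDIA card; use amdgpu and radeon for AMD):

```
softdep nouveau pre: vfio-pci
softdep nvidia pre: vfio-pci
```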
Update the initramfs and reboot:
update-initramfs -u -k all
reboot
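As a side note, the ids= string can be generated mechanically from lspci output. A sketch (the function name make_vfio_ids is made up; assumes standard grep, sed, and paste):

```shell
# make_vfio_ids: turn 'lspci -nn' output into a vfio-pci ids= string.
# Reads lspci text on stdin; keeps NVIDIA lines, extracts the [vendor:device] pairs.
make_vfio_ids() {
  grep -i 'nvidia' \
    | sed -n 's/.*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\].*/\1/p' \
    | paste -sd, -
}
# Example: lspci -nn | make_vfio_ids
```

Piping the two sample lines from above through it would print 10de:2204,10de:1aef, ready to paste after ids=.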
6. Verify IOMMU and VFIO
After reboot, verify everything is working:
# Check IOMMU is enabled
dmesg | grep -e DMAR -e IOMMU
# Check VFIO has claimed the GPU
lspci -nnk -s 01:00 | grep "Kernel driver in use"
# Should show: Kernel driver in use: vfio-pci
If the kernel driver shows nvidia or nouveau instead of vfio-pci, the blacklisting or VFIO binding didn’t take effect. Double-check your config files and run update-initramfs -u -k all again.
7. Create a VM with GPU Passthrough
In the Proxmox web UI:
- Create a new VM with your preferred OS (Ubuntu Server 22.04/24.04 for LLM workloads, or Windows 11 for gaming/rendering).
- Set Machine type to q35 and BIOS to OVMF (UEFI).
- Add an EFI disk.
- Under Hardware, click Add > PCI Device.
- Select your GPU from the list. Enable All Functions to include the audio device. Enable PCI-Express mode.
- If passing through to a Windows VM, also check Primary GPU if you want display output.
Important VM settings (edit /etc/pve/qemu-server/<vmid>.conf directly):
cpu: host
machine: q35
For Windows VMs, add these lines to hide the hypervisor from NVIDIA drivers:
args: -cpu host,kvm=off,hv_vendor_id=proxmox
cpu: host,hidden=1
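Putting the pieces together, a hypothetical excerpt from /etc/pve/qemu-server/&lt;vmid&gt;.conf for a Linux LLM VM might look like this (the PCI address and memory size are placeholders for your own values):

```
bios: ovmf
machine: q35
cpu: host
memory: 32768
hostpci0: 0000:01:00,pcie=1
```

The hostpci0 line with pcie=1 corresponds to the PCI-Express checkbox in the UI; specifying the address without a function suffix (no .0) passes all functions, matching the All Functions checkbox.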
8. Install GPU Drivers in the VM
Linux VM:
# Ubuntu/Debian
sudo apt update
sudo apt install -y nvidia-driver-550 nvidia-cuda-toolkit
# Verify
nvidia-smi
Windows VM:
Download the latest NVIDIA driver from nvidia.com and install normally. If you see Code 43 in Device Manager, revisit the hypervisor-hiding flags from the previous step.
Running Ollama in a GPU-Passthrough VM
Once your Linux VM has working NVIDIA drivers (verify with nvidia-smi), setting up Ollama for local LLM inference is straightforward:
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Pull a model and run it
ollama pull llama3.1:8b
ollama run llama3.1:8b "Explain GPU passthrough in one sentence"
Ollama will automatically detect and use the passed-through GPU. You can verify with:
ollama ps
# Should show the model loaded on GPU with VRAM usage
For llama.cpp users, the same principle applies — compile with CUDA support and point it at the GPU. Performance in a VFIO-passthrough VM is within 1–3% of bare metal because the VM has direct hardware access through the IOMMU, bypassing the hypervisor for data-plane operations.
This setup gives you the best of both worlds: Proxmox manages your containers and VMs (Pi-hole, Home Assistant, NAS services), while a dedicated VM gets full GPU access for AI workloads. If the LLM VM crashes, nothing else goes down.
Troubleshooting Common Issues
IOMMU Groups Not Separating
If your GPU shares an IOMMU group with other devices, you have two options:
- Move the GPU to a different PCIe slot. Different slots often map to different IOMMU groups.
- Apply the ACS override patch. This kernel patch forces each device into its own group. It weakens device isolation, but for a home lab this is acceptable. Add pcie_acs_override=downstream,multifunction to your kernel parameters.
Check your groups with:
find /sys/kernel/iommu_groups/ -type l | sort -V
Code 43 in Windows VMs
NVIDIA consumer GeForce drivers detect hypervisors and refuse to load, showing Error 43 in Device Manager. The fix is hiding the hypervisor:
args: -cpu host,kvm=off,hv_vendor_id=proxmox
cpu: host,hidden=1
Also ensure the VM machine type is q35 and BIOS is OVMF. Legacy i440fx and SeaBIOS configurations trigger Code 43 more often.
GPU Reset Bug
Some GPUs (notably AMD Polaris/Navi and older NVIDIA cards) don’t reset properly when a VM shuts down. The GPU gets stuck in a bad state and can’t be re-assigned until you reboot the host.
Workarounds:
- AMD: The vendor-reset kernel module handles most Polaris and Navi cards. Install it from its GitHub repository.
- NVIDIA: Affected cards are rarer. Removing the device and rescanning the PCI bus (echo 1 > /sys/bus/pci/devices/&lt;address&gt;/remove, then echo 1 > /sys/bus/pci/rescan) recovers some cases without a full host reboot.
- Nuclear option: Script the VM to always stop cleanly before shutdown and avoid force-killing GPU VMs.
Audio Passthrough
The GPU’s HDMI/DP audio device passes through automatically when you select “All Functions” during PCI device assignment. If audio is choppy or missing in the VM, ensure both PCI function 0 (GPU) and function 1 (audio) are assigned to the same VM.
For USB audio devices or separate sound cards, pass them through as additional PCI or USB devices.
Common Mistakes to Avoid
Not including the audio function. The GPU and its audio controller share an IOMMU group. If you pass through the GPU but not the audio device, the VM won’t start. Always pass through all functions.
Using i440fx instead of q35. The q35 machine type supports PCIe passthrough natively. The older i440fx emulates PCI, not PCIe, which causes compatibility problems with modern GPUs.
Forgetting to update initramfs. Every change to modprobe or VFIO configuration requires update-initramfs -u -k all to take effect. Without this, your changes sit in config files but don’t apply at boot.
Skipping UEFI/OVMF. GPU passthrough requires UEFI boot in the VM. SeaBIOS (legacy BIOS) doesn’t properly initialize modern GPUs. Always select OVMF when creating the VM.
Assigning too little RAM. GPU workloads, especially LLMs, need system RAM in addition to VRAM. The model weights load into VRAM, but the inference engine, OS, and surrounding services need system RAM. Allocate at least 16 GB to an LLM VM — 32 GB if you’re running models larger than 13B.
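As a back-of-envelope check (a rule of thumb, not a measurement): a Q4-quantized model needs roughly half a byte of VRAM per parameter, plus around 20% overhead for the KV cache and runtime buffers:

```shell
# Rough VRAM estimate for a Q4-quantized model (rule of thumb, not exact):
# ~0.5 bytes per parameter, plus ~20% for KV cache and runtime buffers.
awk -v p=13 'BEGIN { print p * 0.5 * 1.2 }'   # 13B parameters: prints 7.8 (GB)
```

By this estimate a 13B model fits comfortably in 16 GB of VRAM, while a 32B model (about 19 GB) is near the limit of a 24 GB card, consistent with the RTX 3090 sizing above.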
What’s Next
With GPU passthrough running, you have a proper AI inference setup inside your Proxmox home lab. From here, you can:
- Benchmark your GPU’s inference performance — see best GPU for local LLMs for reference numbers
- Set up a second GPU for multi-GPU inference or pass one GPU to each of two VMs
- Expose your Ollama instance as an API endpoint for other VMs and containers on your network
- Compare your setup against purpose-built AI mini PCs — see best mini PCs for local AI
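Exposing Ollama as an API endpoint can be as simple as making the daemon listen on all interfaces and calling it over HTTP. A sketch (the IP 192.168.1.50 and the model name are placeholders; OLLAMA_HOST is the variable Ollama reads for its bind address):

```
# On the LLM VM: make Ollama listen on all interfaces (it binds to localhost by default).
# 'systemctl edit ollama' opens an override file; add the Environment line under [Service].
sudo systemctl edit ollama        # add: Environment="OLLAMA_HOST=0.0.0.0:11434"
sudo systemctl restart ollama

# From any other machine on the LAN:
curl -s http://192.168.1.50:11434/api/generate \
  -d '{"model": "llama3.1:8b", "prompt": "hello", "stream": false}'
```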
Frequently Asked Questions
Can I pass through my only GPU in Proxmox?
Yes. The host loses local console output once vfio-pci claims the card, so manage Proxmox through the web UI or SSH. Everything in this guide works with a single GPU.
Does GPU passthrough work with AMD CPUs?
Yes. All Ryzen and EPYC processors support AMD-Vi. Use amd_iommu=on instead of intel_iommu=on in your kernel parameters.
Can I run Ollama with a passed-through GPU?
Yes. Once the VM’s NVIDIA driver works (nvidia-smi succeeds), Ollama detects the GPU automatically, and inference performance is within a few percent of bare metal.
Why does my Windows VM show Code 43 for the NVIDIA GPU?
The GeForce driver has detected the hypervisor and refused to load. Hide it with cpu: host,hidden=1 and the kvm=off args line, and make sure the VM uses q35 and OVMF.
Do I need a separate GPU for the Proxmox host?
No. Proxmox runs fine headless. A second GPU or the CPU’s integrated graphics is only useful if you want a local console on the host.