Sponsored by vCluster

GPU-Enabled Platforms on Kubernetes

In Linux, user-space applications can't interact with hardware directly. Every interaction must go through the Linux kernel via system calls. GPUs break this model completely -- they bypass the kernel, manage their own memory, and resist every isolation mechanism containers rely on.

This ebook starts from first principles -- how containers actually work at the kernel level -- and builds up to why GPU multi-tenancy is fundamentally harder than anything else in Kubernetes.

By downloading, your email will be shared with vCluster, who may contact you about their products and services. You can unsubscribe from their communications at any time. See vCluster's privacy policy.

The ebook has six parts that take you from Linux kernel fundamentals to production GPU platform architecture:

  • Foundations -- How GPUs Meet Kubernetes: How containers work through syscalls, cgroups, and namespaces. Why GPUs break every assumption about resource isolation. How device plugins bridge GPUs into Kubernetes.
  • Why GPU Multi-Tenancy Is Hard: The trust problem -- when two teams share a GPU, one can crash the other's workloads. CUDA memory isn't isolated. There's no cgroup for GPU compute.
  • Orchestrating GPU Sharing: Time-slicing, MPS, and how Kubernetes manages turn-taking on GPU hardware. What happens when two pods try to use the same GPU simultaneously.
  • Hardware Isolation and Enforcement: MIG (Multi-Instance GPU), HAMi, and the trade-offs between software-level and hardware-level isolation. Why MIG profiles can't be changed without draining the node.
  • Monitoring GPU Clusters: Why nvidia-smi shows 87% utilisation when you're doing almost nothing. The metrics that actually matter: SM activity, tensor core utilisation, memory bandwidth.
  • Multi-Tenant GPU Platforms with vCluster: Architecting GPU infrastructure with virtual Kubernetes clusters for isolation and efficiency.

This ebook is for platform engineers building internal GPU platforms and infrastructure teams running AI/ML workloads on Kubernetes. If you need to give multiple teams access to expensive GPU hardware without them stepping on each other, this is for you.
