Kubernetes for GPU Workloads

Preparing Kubernetes for GPU Workloads

As of version 1.10, Kubernetes supports GPU-accelerated workloads. Packet’s line of GPU-accelerated machines makes an excellent substrate for containerized workloads that use GPUs within Kubernetes, using the native tooling available in your cluster.

However, unlike traditional containerized workloads, GPU workloads require some additional setup: driver and plugin support for the GPU devices must be added so that Docker can run GPU-accelerated containers and Kubernetes can schedule resources for them.


The NVIDIA container runtime provides a backend that Docker uses to run containers with GPU acceleration. Much like any other container runtime supported by Kubernetes, this backend can be made available to Kubernetes, so we will need to set it up first.

If your cluster contains a diverse group of hardware types, configuring your nodes requires a few prerequisites on the GPU-equipped nodes:

1. Docker and the nvidia-docker package need to be installed.
2. The NVIDIA container runtime may need to be set in your nodes’ Docker daemon configuration, depending on which nvidia-docker package is being used.
3. You can also make node preparation part of your worker spin-up routine.
4. Enable GPU support for your cluster by applying the NVIDIA device plugin DaemonSet to your cluster.

This ensures that the specialized runtime is used for GPU-targeted workloads on the nodes where those workloads are scheduled. With the NVIDIA device plugin added to your cluster, nodes with this specialized hardware advertise their GPUs to Kubernetes and can receive the pods that request them.
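Once the device plugin is running, a workload asks for a GPU through its resource limits. The following is a minimal sketch, assuming the NVIDIA device plugin exposes GPUs under the resource name nvidia.com/gpu. The Pod name and container image are hypothetical placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test                  # hypothetical name
spec:
  restartPolicy: Never
  containers:
  - name: cuda-app
    # Hypothetical image; substitute your own CUDA-enabled image.
    image: registry.example.com/cuda-app:latest
    resources:
      limits:
        nvidia.com/gpu: 1               # one whole GPU, requested under limits
```

GPUs are requested in whole units, and by default a GPU is not shared between containers, so a Pod like this stays Pending until a node with a free GPU is available.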

Scheduling your workloads

Kubernetes workload controllers like DaemonSets and Deployments let you place their Pods onto tainted nodes by setting a toleration in the Pod spec. A taint keeps Pods off a node unless they explicitly tolerate it, so taints and tolerations together shape the scheduling rules for workloads of a certain type.

In this case, the nodes with GPUs will have a taint indicating that they are GPU nodes:
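The taint itself is not reproduced here. As a rough illustration only, a taint on a Node object takes the shape sketched below. The node name and the taint key example.com/gpu are hypothetical stand-ins, not the key actually applied to Packet nodes.

```yaml
apiVersion: v1
kind: Node
metadata:
  name: gpu-worker-1              # hypothetical node name
spec:
  taints:
  - key: example.com/gpu          # hypothetical taint key
    value: "true"
    effect: NoSchedule            # pods without a matching toleration are not scheduled here
```

Running `kubectl describe node` against one of your GPU nodes shows the taints that are really present.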

You can then target a set of pods to GPU nodes by setting a toleration like:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: some-gpu-workload
  namespace: default
spec:
  template:
    spec:
      tolerations:
      - key:
        operator: Exists
        effect: NoSchedule

This means that the pods in this workload tolerate the taint and can be scheduled onto the GPU nodes, while pods that lack the toleration are kept off them. A DaemonSet traditionally runs on all nodes in a cluster, so a toleration like this one is what allows it to reach the tainted nodes at all. Note that a toleration permits scheduling onto tainted nodes but does not by itself restrict the pods to them; to confine a workload to the GPU subset, combine the toleration with a nodeSelector or node affinity that matches those nodes.
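Putting the pieces together, a complete manifest might look like the sketch below. It assumes a hypothetical taint key and a hypothetical node label, both named example.com/gpu, plus a placeholder image. The nodeSelector is included because a toleration only permits scheduling onto tainted nodes, and it is the selector that keeps the pods off nodes without the label.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: some-gpu-workload
  namespace: default
spec:
  selector:
    matchLabels:
      app: some-gpu-workload
  template:
    metadata:
      labels:
        app: some-gpu-workload          # must match spec.selector.matchLabels
    spec:
      nodeSelector:
        example.com/gpu: "true"         # hypothetical label; restricts pods to labelled GPU nodes
      tolerations:
      - key: example.com/gpu            # hypothetical taint key; use the key on your nodes
        operator: Exists                # tolerate the taint regardless of its value
        effect: NoSchedule
      containers:
      - name: gpu-agent
        # Hypothetical image; substitute your own workload.
        image: registry.example.com/gpu-agent:latest
```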

Alternatively, and more broadly as hardware diversity grows in your environment, you may also use a traditional nodeSelector in your spec to select GPUs of a specific type, by first labelling the nodes and then adding the selector to your Pod spec:

    nodeSelector:
      accelerator: nvidia-${model}

which identifies the accelerator GPU model that is appropriate for the workload in question.
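As a fuller sketch of this approach, the Pod below selects nodes by label and requests one GPU. It assumes the nodes were labelled beforehand, for example with `kubectl label nodes <node-name> accelerator=nvidia-tesla-v100`. The model value, Pod name, and image are illustrative assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: model-specific-gpu-pod          # hypothetical name
spec:
  restartPolicy: Never
  nodeSelector:
    accelerator: nvidia-tesla-v100      # assumed label value; must match the label set on your nodes
  containers:
  - name: cuda-app
    # Hypothetical image; substitute your own CUDA-enabled image.
    image: registry.example.com/cuda-app:latest
    resources:
      limits:
        nvidia.com/gpu: 1               # resource name exposed by the NVIDIA device plugin
```

If the GPU nodes are also tainted, the Pod still needs a matching toleration in addition to the nodeSelector.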

Additional Resources

If you use GPU-accelerated software now, or if you are new to the space and want to see how Kubernetes and containers can help, there are some common use cases for GPU-based workloads that fit right into a container and benefit from a tightly controlled orchestration platform for resource lifecycles such as compute cycles. These include data science and machine learning software, such as:

and all manner of high-performance parallel programming tasks, whether run as containerized services, as one-off workloads like Kubernetes Job resources, or as a backend for FaaS operations.
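For the one-off case, a Job that claims a GPU for the duration of a single run could be sketched as follows. The Job name and image are hypothetical, and the nvidia.com/gpu resource name assumes the NVIDIA device plugin is deployed.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-batch-task                  # hypothetical name
spec:
  backoffLimit: 2                       # retry a failed pod at most twice
  template:
    spec:
      restartPolicy: Never              # Jobs require Never or OnFailure
      containers:
      - name: trainer
        # Hypothetical image; substitute your own training or batch image.
        image: registry.example.com/trainer:latest
        resources:
          limits:
            nvidia.com/gpu: 1           # GPU is released when the pod finishes
```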
