Deploying GPU Workloads with Docker EE & Kubernetes

In my previous post I had demonstrated how easily we can setup Docker EE (Enterprise Edition) and all of it’s components including UCP (Universal Control Plane) and DTR (Docker Trusted Registry) on a single node. I had also outlined steps to deploy a sample application using Swarm orchestrator. Taking it further in this post I am going to provide you a walkthrough of how you can deploy GPU (Graphical Processing Unit) workloads on Docker EE using Kubernetes (K8s) as the orchestrator. K8s support is in experimental mode while GPU support for Swarm is still being worked upon. But before we get into details let me start by providing a quick perspective on GPUs.

If you are a geek most likely you have heard of GPUs. They have been around for a while. GPU, as you might know, was designed to perform computations needed for 3D graphics (for instance, interactive video games) an area where CPUs fell short. That’s how a computer’s motherboard started to have two separate chips – one CPU & other for GPU. Technically, GPU is a large array of small CPU processors performing highly parallelized computation.

Cool! But why are modern cloud computing platforms rushing to augment their compute services with GPUs (AWS, Azure, GCE)? Cloud is typically used for backend processing and has nothing to do with traditional displays. So what’s the rush about? The rush is to allow running computational workloads that have found sweet spot with GPUs – these include AI (Artificial Intelligence), Machine Learning, Deep Learning, and HPC (High Performance Computing) among others. These applications of GPU for non-display use cases is popularly referred to as GPGPU – General Purpose GPUs. Do note though, while it’s still possible to run these workloads with traditional CPUs but with GPUs their processing time is reduced from hours to minutes. Nvidia is the leading manufacturer of GPUs followed by AMD radeon. At the same time, cloud provides are coming up with their own native chips to cater to AI workloads.

So how can developers write programs to leverage GPUs? Well a lot of that depends on manufacturer and the tools they have built around it. For this post let us focus on Nvidia and CUDA (Compute Unified Device Architecture). CUDA is a parallel computing platform developed by Nvidia to allow developers use a CUDA enabled GPU (commonly referred as CUDA core). CUDA platform was designed to work with programming languages such as C & C++. As demand grew for deep learning workloads, Nvidia extended the platform with CUDA Deep Neural Network library (cuDNN) which provides a GPU-accelerated library of primitives for deep neural networks. cuDNN is used by popular deep learning frameworks including TensorFlow, MXNet, etc. to achieve GPU acceleration.

By this time you should have understood – to run GPU workloads you will need the manufacturer’s GPU driver / runtime, libraries and frameworks along with your project binaries. You can deploy all of these components on your machines or better leverage containerization and schedule / scale them using Docker and K8s. Coming to K8s the support for GPUs is in experimental mode. K8s implements Device Plugins to let Pods (Scheduling Unit of K8s) access specialized hardware features such as GPUs. Device Plugins are available for AMD and Nvidia but for this post I will continue to stick to Nvidia. Further for K8s you have two Nvidia device plugin implementations to choose from – one is official from Nvidia and other customized for Google Cloud. I will go with former and use a AWS Ubuntu 16.04 GPU based EC2 instance to deploy Nvidia Driver, Nvidia Docker runtime, K8s and then deploy a CUDA sample.

Let’s start by spinning up a p2.xlarge Ubuntu based Deep Learning AMI on AWS. This AMI has Nvidia CUDA and cuDNN preinstalled.

After logging you can execute ‘nvidia-smi’ command to get details of driver, CUDA version, GPU type, running processes and others. The output should be similar to below.

The first thing you need to with this AMI image is to install K8s. There are various options to install K8s but for this post we will be using Docker EE. You can use the same commands to install Docker EE on this node that I had outlined in my earlier post. Docker EE is bundled with upstream K8s (read here to get more details about Docker Kubernetes Service) hence that one node setup should suffice for our exercise. After installing Docker EE, to install Nvidia Docker container runtime you can follow these quick start instructions. As part those instructions don’t forget to enable K8s Device Plugin daemonset for Nvidia by executing below command (if you don’t have Kubectl installed you can follow the steps outlined here) .

kubectl create -f

That’s it! You are all set to deploy your first GPU workload to K8s. To create the Pod you can use the UCP UI or kubectl CLI. Here’s a sample yaml file requesting for 1 GPU to perform vector addition of 50000 elements:

apiVersion: v1
kind: Pod
  name: cuda-vector-add
  restartPolicy: OnFailure
    - name: cuda-vector-add
      image: ""
 1 # requesting 1 GPU

The pod should be scheduled on our single node cluster and run to completion. You can retrieve the logs via UCP UI or CLI as shown below. The logs should show you the job completion with success status.

I will extend our basic setup in future blog posts to demonstrate workloads with popular deep learning libraries. Meanwhile, do let me know if you have any questions or comments on this deployment workflow.

Until then, happy learning 🙂

Working with Docker APIs

Docker’s popularity is due to its simplicity. Most developers can cover quite a bit of ground just with build, push & run commands. But there could be times when you need to start looking beyond CLI commands. For instance, I was working with a customer who was trying to hook up their monitoring tool to Docker Engine to enhance productivity of their ops team. This hook is typically to pull status and stats of docker components and report / alert on them. The way monitoring tool would achieve this is by tapping into APIs provided by Docker Engine. Yes, Docker Engine does provide REST HTTP APIs which can be invoked by any language / runtime with HTTP support (that’s how Docker client works with Docker Engine). Docker even has SDKs for Go and Python. For this post I will be focus on how you can access these APIs using HTTP for both Linux and Windows hosts. Finally we will look at how to access APIs for Docker Enterprise deployments.

Accessing APIs with Linux Hosts

For Linux, Docker daemon listens by default on a socket – /var/run/docker.sock. So using our good old friend CURL here’s the command that you can execute locally to get the list of all containers (equivalent of docker ps)

curl --unix-socket /var/run/docker.sock -H "Content-Type: application/json" -X GET http:/containers/json

But how about accessing the API remotely? Well simple you need to expose a docker endpoint beyond socket using the -H flag for docker service. Link shows you how to configure the same. With that configuration in place you can fire CURL again but this time access the IP rather than the socket.

curl -k  -H "Content-Type: application/json" -X GET

So can just anyone access these APIs remotely? Is there an option to restrict or secure access to APIs? The answer is yes. Docker supports certificate based authentication wherein only authorized clients (possessing cert obtained from authorized CA) can communicate to Docker engine. This article covers required steps in detail to configure certs. Once configuration is done you can use below curl command to access Docker APIs.

curl -k --cert cert.pem --cacert ca.pem --key key.pem

Accessing APIs with Windows Hosts

Unlike Linux which uses sockets, Windows uses pipes. Once you install Docker and run the pipelist command you should see ‘docker_engine’ in the pipelist. You can access this pipe using custom C# code or use Docker.DotNet NuGet Package. Below is a sample code with Docker.DotNet library to retrieve list of containers using ‘docker_engine’ pipe.

DockerClient client = new DockerClientConfiguration(new Uri("npipe://./pipe/docker_engine")).CreateClient();

var containerList = client.Containers.ListContainersAsync(new ContainersListParameters());

foreach (var container in containerList.Result)

In addition, similar to Linux you can use the -H option to allow for TCP connections and also configure certs to secure connection. You can do this via config file, environment variables or just command line as shown below. You can refer to this article for securing Docker Engine with TLS on Windows. Once configured you can use Docker.DotNet to invoke Docker APIs.

dockerd -H npipe:// -H --tlsverify --tlscacert=C:\ProgramData\docker\daemoncerts\ca.pem --tlscert=C:\ProgramData\docker\daemoncerts\cert.pem --tlskey=C:\ProgramData\docker\daemoncerts\key.pem --register-service

Accessing APIs with Docker Enterprise Edition

So far we have invoked APIs for a standalone community docker engine but Docker has other enterprise offerings including Docker Enterprise Edition (EE). If you are not familiar with it check out my post to create a single node test environment. Universal Control Plane (UCP) part of Docker EE provides a set of APIs which you can interact with. Also unlike standalone engines, Docker UCP is secure by default. So to connect to UCP you would need to pass username password credentials or use certs from the client bundle both of which are shown below. With credentials you retrieve an AUTHTOKEN and pass that token for subsequent requests to gain access.

#Accessing UCP APIs via Creds
AUTHTOKEN=$(curl --insecure -s -X POST -d "{ \"username\":\"admin\",\"password\":\"Password123\" }" "" | awk -F ':' '{print    $2}' | tr -d '"{}')

curl -k -H  "content-type: application/json" -H "Authorization: Bearer $AUTHTOKEN" -X GET 

#Accessing UCP APIs via Client Bundle
curl -k --cert cert.pem --cacert ca.pem --key key.pem 

Hope this helps in extend your automation and monitoring tools to Docker!

Deploying Docker Enterprise on a single node

Most developers find installing Docker fairly straight forward. You download the EXE or DMG for your environment and go through an intuitive installation process. But what you have there is typically a Docker Desktop Community Edition (CE). Desktop editions are great for individual developers but Docker has lot more to offer to enterprises who want to run containers at scale. Docker Enterprise provides many capabilities to schedule and manage your container workloads and also offers a private docker image repository. The challenge though is while it’s fairly easy to install Docker Desktop it isn’t quite the same to install Docker Enterprise. Docker Enterprise invariably requires installing Docker Enterprise Engine, installing Universal Control Plane (UCP) with multiple manager and worker nodes, Docker Trusted Registry (DTR) – a private image repo (similar to public Docker Hub), and other components.

In this post, I am going to simplify the setup for you by creating a single node sandbox environment for Docker Enterprise. Hope you will find it useful for your POCs and test use cases. For setup all you will need is a single node VM. You can use any of the public cloud platforms to spin up that single node. For this post, I have setup a Red Hat Enterprise Linux (RHEL) m4.xlarge (4 vCPU & 16GB Mem) instance on AWS. Please note the key components of Docker EE including UCP & DTR only run on Linux platforms. You can install additional Windows nodes later as workers to run Windows workloads.

STEP 1: Install Docker Enterprise Engine
To begin you would have to sign up for free 1 month trial for Docker EE. Once you create docker hub account and fill out the contact details required for trial you should get the below screen.

Screen Shot 2019-07-08 at 3.57.35 PM

Copy the storebits URL and navigate to it in your browser. Here you will find packages for different OS versions. Now RDP into your EC2 instance. Add the relevant docker repo to your yum config manager. I will use RHEL since that’s the OS of my EC2 instance.

$ sudo -E yum-config-manager --add-repo "storebitsurl/rhel/docker-ee.repo"

Then install the docker EE engine, CLI & containerd packages.

$ sudo yum -y install docker-ee docker-ee-cli

In case you are aren’t using RHEL as your OS, here are the links to install Docker Enterprise Engine on Ubuntu, CentOS, SUSE Linux, and Oracle Linux.

STEP 2: Install UCP (Universal Control Plane)
UCP is the heart of Docker Enterprise. It allows for running containers at scale, across multiple nodes, let’s you choose an orchestrator, provides RBAC enabled web UI for operations, and much more. The UCP installer runs through ‘ucp’ container image installing ucp agent which in turn bootstraps services and containers required by UCP.

docker container run --rm -it \
-v /var/run/docker.sock:/var/run/docker.sock `#mount docker sock` \
docker/ucp install `#ucp image is stored on docker hub` \
--host-address `#private IP of your node` \
--san `#public IP of your node` \
--admin-password YourPassword `#set the default password` \
--force-minimums `#for cases where node configuration is less than ideal`

Upon completion you should be able to browse your public IP ( and access UCP UI shown below. To login use the ‘admin’ username and password ‘YourPassword’.

Screen Shot 2019-07-26 at 6.24.19 AM

STEP 3: Install DTR (Docker Trusted Registry)

DTR installation takes a similar route to UCP. It uses a container image ‘dtr’ with install command.

docker run -it --rm docker/dtr install \
--ucp-node yourNodeName `#try docker node ls to retrive name` \
--ucp-username admin \
--ucp-url \
--ucp-insecure-tls `#ignore cert errors for self signed UCP certs` \
--replica-http-port 8080 `#choose a port other than 80 to avoid conflict with UCP` \
--replica-https-port 8843 `#choose a port other than 443 to avoid conflict with UCP` \
--dtr-external-url `#this will add the external IP to DTR self-signed cert`

Once the command completes successfully you should be able to browse DTR UI on https port (

Screen Shot 2019-07-26 at 6.33.59 AM

STEP 4: Deploy an App

Let’s deploy an app to Docker EE. For this I would pull an ASP.NET sample image, create a org and repo in DTR, tag image for DTR repo and upload it. Post upload we can use docker stack to deploy the application. To begin let’s create an organization and repository in our DTR.

Screen Shot 2019-07-26 at 5.06.37 AM

Now let’s pull the image from Docker Hub, Tag it and Push it to DTR.

docker pull
docker tag
docker image push

Next create a docker-compose.yml file with below instructions.

version: "3"
 # replace username/repo:tag with your name and image details
 replicas: 1
 condition: on-failure
 cpus: "0.1"
 memory: 50M
 - "3776:80"

Once file is created run the docker stack deploy command to deploy listed services.

docker stack deploy -c docker-compose.yml aspnetapp

You should be able to browse app with node’s public IP on port 3776; in my case

Screen Shot 2019-07-26 at 6.38.43 AM.png

In future, we will dig into more Docker EE features but hope this single node setup provided you a jumpstart in exploring Docker Enterprise.

Happy Learning 🙂 !