Overview of Calico CNI for Kubernetes

One of the key reasons Kubernetes (K8s) is so popular is its single-responsibility design. It largely confines itself to one specific job: scheduling and running container workloads. For the rest, it relies on container ecosystem vendors and open specifications to fill the gaps. Take, for instance, CRI (Container Runtime Interface) – a plugin interface that enables K8s to work with a wide variety of container runtimes, including Docker. Another such area is networking. After you schedule workloads onto specific nodes, those workloads invariably need IP addresses and traffic routing between them, and at times you even need to block traffic for security reasons. K8s could have baked a networking solution into the scheduling engine, but it didn't. Enter CNI (Container Network Interface) – an open specification for container networking plugins that has been adopted by many projects, including K8s. Over a dozen vendors have implemented the CNI specification, and Calico, an open source project from Tigera, is one of them.

Once you install K8s, the first thing you are typically required to do is install a CNI plugin. For instance, if you are using kubeadm, the official documentation states –

The network must be deployed before any applications. Also, CoreDNS will not start up before a network is installed. kubeadm only supports Container Network Interface (CNI) based networks…

To install Calico you can apply a YAML manifest which creates the calico-node DaemonSet (a pod that runs on every node of the K8s cluster). Though if you really want to understand Calico's installation underpinnings, it may be worth doing it the hard way.

kubectl apply -f https://docs.projectcalico.org/v3.8/manifests/calico.yaml

On the other hand, if you are using a managed K8s platform like Docker EE, Calico is set up for you by default. Regardless of how you install Calico, you should also install calicoctl – the command-line tool for interacting with Calico resources. Three key components make Calico work: Felix, BIRD, and etcd. Felix is the primary Calico agent that runs on each machine hosting endpoints, programming routes and ensuring only valid traffic is sent across endpoints. BIRD is a BGP client that distributes the routes created by Felix across the datacenter. Calico uses etcd as a distributed data store to ensure network consistency. Once configured, Calico provides a seamless networking experience. Customizations are typically around IPAM (IP Address Management) and network security requirements. Let's take a quick look at them, starting with IPAM.

By default Calico uses the 192.168.0.0/16 IP pool to assign IP addresses to pods (sudo ./calicoctl get ippools -o wide). Calico allows you to create multiple IP pools and to request an IP from a given pool as part of pod deployment. To request an IP from a given pool, add a K8s annotation to your pod YAML file as shown below. You can also refer to this link for detailed instructions.

annotations:
  "cni.projectcalico.org/ipv4pools": "[\"custompool-ipv4-ippool\"]"
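For that annotation to resolve, the named pool must already exist. Here is a sketch of what a custom IPPool manifest might look like (the CIDR and option values below are examples, not requirements; create it with calicoctl apply -f):

```yaml
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: custompool-ipv4-ippool
spec:
  cidr: 172.16.0.0/24    # example CIDR - must not overlap existing pools
  blockSize: 26          # size of per-node allocation blocks carved from the pool
  ipipMode: Always       # IP-in-IP encapsulation; Never if plain BGP routing works
  natOutgoing: true      # SNAT pod traffic leaving the pool to non-Calico destinations
```

After applying it, `calicoctl get ippools -o wide` should list the new pool alongside the default one.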

A network policy is a specification of how groups of pods are allowed to communicate with each other and with other network endpoints. NetworkPolicy is a standard K8s resource that uses labels to select pods and defines rules specifying what traffic is allowed to the selected pods. Calico extends this further, allowing for policy ordering/priority, deny rules, and more flexible match rules. Here's a simple Calico tutorial on how to apply a NetworkPolicy.
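To illustrate those extensions, here is a sketch of a Calico (projectcalico.org/v3) NetworkPolicy combining an explicit order with an explicit deny rule (the labels, namespace, and port are hypothetical):

```yaml
apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: backend-ingress
  namespace: default
spec:
  order: 100                    # lower order = evaluated earlier (standard K8s NetworkPolicy has no ordering)
  selector: app == 'backend'    # pods this policy applies to
  types:
    - Ingress
  ingress:
    - action: Allow             # let frontend pods reach the backend port
      protocol: TCP
      source:
        selector: app == 'frontend'
      destination:
        ports:
          - 8080
    - action: Deny              # explicit deny - not expressible in the standard K8s resource
```

Note this is the Calico-specific resource managed via calicoctl, not the core K8s NetworkPolicy object.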

A couple of things before we wrap up. Most public cloud providers have a native CNI plugin, but using Calico gives you the desired portability (the same reason for using the Docker EE platform). Also, most cloud providers don't allow BGP (excluding VPN gateways & VNET peering), so it's quite possible that you may use the cloud-native plugin for routing & IPAM and leverage the Calico plugin for its extended network policy features (also note Calico does support IPinIP and VXLAN to overcome BGP limitations). You can examine the CNI plugin configuration of your cluster by looking at /etc/cni/net.d/10-calico.conflist.
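On a Calico cluster, that file looks roughly like the abridged sketch below (exact values vary by version and install method). The plugins array is what enables chaining – the calico plugin handles routing, IPAM, and policy, with helpers like portmap appended after it:

```json
{
  "name": "k8s-pod-network",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "calico",
      "ipam": { "type": "calico-ipam" },
      "policy": { "type": "k8s" }
    },
    { "type": "portmap", "capabilities": { "portMappings": true } }
  ]
}
```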

Hope that gave you a fair idea of where CNI & Calico fit into the K8s ecosystem. Please let me know your thoughts in the comments section below.

Deploying GPU Workloads with Docker EE & Kubernetes

In my previous post I demonstrated how easily we can set up Docker EE (Enterprise Edition) and all of its components, including UCP (Universal Control Plane) and DTR (Docker Trusted Registry), on a single node. I also outlined the steps to deploy a sample application using the Swarm orchestrator. Taking it further, in this post I will walk you through deploying GPU (Graphics Processing Unit) workloads on Docker EE using Kubernetes (K8s) as the orchestrator. K8s GPU support is in experimental mode, while GPU support for Swarm is still being worked on. But before we get into the details, let me start with a quick perspective on GPUs.

If you are a geek, most likely you have heard of GPUs. They have been around for a while. The GPU, as you might know, was designed to perform the computations needed for 3D graphics (for instance, interactive video games) – an area where CPUs fell short. That's how a computer's motherboard came to have two separate chips – one for the CPU and another for the GPU. Technically, a GPU is a large array of small processors performing highly parallelized computation.

Cool! But why are modern cloud computing platforms rushing to augment their compute services with GPUs (AWS, Azure, GCE)? The cloud is typically used for backend processing and has nothing to do with traditional displays. So what's the rush about? It is to support computational workloads that have found a sweet spot with GPUs – these include AI (Artificial Intelligence), machine learning, deep learning, and HPC (High Performance Computing), among others. This application of GPUs to non-display use cases is popularly referred to as GPGPU – General-Purpose computing on GPUs. Do note, though, that while it's still possible to run these workloads on traditional CPUs, GPUs can reduce their processing time from hours to minutes. Nvidia is the leading manufacturer of GPUs, followed by AMD (Radeon). At the same time, cloud providers are coming up with their own native chips to cater to AI workloads.

So how can developers write programs that leverage GPUs? Well, a lot of that depends on the manufacturer and the tools built around its hardware. For this post let us focus on Nvidia and CUDA (Compute Unified Device Architecture). CUDA is a parallel computing platform developed by Nvidia that allows developers to use a CUDA-enabled GPU (whose individual processing units are commonly referred to as CUDA cores). The CUDA platform was designed to work with programming languages such as C & C++. As demand grew for deep learning workloads, Nvidia extended the platform with the CUDA Deep Neural Network library (cuDNN), which provides GPU-accelerated primitives for deep neural networks. cuDNN is used by popular deep learning frameworks, including TensorFlow and MXNet, to achieve GPU acceleration.
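To make that concrete, here is a minimal CUDA C sketch (illustrative only – error handling omitted, and it assumes a CUDA-capable GPU plus the nvcc toolchain). Each GPU thread adds one pair of vector elements, which is exactly the massively parallel pattern GPUs excel at:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each GPU thread handles one element - the "large array of small
// processors" all doing the same operation in parallel.
__global__ void vectorAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 50000;          // 50,000 elements, like the vector-add sample used later
    size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    cudaMallocManaged(&a, bytes); // unified memory, visible to both CPU and GPU
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;   // enough blocks to cover all n elements
    vectorAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);  // expect 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

Compile with `nvcc vector_add.cu -o vector_add`; the containerized sample deployed later packages essentially this kind of program along with the CUDA runtime.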

By now it should be clear – to run GPU workloads you will need the manufacturer's GPU driver / runtime, libraries, and frameworks along with your project binaries. You can deploy all of these components directly on your machines, or better, leverage containerization and schedule / scale them using Docker and K8s. On the K8s side, GPU support is in experimental mode. K8s implements Device Plugins to let Pods (the scheduling unit of K8s) access specialized hardware features such as GPUs. Device Plugins are available for AMD and Nvidia, but for this post I will stick to Nvidia. Further, for K8s you have two Nvidia device plugin implementations to choose from – one official from Nvidia, the other customized for Google Cloud. I will go with the former and use an AWS Ubuntu 16.04 GPU-based EC2 instance to deploy the Nvidia driver, the Nvidia Docker runtime, and K8s, and then deploy a CUDA sample.

Let’s start by spinning up a p2.xlarge Ubuntu based Deep Learning AMI on AWS. This AMI has Nvidia CUDA and cuDNN preinstalled.

After logging in, you can execute the ‘nvidia-smi’ command to get details of the driver, CUDA version, GPU type, running processes, and more.

The first thing you need to do with this AMI is install K8s. There are various options to install K8s, but for this post we will use Docker EE. You can use the same commands to install Docker EE on this node that I outlined in my earlier post. Docker EE is bundled with upstream K8s (read here for more details about Docker Kubernetes Service), hence that one-node setup should suffice for our exercise. After installing Docker EE, to install the Nvidia Docker container runtime you can follow these quick start instructions. As part of those instructions, don't forget to enable the K8s device plugin DaemonSet for Nvidia by executing the command below (if you don't have kubectl installed, you can follow the steps outlined here).

kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta/nvidia-device-plugin.yml

That's it! You are all set to deploy your first GPU workload to K8s. To create the Pod you can use the UCP UI or the kubectl CLI. Here's a sample YAML file requesting 1 GPU to perform vector addition of 50000 elements:

apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      # https://github.com/kubernetes/kubernetes/blob/v1.7.11/test/images/nvidia-cuda/Dockerfile
      image: "k8s.gcr.io/cuda-vector-add:v0.1"
      resources:
        limits:
          nvidia.com/gpu: 1 # requesting 1 GPU

The pod should be scheduled on our single-node cluster and run to completion. You can retrieve the logs via the UCP UI or the CLI. The logs should show the job completing with a success status.

I will extend our basic setup in future blog posts to demonstrate workloads with popular deep learning libraries. Meanwhile, do let me know if you have any questions or comments on this deployment workflow.

Until then, happy learning 🙂

Working with Docker APIs

Docker's popularity is due to its simplicity. Most developers can cover quite a bit of ground with just the build, push & run commands. But there may be times when you need to look beyond the CLI. For instance, I was working with a customer who was trying to hook their monitoring tool up to the Docker Engine to enhance the productivity of their ops team. Such a hook typically pulls status and stats of Docker components and reports / alerts on them. The monitoring tool achieves this by tapping into the APIs provided by Docker Engine. Yes, Docker Engine does provide a REST HTTP API which can be invoked from any language / runtime with HTTP support (that's how the Docker client talks to the Docker Engine). Docker even has SDKs for Go and Python. In this post I will focus on how you can access these APIs over HTTP for both Linux and Windows hosts. Finally, we will look at how to access the APIs for Docker Enterprise deployments.

Accessing APIs with Linux Hosts

For Linux, the Docker daemon listens by default on a Unix socket – /var/run/docker.sock. So, using our good old friend curl, here's the command you can execute locally to get the list of all running containers (the equivalent of docker ps):

curl --unix-socket /var/run/docker.sock -H "Content-Type: application/json" -X GET http://localhost/containers/json

But how about accessing the API remotely? Simple – you need to expose a Docker endpoint beyond the socket, using the -H flag of the Docker service. This link shows how to configure it. With that configuration in place you can fire curl again, but this time against the IP rather than the socket.

curl -k  -H "Content-Type: application/json" -X GET http://172.31.6.154:2376/containers/json
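One common way to enable that remote endpoint is via /etc/docker/daemon.json (a sketch – paths and options vary by distro; on systemd-based systems, hosts set here conflict with a -H flag already present in the service unit, and an unsecured TCP port like this should only ever be exposed in a lab):

```json
{
  "hosts": ["unix:///var/run/docker.sock", "tcp://0.0.0.0:2376"]
}
```

Restart the Docker service after the change for it to take effect.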

So can just anyone access these APIs remotely? Is there an option to restrict or secure access? The answer is yes. Docker supports certificate-based authentication wherein only authorized clients (possessing a cert obtained from an authorized CA) can communicate with the Docker engine. This article covers the steps required to configure the certs in detail. Once configuration is done, you can use the curl command below to access the Docker APIs.

curl -k --cert cert.pem --cacert ca.pem --key key.pem https://3.14.248.232/containers/json

Accessing APIs with Windows Hosts

Unlike Linux, which uses sockets, Windows uses named pipes. Once you install Docker and run the pipelist command you should see ‘docker_engine’ in the pipe list. You can access this pipe using custom C# code or the Docker.DotNet NuGet package. Below is a sample using the Docker.DotNet library to retrieve the list of containers over the ‘docker_engine’ pipe.

using System;
using Docker.DotNet;
using Docker.DotNet.Models;

DockerClient client = new DockerClientConfiguration(new Uri("npipe://./pipe/docker_engine")).CreateClient();

var containerList = client.Containers.ListContainersAsync(new ContainersListParameters());

foreach (var container in containerList.Result)
{...}

In addition, similar to Linux, you can use the -H option to allow TCP connections and configure certs to secure the connection. You can do this via the config file, environment variables, or just the command line as shown below. You can refer to this article on securing the Docker Engine with TLS on Windows. Once configured, you can use Docker.DotNet to invoke the Docker APIs.

dockerd -H npipe:// -H 0.0.0.0:2376 --tlsverify --tlscacert=C:\ProgramData\docker\daemoncerts\ca.pem --tlscert=C:\ProgramData\docker\daemoncerts\cert.pem --tlskey=C:\ProgramData\docker\daemoncerts\key.pem --register-service

Accessing APIs with Docker Enterprise Edition

So far we have invoked APIs on a standalone Community docker engine, but Docker has other offerings, including Docker Enterprise Edition (EE). If you are not familiar with it, check out my post on creating a single-node test environment. Universal Control Plane (UCP), part of Docker EE, provides a set of APIs you can interact with. Also, unlike standalone engines, Docker UCP is secure by default. So to connect to UCP you need to either pass username/password credentials or use the certs from the client bundle, both of which are shown below. With credentials, you retrieve an AUTHTOKEN and pass that token with subsequent requests to gain access.

#Accessing UCP APIs via Creds
AUTHTOKEN=$(curl --insecure -s -X POST -d "{ \"username\":\"admin\",\"password\":\"Password123\" }" "https://3.14.248.232/auth/login" | awk -F ':' '{print $2}' | tr -d '"{}')

curl -k -H  "content-type: application/json" -H "Authorization: Bearer $AUTHTOKEN" -X GET https://3.14.248.232/containers/json 

#Accessing UCP APIs via Client Bundle
curl -k --cert cert.pem --cacert ca.pem --key key.pem https://3.14.248.232/images/json 
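The awk/tr pipeline above is a quick-and-dirty way to pull the token out of UCP's JSON response (a JSON-aware tool like jq would be more robust). You can sanity-check the extraction locally against a canned response (the token value here is made up):

```shell
# Canned response in the shape UCP's /auth/login returns (hypothetical token)
resp='{"auth_token":"abc123"}'
# Same extraction as above: split on ':', then strip quotes and braces
token=$(echo "$resp" | awk -F ':' '{print $2}' | tr -d '"{}')
echo "$token"
```

If the extraction is correct you should see abc123 printed.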

Hope this helps in extending your automation and monitoring tools to Docker!

Deploying Docker Enterprise on a single node

Most developers find installing Docker fairly straightforward. You download the EXE or DMG for your environment and go through an intuitive installation process. But what you get there is typically Docker Desktop Community Edition (CE). The Desktop editions are great for individual developers, but Docker has a lot more to offer enterprises that want to run containers at scale. Docker Enterprise provides many capabilities to schedule and manage your container workloads, and also offers a private Docker image repository. The challenge, though, is that while it's fairly easy to install Docker Desktop, it isn't quite the same to install Docker Enterprise. Docker Enterprise invariably requires installing the Docker Enterprise Engine, installing Universal Control Plane (UCP) with multiple manager and worker nodes, Docker Trusted Registry (DTR) – a private image repo (similar to the public Docker Hub) – and other components.

In this post, I am going to simplify the setup for you by creating a single-node sandbox environment for Docker Enterprise. Hope you will find it useful for your POCs and test use cases. All you will need is a single VM; you can use any of the public cloud platforms to spin up that node. For this post, I have set up a Red Hat Enterprise Linux (RHEL) m4.xlarge (4 vCPU & 16GB memory) instance on AWS. Please note the key components of Docker EE, including UCP & DTR, only run on Linux platforms. You can install additional Windows nodes later as workers to run Windows workloads.

STEP 1: Install Docker Enterprise Engine
To begin you will have to sign up for a free one-month trial of Docker EE. Once you create a Docker Hub account and fill out the contact details required for the trial, you should get the below screen.

[Screenshot: Screen Shot 2019-07-08 at 3.57.35 PM]

Copy the storebits URL and navigate to it in your browser. Here you will find packages for different OS versions. Now SSH into your EC2 instance and add the relevant Docker repo to your yum config manager. I will use the RHEL repo since that's the OS of my EC2 instance.

$ sudo -E yum-config-manager --add-repo "storebitsurl/rhel/docker-ee.repo"

Then install the docker EE engine, CLI & containerd packages.

$ sudo yum -y install docker-ee docker-ee-cli containerd.io

In case you aren't using RHEL as your OS, here are the links to install Docker Enterprise Engine on Ubuntu, CentOS, SUSE Linux, and Oracle Linux.

STEP 2: Install UCP (Universal Control Plane)
UCP is the heart of Docker Enterprise. It allows for running containers at scale across multiple nodes, lets you choose an orchestrator, provides an RBAC-enabled web UI for operations, and much more. The UCP installer runs through the ‘ucp’ container image, installing the UCP agent which in turn bootstraps the services and containers required by UCP.

docker container run --rm -it \
-v /var/run/docker.sock:/var/run/docker.sock `#mount docker sock` \
docker/ucp install `#ucp image is stored on docker hub` \
--host-address 172.31.6.154 `#private IP of your node` \
--san 3.14.248.232 `#public IP of your node` \
--admin-password YourPassword `#set the default password` \
--force-minimums `#for cases where node configuration is less than ideal`

Upon completion you should be able to browse to your public IP (https://3.14.248.232) and access the UCP UI shown below. To log in, use the ‘admin’ username and the password ‘YourPassword’.

[Screenshot: Screen Shot 2019-07-26 at 6.24.19 AM]

STEP 3: Install DTR (Docker Trusted Registry)

DTR installation takes a similar route to UCP. It uses a container image, ‘dtr’, with an install command.

docker run -it --rm docker/dtr install \
--ucp-node yourNodeName `#try docker node ls to retrieve the name` \
--ucp-username admin \
--ucp-url https://3.14.248.232 \
--ucp-insecure-tls `#ignore cert errors for self signed UCP certs` \
--replica-http-port 8080 `#choose a port other than 80 to avoid conflict with UCP` \
--replica-https-port 8843 `#choose a port other than 443 to avoid conflict with UCP` \
--dtr-external-url https://3.14.248.232:8843 `#this will add the external IP to DTR self-signed cert`

Once the command completes successfully you should be able to browse the DTR UI on the HTTPS port (https://3.14.248.232:8843).

[Screenshot: Screen Shot 2019-07-26 at 6.33.59 AM]

STEP 4: Deploy an App

Let's deploy an app to Docker EE. I will pull an ASP.NET sample image, create an org and repo in DTR, tag the image for the DTR repo, and upload it. After the upload, we can use docker stack to deploy the application. To begin, let's create an organization and repository in our DTR.

[Screenshot: Screen Shot 2019-07-26 at 5.06.37 AM]

Now let's pull the image from Docker Hub, tag it, and push it to DTR.

docker pull mcr.microsoft.com/dotnet/core/samples:aspnetapp
docker tag mcr.microsoft.com/dotnet/core/samples:aspnetapp 3.14.248.232:8843/brkthroo/dotnet:aspnetapp
docker image push 3.14.248.232:8843/brkthroo/dotnet:aspnetapp

Next create a docker-compose.yml file with below instructions.

version: "3"
services:
  web:
    # replace username/repo:tag with your name and image details
    image: 3.14.248.232:8843/brkthroo/dotnet:aspnetapp
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
      resources:
        limits:
          cpus: "0.1"
          memory: 50M
    ports:
      - "3776:80"

Once the file is created, run the docker stack deploy command to deploy the listed services.

docker stack deploy -c docker-compose.yml aspnetapp

You should be able to browse the app using the node's public IP on port 3776; in my case http://3.14.248.232:3776/.

[Screenshot: Screen Shot 2019-07-26 at 6.38.43 AM]

In future posts, we will dig into more Docker EE features, but I hope this single-node setup gives you a jumpstart in exploring Docker Enterprise.

Happy Learning 🙂 !