Kubernetes is a production-grade container orchestration system that automates the deployment, scaling, and management of containerized applications. The project is open source and battle-tested by the mission-critical applications Google runs.

Machine learning solutions are often broken down into many steps that are OS- and framework-agnostic. It is also common for data science projects to follow the dataflow programming paradigm, where a program is designed as a directed graph with data moving from one operation to the next. These methodologies and abstractions are easily implemented with Kubernetes, offloading the infrastructure management burden to the system.
In this story, we will deploy a simple image classifier web service on Kubernetes. We will be working on Google Cloud Platform (GCP) and take advantage of their secured and managed Kubernetes service offering: GKE. We will first use Terraform to provision and manage the needed infrastructure and then create a simple deployment of a pre-trained PyTorch model. Let us start.
Infrastructure as Code
To build the infrastructure we need, we will use Terraform. Terraform treats infrastructure as code, letting us provision and manage any cloud, infrastructure, or service. It makes it very easy to create the resources we need and to clean up after ourselves when we are done, so we avoid accumulating costs. To install Terraform, follow the instructions for your platform.
Preliminaries
To work on GCP, we first need to create a project resource. A project is a logical organization entity that forms the basis for creating, enabling, and using other Google Cloud services. It is necessary before doing anything else on GCP, and it is easy to create one or use the default project initialized with any new GCP account.
To create a new one, open the side panel, go to IAM & Admin, and choose Manage Resources at the bottom. Press the Create Project button, give it a name, and you are good to go.
The next step is to create a Service Account that Terraform will use to create the resources we need. A service account is a special kind of account used by an application, not a person. Making sure you are in the project you created, go back to IAM & Admin and pick Service Accounts. Create a new one by pressing the Create Service Account button, name it terraform, and press Create. In the next step, grant the service account the Project Editor role and press Continue. Finally, generate a JSON key by pressing the Create Key button. This key will be Terraform's method of authentication.
Terraform configuration
First, we need to configure Terraform to use a Google Storage bucket to store the metadata it needs. This is a great way to keep Terraform running in an automated environment. First, let us create the bucket; create a new file called main.tf:
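A minimal main.tf could look like the following sketch; the variable names are assumptions that the variables file below will declare:

```hcl
# Configure the Google provider with the service account key we generated.
provider "google" {
  credentials = file(var.credentials_file)
  project     = var.project_id
}

# A Cloud Storage bucket that will hold Terraform's state.
resource "google_storage_bucket" "terraform_state" {
  name          = var.bucket_name
  location      = var.location
  force_destroy = true
}
```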
Then, we need to create a new file for the variables it uses. Create that file and name it variables.tf:
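A variables.tf along these lines declares the inputs the script above expects; the names and defaults are illustrative:

```hcl
variable "project_id" {
  description = "The ID of the GCP project"
  type        = string
}

variable "credentials_file" {
  description = "Path to the service account JSON key"
  type        = string
}

variable "bucket_name" {
  description = "A globally unique name for the state bucket"
  type        = string
}

variable "location" {
  description = "The location of the state bucket"
  type        = string
  default     = "US"
}
```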
This file tells Terraform which variables to expect and provides some defaults. For example, the location variable in our first script now has a default, which is US. There are other variables, like project_id, whose values we need to provide ourselves. To this end, we create a new terraform.tfvars file:
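Assuming the variable names above, terraform.tfvars could look like this, with every value a placeholder for your own:

```hcl
project_id       = "your-project-id"
credentials_file = "terraform-key.json"
bucket_name      = "your-unique-terraform-state-bucket"
```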
Be sure to fill the terraform.tfvars file with the necessary info and then run terraform init followed by terraform apply. Finally, go to the GCP console to verify that the bucket is there.
Terraform your GKE cluster
We are now ready to spin up our Kubernetes cluster. We will create three separate files in a new directory: one to define the Terraform backend, one for the cloud provider, and one for the actual cluster. Of course, we will need two extra files for the variables. Let us set up the backend first. Create a terraform.tf file and copy the following inside:
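A sketch of terraform.tf follows; note that backend blocks cannot reference variables, so the bucket name and key path are hard-coded placeholders:

```hcl
terraform {
  backend "gcs" {
    bucket      = "your-unique-terraform-state-bucket"
    prefix      = "gke/state"
    credentials = "terraform-key.json"
  }
}
```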
Be sure to provide the name of the bucket you created before. Then create a provider.tf file for the cloud provider:
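The provider file can mirror the one from the previous section; the variable names are assumptions:

```hcl
provider "google" {
  credentials = file(var.credentials_file)
  project     = var.project_id
}
```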
Finally, create a main.tf file to configure the GKE cluster we want. For this example, we will keep it to a bare minimum.
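A bare-minimum cluster definition could look like the following sketch; resource and variable names are illustrative:

```hcl
resource "google_container_cluster" "cluster" {
  name     = var.cluster_name
  location = var.location

  # We manage our own node pool below, so drop the default one.
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "nodes" {
  name       = "${var.cluster_name}-node-pool"
  location   = var.location
  cluster    = google_container_cluster.cluster.name
  node_count = var.node_count

  node_config {
    # Preemptible machines are much cheaper, at the cost of short lifetimes.
    preemptible  = true
    machine_type = "n1-standard-1"
  }
}
```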
One thing we will change, though, is to remove the default node pool and create one of our own that uses n1-standard-1 preemptible machines. Finally, create the variables.tf and terraform.tfvars files to provide the necessary variables.
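The variables.tf could declare something along these lines; names and defaults are assumptions:

```hcl
variable "project_id" {
  type = string
}

variable "credentials_file" {
  type = string
}

variable "cluster_name" {
  type    = string
  default = "my-gke-cluster"
}

variable "location" {
  type    = string
  default = "us-central1-a"
}

variable "node_count" {
  type    = number
  default = 1
}
```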
And the terraform.tfvars file:
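Assuming the variables above, it could contain placeholder values such as:

```hcl
project_id       = "your-project-id"
credentials_file = "terraform-key.json"
cluster_name     = "my-gke-cluster"
location         = "us-central1-a"
```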
Setting the location to a zone creates a zonal cluster. Google offers one free zonal cluster per billing account, so we will take advantage of that discount. As before, run terraform init and terraform apply, wait a few minutes, and your cluster will be ready to go.
Image Classifier
For this example, we use a pre-trained PyTorch model and Flask to create a simple machine learning web service. Next, we will create a simple Kubernetes deployment to deploy, manage, and scale the web service.
PyTorch + Flask
To design a simple image-classifier service, we use a PyTorch implementation of AlexNet, a CNN model for image classification. We serve it using the Flask micro-framework.
First, we define a Predict Flask resource that initializes the model and accepts HTTP POST requests. We serve it with Flask on the any-host identifier (0.0.0.0).
Flask's built-in server is not suitable for production, as it does not scale well. It is fine for our purposes, but if you want to see how to deploy a Flask application properly, see here.
Containerization
To deploy it on Kubernetes we first need to containerize the service. To this end, we define the following Dockerfile:
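Assuming the service lives in app.py, a Dockerfile along these lines would work; the PyTorch base image is an assumption that spares us from installing torch through pip:

```dockerfile
# The official PyTorch image ships with torch and torchvision pre-installed.
FROM pytorch/pytorch:latest

WORKDIR /app

# Install the web framework dependencies first to cache the layer.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the service code and the ImageNet labels file.
COPY app.py imagenet_class_index.json ./

EXPOSE 80

# Serve on the any-host identifier so the port is reachable from outside.
ENV FLASK_APP=app.py
CMD ["flask", "run", "--host=0.0.0.0", "--port=80"]
```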
To build an image from this Dockerfile, create a requirements.txt file, in the same folder, with the following contents:
flask
flask-restful

Download in the same folder the JSON file that contains the ImageNet class labels here. We are now ready to build our Docker image; run the following command:
docker build -t <your-dockerhub-username>/image-classifier:latest .

Finally, you need to push the image to a container registry. For instance, to push it to Docker Hub, create an account, configure docker, and run the following commands:
docker login --username username
docker push <your-dockerhub-username>/image-classifier:latest

For simplicity, you can use the image I prepared for you. You can pull it from Docker Hub.
Deployment
The last step is deployment. For that, we use a simple YAML configuration file. But first, we need to install kubectl and get the credentials to connect to our cluster.
To install kubectl, follow the instructions for your platform. Next, install gcloud, the CLI tool for GCP. To get the credentials of your GKE cluster, run the following command:
gcloud container clusters get-credentials <the-name-of-your-cluster> --zone <the-zone-that-it-is-in>

For example:

gcloud container clusters get-credentials my-gke-cluster --zone us-central1-a

This command will create a configuration file and instruct kubectl how to connect to your GKE cluster. Finally, define the image classifier YAML deployment file.
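The file (call it image-classifier.yaml) could look like the following sketch; the names and the image reference are placeholders:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: image-classifier
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: image-classifier
  namespace: image-classifier
spec:
  replicas: 1
  selector:
    matchLabels:
      app: image-classifier
  template:
    metadata:
      labels:
        app: image-classifier
    spec:
      containers:
        - name: image-classifier
          image: <your-dockerhub-username>/image-classifier:latest
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: image-classifier
  namespace: image-classifier
spec:
  type: LoadBalancer
  selector:
    app: image-classifier
  ports:
    - port: 80
      targetPort: 80
```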
This configuration is split three ways: first, we define a namespace, which provides a scope for our resource names. Then, we define the actual deployment, where we run only one replica of our service. You can scale that number to whatever serves your needs. Finally, we expose the web service with a LoadBalancer, which makes it accessible from the internet. Create the deployment by running the following command:
kubectl apply -f image-classifier.yaml

You are now ready to test your image classifier web service. For this example, we use curl.
curl -X POST -d '{"url": "https://i.imgur.com/jD2hDMc.jpg"}' -H 'Content-Type: application/json' http://<your-cluster's-ip>/predict

This command queries your service with a kitten image. What you get back is the predicted class along with a confidence score. You have now successfully deployed a machine learning model on Kubernetes!
Cleaning Up
To clean up after ourselves we will move in reverse order:
1. Delete the namespace, deployment, and load balancer:

kubectl delete -f image-classifier.yaml

2. Destroy the cluster:

terraform destroy

Run this command from the folder you created in the "Terraform your GKE cluster" step.

3. Delete the Google Storage bucket:

terraform destroy

Run this command from the folder you created in the "Terraform configuration" step.
Conclusion
In this story, we saw how to create resources with Terraform on GKE. Specifically, we created a GKE cluster to leverage the hosted Kubernetes solution that Google Cloud Platform provides. Then, we created a simple image classification web service, built a Docker image out of it, and deployed it on Kubernetes. Finally, we cleaned up after ourselves, destroying the resources we no longer need to avoid extra costs.
Kubernetes is a great platform for deploying machine learning models in production. In later articles, we will explore Kubeflow, a machine learning toolkit for Kubernetes.
My name is Dimitris Poulopoulos and I'm a machine learning researcher at BigDataStack and PhD(c) at the University of Piraeus, Greece. I have worked on designing and implementing AI and software solutions for major clients such as the European Commission, Eurostat, IMF, the European Central Bank, OECD, and IKEA. If you are interested in reading more posts about Machine Learning, Deep Learning and Data Science, follow me on Medium, LinkedIn or @james2pl on twitter.