
Simple way to Deploy ML Models as Flask APIs on Amazon ECS | by Nikola Kuzmic | Mar, 2023



Photo by Arjan van den Berg on Unsplash

In this post, we’ll cover how to deploy an XGBoost regression model that predicts a developer’s salary based on their years of experience.

👉 Game Plan

  1. Train an XGBoost model
  2. Build a simple Flask API to serve model predictions
  3. Build a Docker Image for the Flask API
  4. Deploy the Docker Container on Amazon ECS

Entire Source Code Github Repo: link🧑‍💻

flask-on-ecs - repo structure
.
├── Dockerfile
├── README.md
├── myapp.py
├── requirements.txt
└── train_xgb.ipynb

Why we need APIs to deploy ML models

If you are reading this post, likely you have reached the stage in your Data Science project where you want to make your awesome ML models available for everyone on the internet. People refer to this step as deploying them to production.

We won’t examine everything a proper production-grade deployment requires. Instead, we’ll use the default Flask development server to demonstrate the end-to-end process: taking a trained, pickled XGBoost model, dockerizing it, and deploying it as a real-time API on Amazon ECS.

👉 Step 1: Train an XGBoost model

Train an XGBoost model to predict the developer salary based on their years of experience and save the model as a pickle file.

To run it inside VS Code, let’s create a separate Python 3.8 environment:

conda create --name py38demo python=3.8 
conda activate py38demo
pip install ipykernel pandas flask numpy xgboost scikit-learn

Then restart VS Code and, in the Jupyter notebook, select ‘py38demo’ as the kernel.

Train & pickle the XGBoost model:

Time to create an API which can serve these recommendations!

👉 Step 2: Flask API

Our API will load the XGBoost model, accept POST requests, and produce a response.

Let’s first run our API locally. Then, in a separate terminal, we can test it by sending a POST request with a JSON payload to see what a developer with 2.5 years of experience would make:

curl -X POST http://0.0.0.0:80/recms -H 'Content-Type: application/json' -d '{"years":"2.5"}'

$260k after 2.5 years, and $750k after 12.5 years. Not bad! 🤑

👉 Step 3: Docker Image

To run our application inside a Docker container, we need a “blueprint” with instructions on what environment to use, which local files to copy, and how to run the application. All of this is specified in a Dockerfile, which Docker uses to build the image.
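A sketch of a Dockerfile consistent with the repo structure above. The base image, model filename, and start command are assumptions, not the post’s exact file:

```dockerfile
# Sketch: Python 3.8 matches the conda environment used for training
FROM python:3.8-slim

WORKDIR /app

# Install dependencies first so this layer is cached between builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the API and the pickled model (xgb_model.pkl is a placeholder name)
COPY myapp.py .
COPY xgb_model.pkl .

EXPOSE 80
CMD ["python", "myapp.py"]
```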

We can now run our API inside the Docker container and test locally as well.

Note: since I am building the image on a Mac, I need to specify --platform linux/amd64 for it to be compatible with the ECS Fargate Linux environment.

Here’s how we build & run the image.

Note: we bind our host (i.e. laptop’s) port 80 to docker container’s port 80:

docker build --platform linux/amd64 -t simpleflask .
docker run -dp 80:80 simpleflask

Let’s test our API which is now running inside the Docker Container! 📦

curl -X POST http://0.0.0.0:80/recms -H 'Content-Type: application/json' -d '{"years":"12.5"}'

Time to deploy this on AWS! 🚀

👉 Step 4: Run the container on Amazon ECS

This section may look overwhelming at first, but it’s quite simple if we break the process into six steps.


i) Push the Docker image to ECR

Let’s create an ECR repo called demo where we can push the Docker image.

Then we can use the Push Commands provided by ECR:

# authenticate
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <Your-aws-acc-no>.dkr.ecr.us-east-1.amazonaws.com

# tag the image
docker tag <Your-local-docker-image-name>:latest <Your-aws-acc-no>.dkr.ecr.us-east-1.amazonaws.com/<Your-ECR-repo-name>:latest

# push the image to ECR
docker push <Your-aws-acc-no>.dkr.ecr.us-east-1.amazonaws.com/<Your-ECR-repo-name>:latest

Assumption: you have configured the AWS CLI on your local machine and set up an IAM user with the right permissions to interact with ECR. You can find more info at this link.

After running the above 3 commands, we can see our image is there on ECR! 🎉

Copy & Paste the Image URI somewhere as we’ll need it in the next couple of steps.

ii) Create an IAM Execution Role

We need to create an execution role so that the ECS task running our container has access to pull images from ECR. We’ll name it: simpleRole
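In the console this is a few clicks; under the hood, the role needs a trust policy allowing ECS tasks to assume it, with the AWS-managed AmazonECSTaskExecutionRolePolicy attached. A sketch of that trust policy:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ecs-tasks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```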

iii) Create a Security Group

A security group is needed to allow anyone on the internet to send requests to our application. In the real world you may want to restrict this to a specific set of IPs, but here we’ll open it to everyone and call it: simpleSG
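The same security group can be sketched with the AWS CLI; the VPC ID is a placeholder you’d substitute, and looking the group up by name assumes it lives in your default VPC:

```shell
# Sketch: create the security group and open port 80 to the world
aws ec2 create-security-group \
    --group-name simpleSG \
    --description "Allow inbound HTTP for the Flask demo" \
    --vpc-id <Your-vpc-id>

aws ec2 authorize-security-group-ingress \
    --group-name simpleSG \
    --protocol tcp \
    --port 80 \
    --cidr 0.0.0.0/0
```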

iv) Create an ECS Cluster

This step is straightforward and only takes a couple of seconds. We’ll call it: flaskCluster

While our cluster is being provisioned, let’s create a Task Definition.

v) Create a Task Definition

A Task Definition, as the name implies, is a set of instructions specifying which image to run, which port to open, and how much virtual CPU and memory to allocate. We’ll call it: demoTask
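The same settings the console collects can be expressed as a task definition JSON. This is a sketch; the account number, region, repo name, and CPU/memory values are placeholders and assumptions:

```json
{
  "family": "demoTask",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::<Your-aws-acc-no>:role/simpleRole",
  "containerDefinitions": [
    {
      "name": "simpleflask",
      "image": "<Your-aws-acc-no>.dkr.ecr.us-east-1.amazonaws.com/demo:latest",
      "portMappings": [{ "containerPort": 80, "protocol": "tcp" }],
      "essential": true
    }
  ]
}
```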

vi) Run the Task

Let’s run our demoTask on our flaskCluster, with the simpleSG we created in step iii).

Time to test out the deployed API from the Public IP address! 🥁

curl -X POST http://<PUBLIC-IP>:80/recms -H 'Content-Type: application/json' -d '{"years":"2.5"}'

It’s working! 🥳

As you can see we are able to get the salary predictions by sending POST requests to the Public IP provided by ECS. 🔥


