Techno Blender
Digitally Yours.

How to Build REST API in Simple Words | by Pedram Ataee, PhD | Jun, 2022



A playbook for creating a REST API as a data scientist using Flask and Docker

Photo by Joshua Earle on Unsplash

I recently developed a contextualized most-similar-word API service named OWL. The OWL API is an advanced NLP service built with Flask and Docker, and it is shared with the community as a series of REST APIs hosted on the RapidAPI marketplace. Building it was a great journey for me, so I decided to share the various stages of its development with you.

This article is a step-by-step guide to building a REST API using Flask and Docker. You can use it in your data science projects as well as in any other web application you work on. This article is also part of the “… in Simple Words” series, where I share my experience with you. The series has helped many data science enthusiasts before; I hope it helps you as well.

If you want to read more about the OWL API, check out the article below.

Step 1 — Create an app.py

First, you must write a Python script that maps a RESTful endpoint to a function. Let’s name the script app.py. In this example, the RESTful endpoint is /general/<N>/<WORD>, where general is a constant and N and WORD are two variables. Through these variables, users send data to the function bound to the endpoint, i.e., similar_words(N, WORD). The endpoint you define in the Python script is appended to the URL of the host after the app is deployed on a server.

import os
import sys

from flask import Flask, jsonify
from flask_httpauth import HTTPBasicAuth
import gensim.downloader as api

sys.path += [os.getcwd()]
from src.core.word_clusterizer import WordClusterizer
from src.backend.wrapper import GensimWrapper

auth = HTTPBasicAuth()
app = Flask(__name__)

# Load the pre-trained embeddings once at startup, not per request
MODEL = api.load('glove-wiki-gigaword-300')

@app.route('/general/<N>/<WORD>', methods=['GET'])
def similar_words(N, WORD):
    model = GensimWrapper(MODEL)
    try:
        wc = WordClusterizer(model=model, word=WORD, top_n=int(N))
        wc.fit()
        results = wc.result
    except KeyError:
        results = 'Nothing is found!'
    return jsonify(results)

if __name__ == '__main__':
    port = int(os.environ.get("PORT", 8080))
    app.run(host='0.0.0.0', port=port, debug=True)

Needless to say, the function bound to the RESTful endpoint is the core of your data science project. In this scenario, it is a word-similarity service that extracts the most similar words with more granularity than existing solutions.
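You can sanity-check this kind of route mapping without starting a server by using Flask’s built-in test client. The sketch below uses a stub in place of the real similarity model (the OWL internals are not shown here), so only the routing and variable parsing are exercised:

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/general/<N>/<WORD>', methods=['GET'])
def similar_words(N, WORD):
    # Stub: echo the parsed route variables instead of running a real model
    return jsonify({'word': WORD, 'top_n': int(N)})

# Flask's test client calls the route in-process, without binding a port
client = app.test_client()
resp = client.get('/general/3/ocean')
print(resp.status_code)   # 200
print(resp.get_json())    # {'top_n': 3, 'word': 'ocean'}
```

This confirms that N and WORD arrive as strings taken from the URL path, which is why the real handler casts N with int(N).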

Step 2 — Create a Dockerfile

You need a Docker image that runs app.py when you spin up a server. To build such an image, you need a Dockerfile like the one below.

# MAINTAINER YOUR_NAME
FROM pdrm83/nlp-essentials:latest
RUN pip install --upgrade pip
COPY . /project/
RUN pip install .
WORKDIR /project/src/.
ENV PYTHONPATH /project
ENV PORT 8080
EXPOSE 8080
EXPOSE 80
CMD ["python3", "app/app.py"]

In the Dockerfile, you must expose the port on which app.py runs. As shown in Step 1, the app.py script reads its port from port = os.environ.get("PORT", 8080). This means the port variable is set to the value of the PORT environment variable if it is set, and to 8080 otherwise.
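The fallback behavior is easy to verify on its own. This small sketch replays the same lookup in both situations (PORT unset, then set by the platform):

```python
import os

# Case 1: PORT is not set, so the default 8080 is used
os.environ.pop('PORT', None)
port_default = int(os.environ.get('PORT', 8080))
print(port_default)  # 8080

# Case 2: the platform provides PORT, so the default is ignored
os.environ['PORT'] = '5000'
port_env = int(os.environ.get('PORT', 8080))
print(port_env)  # 5000
```

Note that environment variables are always strings, which is why the value is passed through int() before being handed to app.run().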

In this case, PORT was not set beforehand, so I had to open port 8080 in the Dockerfile, which I did with the EXPOSE 8080 instruction. If PORT is set in the environment to anything other than 8080, you must open that port instead by adding EXPOSE XXXX.

Note that CMD ["python3", "app/app.py"] runs in an environment created from the Docker image named nlp-essentials. This is an image I built previously with all the required libraries installed; it is stored in my Docker Hub account, pdrm83, and pulled whenever I want to update the API service. That way, I don’t have to reinstall every required library each time I update the API. Last but not least, since I stored app.py in ROOT/src/app/, I added WORKDIR /project/src/. to the Dockerfile. You should not copy this line verbatim; configure it based on the file structure of your own codebase.
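For reference, here is one directory layout consistent with the Dockerfile above. This is an assumption reconstructed from the WORKDIR, CMD, and import paths, not the actual OWL repository:

```
project/
├── Dockerfile
├── setup.py                  # makes `pip install .` work
└── src/
    ├── app/
    │   └── app.py            # the Flask entry point run by CMD
    ├── core/
    │   └── word_clusterizer.py
    └── backend/
        └── wrapper.py
```

With WORKDIR set to /project/src, the relative path app/app.py in CMD resolves to src/app/app.py, and PYTHONPATH=/project lets the script import from src.core and src.backend.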

Step 3 — Spin up a server

In a nutshell, you must build a Docker image using docker build -f <<DOCKERFILE_NAME>> -t <<IMAGE_NAME>> . (the trailing dot sets the build context), and then run it using docker run -i -p 80:8080 <<IMAGE_ID>>, where IMAGE_ID is the ID of the image you created in the previous step. Note that I map host port 80 to container port 8080, since app.py runs on port 8080 inside the container and I want to offer the API service to users on port 80. In the end, you must ensure that port 80 is open to the internet on any machine that you work with.

There are many ways to give users access to your service. However, I want to introduce you to a handy tool named ngrok that puts your local machine online in 3 simple steps 😮. The 3 steps are as easy as follows:

  1. Download ngrok.zip and unzip /path/to/ngrok.zip .
  2. Run ngrok authtoken YOUR_OWN_TOKEN
  3. Run ngrok http 80

After running ngrok http 80, the ngrok service creates a dedicated URL for your project on the ngrok.io domain that looks like the link below.

http://YOUR_DEDICATED_SUBDOMAIN.ngrok.io/

Your service is now available on the internet for everyone. You don’t need any fancy cloud service to introduce your API to the market. Isn’t that amazing? In the article below, you can read why I think ngrok is very useful for offering an API service to the community.

Last Words

One of the most recommended architectures for software products is API architecture. It helps you design modular, reusable services that are also easy to maintain; for example, there are effective tools for testing and documenting a set of REST APIs that make maintenance much easier. In this article, I described how to build a RESTful API for your data science project. There are API protocols other than REST, but they were not the focus of this article.

Thanks for Reading!

If you like this post and want to support me…

