
Deploying a Data Science Platform on AWS: Setting Up AWS Batch (Part I) | by Eduardo Blancas | Oct, 2022



Data Science Cloud Infrastructure

Your laptop isn’t enough; let’s use the cloud. Photo by CHUTTERSNAP on Unsplash

In this series of tutorials, we’ll show you how to deploy a Data Science platform with AWS and open-source software. By the end of the series, you’ll be able to submit computational jobs to AWS scalable infrastructure with a single command.

Architecture of the Data Science platform we’ll deploy. Image by author.
Screenshot of the AWS Batch console, showing our recent jobs. Image by author.

To implement our platform, we’ll be using several AWS services. However, the central one is AWS Batch.

AWS Batch is a managed service for computational jobs. It takes care of keeping a queue of jobs, spinning up EC2 instances, running our code and shutting down the instances. It scales up and down depending on how many jobs we submit. It’s a very convenient service that allows us to execute our code in a scalable fashion and to request custom resources for compute-intensive jobs (e.g., instances with many CPUs and large memory) without requiring us to maintain a cluster (no need to use Kubernetes!).

Let’s get started!

The only requirement for this tutorial is to have the AWS command-line interface installed (and access keys with enough permissions to use the tool). Follow the installation instructions. If you have issues, ask for help in our Slack.

Verify your installation (ensure you’re running version 2):
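A quick check (the exact version string will vary):

```shell
aws --version
```

You should see something starting with `aws-cli/2`; if it reports version 1, upgrade before continuing.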


Then authenticate with your access keys:
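The simplest way is the interactive `configure` command, which prompts for your access key ID, secret access key, and default region:

```shell
aws configure
```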

We need to create a VPC (Virtual Private Cloud) for the EC2 instances that will run our tasks. This section has all the commands you need to configure the VPC.

Note that all AWS accounts come with a default VPC. If you want to use that one, ensure you have the subnet IDs and security group IDs you want to use and skip this section.

If you need help, feel free to ask us anything on Slack.

Let’s create a new VPC, and retrieve the VPC ID:
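A sketch (the 10.0.0.0/16 CIDR block is an example choice; the `--query` flag extracts the ID directly):

```shell
aws ec2 create-vpc --cidr-block 10.0.0.0/16 \
    --query Vpc.VpcId --output text
```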


Let’s assign the ID to a variable so we can re-use it (replace the ID with yours):
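For example (the ID below is a placeholder; substitute the one returned by the previous command):

```shell
VPC_ID=vpc-0123456789abcdef0
```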

Now, let’s create a subnet and get the subnet ID:
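A sketch, assuming a /24 carved out of the example VPC CIDR and that the VPC ID is stored in `VPC_ID` as described above:

```shell
aws ec2 create-subnet --vpc-id $VPC_ID --cidr-block 10.0.0.0/24
```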


And assign the ID to a variable (replace the ID with yours):
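For example (placeholder ID; use yours):

```shell
SUBNET_ID=subnet-0123456789abcdef0
```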

We need to modify the subnet’s configuration so each instance gets a public IP:
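This maps to the `modify-subnet-attribute` command:

```shell
aws ec2 modify-subnet-attribute --subnet-id $SUBNET_ID \
    --map-public-ip-on-launch
```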

Now, let’s configure internet access:
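Internet access requires an internet gateway:

```shell
aws ec2 create-internet-gateway
```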


Assign the gateway ID to the following variable (replace the ID with yours):
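For example (placeholder ID; use yours):

```shell
GATEWAY_ID=igw-0123456789abcdef0
```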

Let’s attach the internet gateway to our VPC:
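Assuming the IDs are stored in the variables described above:

```shell
aws ec2 attach-internet-gateway --internet-gateway-id $GATEWAY_ID \
    --vpc-id $VPC_ID
```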

This documentation explains the commands above in more detail.

Note that allowing internet access to your instances simplifies the networking setup. However, if you don’t want the EC2 instances to have a public IP, you can configure a NAT gateway.

Let’s now finish configuring the subnet by adding a route table:
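The route table is created inside our VPC:

```shell
aws ec2 create-route-table --vpc-id $VPC_ID
```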


Assign the route table ID (replace the ID with yours):
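For example (placeholder ID; use yours):

```shell
ROUTE_TABLE_ID=rtb-0123456789abcdef0
```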

Let’s add a route associated with our internet gateway:
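The route sends all outbound traffic (0.0.0.0/0) through the internet gateway:

```shell
aws ec2 create-route --route-table-id $ROUTE_TABLE_ID \
    --destination-cidr-block 0.0.0.0/0 --gateway-id $GATEWAY_ID
```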


And associate the table to the subnet:
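Using the IDs stored earlier:

```shell
aws ec2 associate-route-table --route-table-id $ROUTE_TABLE_ID \
    --subnet-id $SUBNET_ID
```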


Finally, create a security group in our VPC:
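A sketch (the group name and description here are arbitrary examples):

```shell
aws ec2 create-security-group --group-name batch-sg \
    --description "Security group for AWS Batch instances" \
    --vpc-id $VPC_ID
```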


And assign the security group ID to a variable (replace the ID with yours):
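For example (placeholder ID; use yours):

```shell
SG_ID=sg-0123456789abcdef0
```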

We now need to create a role to allow AWS Batch to call ECS (another AWS service).

Download the configuration file:
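The file in question is an IAM trust policy that lets EC2 instances assume the role. A typical one looks like this (a sketch — the exact file the post downloads may differ; if you write it by hand, save it as, say, ecs-trust-policy.json):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"Service": "ec2.amazonaws.com"},
      "Action": "sts:AssumeRole"
    }
  ]
}
```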


Create role:
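A sketch, assuming the trust policy was saved as ecs-trust-policy.json (a name we chose; the role name matches the one used in the cleanup section at the end):

```shell
aws iam create-role --role-name ploomber-ecs-instance-role \
    --assume-role-policy-document file://ecs-trust-policy.json
```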


Create instance profile:
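Using the same name for the profile as for the role:

```shell
aws iam create-instance-profile \
    --instance-profile-name ploomber-ecs-instance-role
```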


Add role to instance profile:
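This links the role to the instance profile:

```shell
aws iam add-role-to-instance-profile \
    --instance-profile-name ploomber-ecs-instance-role \
    --role-name ploomber-ecs-instance-role
```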

Attach role policy:
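The policy being attached is the AWS-managed ECS policy for EC2 container instances (the same ARN detached during cleanup at the end of this post):

```shell
aws iam attach-role-policy --role-name ploomber-ecs-instance-role \
    --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role
```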

With networking and permissions configured, we’re now ready to configure the compute environment!

In AWS Batch, a compute environment determines which instance types to use for our jobs.

We created a simple script to generate your configuration file:


Run the script and pass the subnet and security group IDs:


You may also edit the my-compute-env.json file and put your subnet IDs in the subnets list, and your security group IDs in the securityGroupIds list. If you need more customization for your compute environment, join our Slack and we’ll help you.

Create the compute environment:
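Using the my-compute-env.json file generated above:

```shell
aws batch create-compute-environment \
    --cli-input-json file://my-compute-env.json
```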


To submit jobs, we need to create a job queue. The queue will receive job requests and route them to the relevant compute environment.

Note: wait a few seconds before running the next command, as the compute environment may take a moment to be created.

Download file:


Create a job queue:
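Assuming the downloaded configuration file is named my-job-queue.json (an illustrative name; use the actual filename):

```shell
aws batch create-job-queue --cli-input-json file://my-job-queue.json
```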


Let’s test that everything is working!

We define an example job that waits for a few seconds and finishes:
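A sketch of such a job definition, registered with `register-job-definition` and using a busybox container that sleeps for a few seconds (all names and resource values here are illustrative):

```shell
aws batch register-job-definition \
    --job-definition-name test-job \
    --type container \
    --container-properties '{"image": "busybox", "vcpus": 1, "memory": 128, "command": ["sleep", "10"]}'
```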


Let’s submit a job to the queue:
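A sketch (the queue and job definition names are illustrative; use whatever names your configuration files declare):

```shell
aws batch submit-job --job-name test \
    --job-queue my-job-queue --job-definition test-job
```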


Let’s ensure the job is executed successfully. Copy the jobId printed when executing the command and pass it to the following command:
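For example, `describe-jobs` with a JMESPath query pulls out just the status:

```shell
JOB_ID=...  # paste the jobId printed by submit-job here
aws batch describe-jobs --jobs $JOB_ID --query "jobs[0].status"
```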


The first time you run the command above, you’ll most likely see RUNNABLE; this is normal.

AWS Batch spins up new EC2 machines and shuts them down after your jobs are done. This is great because it prevents idle machines from accruing charges. However, since new machines spin up every time, there is some startup overhead. Wait a minute or so and run the command again; you should see STARTING, RUNNING, and SUCCEEDED shortly after.

If the job is still stuck in RUNNABLE status after more than a few minutes, ask for help in our community.

In this blog post, we configured AWS Batch so we can submit computational jobs on demand. There’s no need to maintain a cluster or manually spin up and shut down EC2 instances. You’re only billed for the jobs you submit. Furthermore, AWS Batch is highly scalable, so you can submit as many jobs as you want!

In the next post, we’ll show you how to submit a custom container job to AWS Batch, and configure an S3 bucket to read input data and write results.

If you want to be the first to know when the second part comes out, follow us on Twitter, LinkedIn, or subscribe to our newsletter!

AWS Batch itself doesn’t incur charges beyond the underlying EC2 usage. However, if you want to clean up your environment, follow these steps.

Disable the AWS Batch queue and compute environments:
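A sketch, assuming the queue name used earlier (substitute yours):

```shell
aws batch update-job-queue --job-queue my-job-queue --state DISABLED
```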


Update compute environment:
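Assuming the compute environment is named my-compute-env (substitute the name declared in your my-compute-env.json):

```shell
aws batch update-compute-environment \
    --compute-environment my-compute-env --state DISABLED
```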


You’ll need to wait 1–2 minutes for the queue and the compute environment to appear as DISABLED.

Delete the queue and the compute environment:
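Using the same illustrative names as above:

```shell
aws batch delete-job-queue --job-queue my-job-queue
aws batch delete-compute-environment --compute-environment my-compute-env
```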

Delete the VPC and its components:
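A sketch, assuming the IDs are still stored in the variables from earlier (order matters — the VPC’s dependencies must be removed before the VPC itself):

```shell
aws ec2 delete-security-group --group-id $SG_ID
aws ec2 delete-subnet --subnet-id $SUBNET_ID
aws ec2 detach-internet-gateway --internet-gateway-id $GATEWAY_ID --vpc-id $VPC_ID
aws ec2 delete-internet-gateway --internet-gateway-id $GATEWAY_ID
aws ec2 delete-route-table --route-table-id $ROUTE_TABLE_ID
aws ec2 delete-vpc --vpc-id $VPC_ID
```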

Delete IAM role:

```shell
aws iam remove-role-from-instance-profile \
    --instance-profile-name ploomber-ecs-instance-role \
    --role-name ploomber-ecs-instance-role

aws iam delete-instance-profile \
    --instance-profile-name ploomber-ecs-instance-role

aws iam detach-role-policy --role-name ploomber-ecs-instance-role \
    --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role

aws iam delete-role --role-name ploomber-ecs-instance-role
```

Hi! My name is Eduardo, and I like writing about all things data science. If you want to keep up to date with my content, follow me on Medium or Twitter. Thanks for reading!



