
AWS Batch Job to Run Your Workload in the Backend (Manual or Automated)

AWS Batch runs your job either on demand, automated on a schedule, or in response to an event.

 

The most common use case is data processing, such as import/export. A typical scenario: read a file from S3, process its records, and insert them into a database.
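That import flow boils down to a transform step between the file's rows and the items you write to the database. A minimal pure-Python sketch (no AWS calls; the column names "id" and "name" are made up for illustration, not from this article):

```python
import csv
import io

def rows_to_items(csv_text):
    """Turn CSV rows (e.g. a file pulled from S3) into DynamoDB-style items.

    The "id"/"name" columns are illustrative assumptions for this sketch.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{"Id": row["id"], "Name": row["name"]} for row in reader]

sample = "id,name\n1,Rajeev\n2,Kumar\n"
items = rows_to_items(sample)
print(items)  # each dict is ready to pass as table.put_item(Item=...)
```

In a real job the `csv_text` would come from an S3 `get_object` call and each item would go to the database; the transform in the middle is the part the Batch job actually runs.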

 

At this point you might say: we also have Lambda functions, which could achieve these requirements.

Yes, we have Lambda, but there are some key limits we should never forget when deciding to use it:

  1. Execution timeout. A Lambda function has a hard limit of 15 minutes. If the processing we want to implement will take longer, we need another solution.
  2. Memory allocation. A Lambda function can use at most 10,240 MB, which may be too little in some scenarios.
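Those two limits give a simple rule of thumb for choosing between Lambda and Batch, sketched below (a pure-Python helper; the limit values are the ones listed above):

```python
LAMBDA_MAX_SECONDS = 15 * 60      # Lambda's hard execution-time limit
LAMBDA_MAX_MEMORY_MB = 10240      # Lambda's maximum memory allocation

def fits_in_lambda(est_seconds, est_memory_mb):
    """Rule-of-thumb check: can this workload run within Lambda's limits?"""
    return est_seconds < LAMBDA_MAX_SECONDS and est_memory_mb <= LAMBDA_MAX_MEMORY_MB

print(fits_in_lambda(10 * 60, 2048))  # a short job: Lambda is fine
print(fits_in_lambda(45 * 60, 2048))  # a 45-minute job: use AWS Batch
```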

 

Considering the above, if your work might take more than 15 minutes or needs more memory (and most backend batch process jobs do run longer than 15 minutes), you should go with AWS Batch.

 

Now, a quick overview of the options available on the AWS Batch console:

  • Dashboard: monitor all job runs and their statuses.
  • Jobs: submit a job to run; you select a job definition and it starts a new job instance.
  • Job definitions: define your job here, i.e. which Docker image will run and the IAM role required to perform the job's tasks.
  • Job queues: configure how multiple jobs are processed according to the priority you set. While creating a queue, map it to a compute environment you created, so queued jobs run in that environment.
  • Compute environments: set up the environment that will run your job, such as EC2 instances or Fargate.

 

Keep in mind that setup goes bottom-up: first create the "Compute environment", then "Job queues", then "Job definitions", then "Jobs".
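The bottom-up order exists because each resource references the one created before it. The sketch below builds the four request payloads as plain dicts, using this demo's names; in real code each dict would feed the corresponding boto3 Batch client call (`create_compute_environment`, `create_job_queue`, `register_job_definition`, `submit_job`):

```python
# 1. Compute environment: no dependencies.
compute_env = {"computeEnvironmentName": "aws-batch-job-demo1-ec2"}

# 2. Job queue: references the compute environment by name.
job_queue = {
    "jobQueueName": "aws-batch-job-demo1-job-queue-ec2",
    "priority": 1,
    "computeEnvironmentOrder": [
        {"order": 1, "computeEnvironment": compute_env["computeEnvironmentName"]}
    ],
}

# 3. Job definition: references the Docker image, not the queue.
job_definition = {
    "jobDefinitionName": "aws-batch-job-demo1-job-definition-ec2",
    "type": "container",
    "containerProperties": {
        "image": "sarajeevraj/aws-batch-job-demo1",
        "command": ["python", "/src/bulkload.py"],
    },
}

# 4. Job submission: references both the queue and the job definition,
# which is why it has to come last.
job = {
    "jobName": "aws-batch-job-demo1-job-ec2",
    "jobQueue": job_queue["jobQueueName"],
    "jobDefinition": job_definition["jobDefinitionName"],
}
```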

 

One more note: with AWS Batch your code runs much like in a Lambda function, except it runs inside a Docker container. Let's see how in the demo.

 

Now it's time for the demo:

For this demo I created a Python file that simply inserts 5 records into DynamoDB whenever it runs, and we will run this code through a Batch job.

 

Python code sample (bulkload.py):

(Reconstructed from the screenshot; the region and the exact item attributes were not fully legible, so adjust them to your setup.)

```python
import boto3

session = boto3.session.Session(region_name="us-east-1")  # region not legible in the original; use your own
dynamodb = session.resource("dynamodb")
table = dynamodb.Table("AWSBatchJobTestTable")

name = "Rajeev"
i = 0
while i < 5:
    print(i)
    # "Id" is the table's key (created in advance, see below);
    # "Name" and "filler" are the demo's payload attributes.
    table.put_item(Item={"Id": str(i), "Name": name, "filler": name})
    i += 1
```

 

Next I created a Dockerfile and provided the command to run this Python code file:

```dockerfile
FROM python
RUN pip install boto3
RUN mkdir /src
COPY bulkload.py /src/bulkload.py
CMD ["python", "/src/bulkload.py"]
```

 

Just to note, so that you do not face any Docker command execution errors: both of the above files sit in the same folder, named "aws-batch-job-demo1", on my system:

[Screenshot: Windows Explorer showing the folder F:\POCs\aws-batch-job-demo1 containing bulkload.py and Dockerfile]

 

Now, open CMD to build the Docker image and push it to Docker Hub.

Make sure Docker is installed and running on your system so you can run docker commands, and that you have a Docker Hub account with a repository, so we can push the locally built Docker image to Docker Hub.

You can download Docker for Windows here: https://docs.docker.com/docker-for-windows/install/

Docker hub url: https://hub.docker.com/

 

Use "cd F:\POCs\aws-batch-job-demo1" to move into the Dockerfile's location.

Run "docker build -t aws-batch-job-demo1 ." to build the Docker image. Here "aws-batch-job-demo1" is the name of the image I am creating.

```
F:\POCs\aws-batch-job-demo1>docker build -t aws-batch-job-demo1 .
[+] Building 8.9s (18/18) FINISHED
```

 

Run "docker images" to see all the built images available in your local Docker.

```
F:\POCs\aws-batch-job-demo1>docker images
REPOSITORY            TAG      IMAGE ID       CREATED          SIZE
aws-batch-job-demo1   latest   1db2ea5632f6   19 seconds ago   962MB
```

 

Run "docker login --username <your-docker-hub-username>" to log in to Docker Hub; once you enter it, you will be asked for your password.

 

Run "docker tag 1db2ea5632f6 sarajeevraj/aws-batch-job-demo1:latest" to tag your Docker image. Here "1db2ea5632f6" is the image ID, which you can pick from the "docker images" output; "sarajeevraj" is my Docker Hub repository; and "aws-batch-job-demo1:latest" is the image name with ":latest" as the tag.

 

Run "docker push sarajeevraj/aws-batch-job-demo1" to push the image to Docker Hub. On a successful push, the image will be available in your Docker Hub repository; refer to the screenshot below:

[Screenshot: hub.docker.com/repositories showing the repository sarajeevraj/aws-batch-job-demo1, updated 2 minutes ago]

 

Now, log in to your AWS console, go to the AWS Batch service, and get started:

 

Create the "Compute environment"; this is where I set up the instances that will run my job. For this demo I used EC2 instances with the default VPC and gateway:

While creating this compute environment, AWS either creates a new IAM role for you or you create one yourself and select it here. By default this role is created with the ECS access permissions, but you will probably need to add other permissions as well, depending on your job's code. In my case the job writes data to DynamoDB, so the instance role also needed the DynamoDB PutItem permission.

[Screenshot: Create compute environment — type "Managed", name "aws-batch-job-demo1", plus additional settings for the service role, instance role, and EC2 key pair]

 

[Screenshot: compute environment provisioning — On-demand vs Spot EC2 capacity, minimum/maximum/desired vCPUs, and allowed instance types set to "optimal"]

 

Now create the "Job queue": provide the priority of the job and map the compute environment you created above:

[Screenshot: Create job queue — priority, optional tags, and the connected compute environment "aws-batch-job-demo1-ec2"; all connected compute environments must be of the same type, either EC2/EC2 Spot or Fargate/Fargate Spot]

 

Now create the "Job definition". Here you mainly specify your Docker image and the command that starts your job. For the container image property I provide my Docker Hub image, i.e. "my-docker-hub-repository-name/my-docker-image". For the command I provide the same command as in my Dockerfile; the full command is not mandatory here, but at minimum you have to provide the interpreter, i.e. "python".

While creating the job definition, it will ask for an IAM role. That role must be of the ECS task type, and it must carry all the other permissions the job needs to do its work: in this case the CloudWatch Logs write permission and the DynamoDB PutItem permission. Attach them all via a policy and use that role here.
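An IAM policy along these lines would cover the two permissions mentioned above (a sketch, not the exact policy from the demo; tighten the log `Resource` ARN to your own log group rather than `*`):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["dynamodb:PutItem"],
      "Resource": "arn:aws:dynamodb:*:*:table/AWSBatchJobTestTable"
    },
    {
      "Effect": "Allow",
      "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
      "Resource": "*"
    }
  ]
}
```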

[Screenshot: Create job definition — name "aws-batch-job-demo1-job-definition-ec2", platform EC2 or Fargate]

 

[Screenshot: container properties — image "sarajeevraj/aws-batch-job-demo1", command "python /src/bulkload.py", shown converted to JSON as ["python", "/src/bulkload.py"]]

 

Now, create a Job; this is where I submit (manually trigger) my job to run. While creating it, select the job definition and job queue you created above:

[Screenshot: Submit new job — name "aws-batch-job-demo1-job-ec2", job definition "aws-batch-job-demo1-job-definition-ec2", job queue "aws-batch-job-demo1-job-queue-ec2", and an optional execution timeout in seconds]

 

Now you can go to the dashboard and track progress from there. Each status shows a count; click it and you are redirected to the corresponding page for details. For example, on a failure you can click the FAILED count and it takes you to the job page, where you can see the status reason along with the CloudWatch log data:
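The statuses you see on the dashboard follow a fixed lifecycle, which this small sketch encodes (pure Python; the state names are AWS Batch's own):

```python
# AWS Batch job states, in the order a successful job moves through them,
# before reaching one of the two terminal states.
JOB_LIFECYCLE = ["SUBMITTED", "PENDING", "RUNNABLE", "STARTING", "RUNNING"]
TERMINAL_STATES = {"SUCCEEDED", "FAILED"}

def is_terminal(status):
    """A job in SUCCEEDED or FAILED will not change state again."""
    return status in TERMINAL_STATES

print(is_terminal("RUNNING"))    # still in flight, keep polling
print(is_terminal("SUCCEEDED"))  # done, safe to read the results
```

A job stuck in RUNNABLE usually means the queue's compute environment cannot provide the requested vCPUs or memory, which is the first thing worth checking when a job never starts.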

[Screenshot: AWS Batch dashboard — jobs overview by status (SUBMITTED, PENDING, RUNNABLE, STARTING, RUNNING, SUCCEEDED, FAILED), the job queue "aws-batch-job-demo1-job-queue-ec2", and the compute environment "aws-batch-job-demo1-ec2"; auto-refreshes every 60 seconds]

 

[Screenshot: job detail page for "aws-batch-job-demo1-job-ec2" — final status SUCCEEDED, the full status timeline from SUBMITTED through RUNNING to SUCCEEDED, the CloudWatch log stream name, start/stop timestamps, and the status reason "Essential container in task exited"]

 

After the job succeeds, you can see the data inserted into my DynamoDB table.

Note: I had created the DynamoDB table "AWSBatchJobTestTable" with an Id field in advance, before this job ran, so the job just inserted records into the existing table:

[Screenshot: DynamoDB console — scan of table "AWSBatchJobTestTable" showing the five inserted items, each with Name and filler set to "Rajeev"]

 

Categories/Tags: AWS Batch, AWS Batch Job
