AWS re:Invent 2017

LFS309 - High Throughput Genomics on AWS


00. Prerequisites

01. Containerizing applications and AWS Batch

02. Encoding genomics workflows

03. Summary and clean-up

These are the materials for the AWS re:Invent 2017 workshop, LFS309 - High Throughput Genomics on AWS.

During the hands-on sessions, workshop attendees will build the following typical workflow for genomics analysis.

[Figure: Typical genomics workflow]

Prerequisites for the Workshop

Be sure to complete the prerequisites before attending the workshop.

For this workshop you will need:

  1. A valid AWS account
  2. Access to the Oregon region (us-west-2)
  3. Administrative rights to configure and create resources in the following services:
| Service | Use? | Admin? | Description |
|---------|------|--------|-------------|
| Amazon Virtual Private Cloud (VPC) | Yes | Maybe | All compute resources will launch in one of your VPCs in the us-west-2 region. You can either create a new VPC specifically for this workshop (recommended) or use one of your existing VPCs. |
| AWS CloudFormation | Yes | No | Used to execute CloudFormation templates that create the resources in the other AWS services. |
| AWS Identity and Access Management (IAM) | Yes | Yes | IAM roles will be created and used within the other services, such as Amazon EC2, AWS Batch, and AWS Lambda. |
| Amazon Simple Storage Service (S3) | Yes | Yes | A bucket will be created for the output of results. |
| AWS Batch | Yes | Yes | A new AWS Batch environment is created during this workshop. If you already have a Batch environment you can use it, but this will not be supported during the workshop. |
| AWS Lambda | Yes | Yes | Lambda functions will be created and executed within the workshop. |
| AWS Step Functions | Yes | Yes | Step Functions will be created and executed during the workshop. |
| Amazon Elastic Compute Cloud (EC2) | Yes | Yes | Instances will be launched by AWS Batch. |
| Amazon EC2 Container Service (ECS) | Yes | Yes | AWS Batch relies on ECS to distribute the Docker containers on the instantiated EC2 instances. |
| Amazon EC2 Container Registry (ECR) | Yes | Yes | We will create an ECR repository for hosting the Docker containers in this workshop. |

If you do not have administrative access to the above services, please pair up with a table mate who does in order to complete the hands-on labs.

Slides

A link to the slides will be shared at the workshop.

Hands-on Labs

You will have the opportunity to implement a genomics workflow across three hands-on labs. These are:

Lab 1: Application containerization and setting up AWS Batch

The first lab consists of creating a Docker container for an application and setting up the AWS Batch environment that will be used to perform the individual units of work. You will also run an example task on data to test the environment.
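To make the "example task" step more concrete, here is a minimal Python sketch of how a job could be submitted to AWS Batch with boto3. It is an illustration only, not the workshop's own lab code: the job name, queue name, job definition name, command, and the `SAMPLE_ID` variable are all hypothetical placeholders, and the helper names `build_submit_job_request` and `submit` are invented for this example. The sketch assumes boto3 is installed and AWS credentials are configured.

```python
def build_submit_job_request(job_name, job_queue, job_definition,
                             command, environment=None):
    """Assemble keyword arguments for the boto3 Batch `submit_job` call.

    All values passed in are placeholders chosen by the caller; nothing
    here is taken from the workshop's actual lab configuration.
    """
    overrides = {"command": list(command)}
    if environment:
        # Batch expects environment variables as a list of name/value pairs.
        overrides["environment"] = [
            {"name": key, "value": str(value)}
            for key, value in sorted(environment.items())
        ]
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "containerOverrides": overrides,
    }


def submit(request, region="us-west-2"):
    """Submit the job and return its job ID (requires AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    batch = boto3.client("batch", region_name=region)
    return batch.submit_job(**request)["jobId"]


if __name__ == "__main__":
    # Hypothetical names; replace them with the resources you create in the lab.
    request = build_submit_job_request(
        job_name="example-task",
        job_queue="workshop-queue",
        job_definition="workshop-job-definition",
        command=["echo", "hello from AWS Batch"],
        environment={"SAMPLE_ID": "sample-001"},
    )
    print(request)
    # job_id = submit(request)  # uncomment once the Batch environment exists
```

Keeping the request builder separate from the network call makes it possible to inspect the parameters before any AWS resources exist.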

Lab 2: Creating and executing a genomics workflow

The second lab builds on the first by creating the AWS Lambda functions and AWS Step Functions state machine necessary to build out the full environment.
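As a rough illustration of how Lambda functions and Step Functions can fit together, the following Python sketch builds an Amazon States Language definition for a generic "submit, wait, poll" loop around a single Batch job. This is an assumed pattern, not the workshop's actual state machine: the state names, the `$.jobId` and `$.status` paths, the 30-second wait, and both Lambda ARNs are hypothetical, and the two Lambda functions are assumed to submit a job and return its status string respectively.

```python
import json


def build_state_machine(submit_lambda_arn, status_lambda_arn, wait_seconds=30):
    """Return a state machine definition (as a dict) that polls one job.

    Assumes one Lambda function submits the Batch job and a second one
    returns its status as a string such as "SUCCEEDED" or "FAILED".
    """
    return {
        "Comment": "Sketch: submit a Batch job and poll until it finishes",
        "StartAt": "SubmitJob",
        "States": {
            "SubmitJob": {
                "Type": "Task",
                "Resource": submit_lambda_arn,
                "ResultPath": "$.jobId",
                "Next": "WaitForJob",
            },
            "WaitForJob": {
                "Type": "Wait",
                "Seconds": wait_seconds,
                "Next": "GetJobStatus",
            },
            "GetJobStatus": {
                "Type": "Task",
                "Resource": status_lambda_arn,
                "ResultPath": "$.status",
                "Next": "CheckJobStatus",
            },
            "CheckJobStatus": {
                "Type": "Choice",
                "Choices": [
                    {"Variable": "$.status", "StringEquals": "SUCCEEDED",
                     "Next": "JobSucceeded"},
                    {"Variable": "$.status", "StringEquals": "FAILED",
                     "Next": "JobFailed"},
                ],
                # Any other status means the job is still in progress.
                "Default": "WaitForJob",
            },
            "JobSucceeded": {"Type": "Succeed"},
            "JobFailed": {
                "Type": "Fail",
                "Error": "BatchJobFailed",
                "Cause": "The AWS Batch job reported FAILED",
            },
        },
    }


if __name__ == "__main__":
    # Placeholder ARNs; substitute the functions you create in the lab.
    definition = build_state_machine(
        "arn:aws:lambda:us-west-2:123456789012:function:submit-batch-job",
        "arn:aws:lambda:us-west-2:123456789012:function:get-batch-job-status",
    )
    print(json.dumps(definition, indent=2))
```

The printed JSON can be pasted into the Step Functions console when creating a state machine. A full genomics workflow would chain several such job steps, one per analysis stage.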

Lab 3: Summary and Cleanup

The last exercise summarizes the previous two labs and gives instructions for tearing down the AWS environment you created so that you do not incur ongoing charges.