Getting Started with AWS Linux for Deep Learning

This article is broken up into two sections. First I go over some of the AWS web interface/dashboard menus. In the second section we head into the terminal, ssh into our EC2 instance, and get it up and running. I assume you know how to use conda to activate a virtual environment and have used at least one deep learning framework; I will not be covering either of those in this article.

Amazon pioneered the IaaS (infrastructure as a service) business with AWS (Amazon Web Services). AWS is highly entrenched in the day-to-day work of many data scientists and machine learning engineers. Knowing your way around the AWS web interface, as well as how to set up an EC2 instance, is a valuable skill, and much of this setup knowledge transfers to other IaaS providers such as Google Cloud. Here I attempt to pull back the curtain and walk through a simple starter workflow to get a basic deep learning machine up and running. Sound good? Ok, let’s jump in!

First, if you do not already have an AWS account, now is the time to sign up for one. Luckily Amazon provides free EC2/S3 resources when you sign up to help you get acquainted with their cloud platform. You will not be able to perform any GPU deep learning for free, but you can set up a free EC2 instance to practice and play around. AWS charges by the second, and you don’t want to waste time once you are paying for GPU compute.

Setting Up your S3 Bucket

Sign into the AWS console and navigate to S3. The following images will show you how to create a bucket, add a folder and upload some data.

Create a bucket by clicking the blue button
Give it a name, I kept the default region, click next
Your default permissions should look something like mine. Click next and then create the bucket
You will see your bucket now in S3. Click on the bucket name
Create a folder, click the folder to enter it, and then click upload
I dragged in some XML data from my desktop and clicked upload
Awesome! We now have some data in the cloud in our S3 bucket.

Generating your .pem Key File

Now that you have some data in S3, it’s time to navigate back to the AWS main page and look for EC2 by searching for ‘EC2’ or clicking on the link.

Before we start getting into the weeds you need to create a secret key file (.pem file) that you will link with the EC2 instance you create. This key file allows you to ssh into your EC2 instance.

Look at the left hand menu where it says Network & Security. Click on Key Pairs and then on Create Key Pair. Follow the steps below.

  1. Enter a name for the new key and click Create.
  2. The key file (.pem file) will download and you will need to save it in a secure place. I put it in my home directory: ~/.secret/my_key.pem
  3. You cannot download this key again. If you lose it or delete it, you will not be able to log into the EC2 instance associated with this .pem file. NOTE! Never commit/push your .pem file to GitHub…
  4. If you plan to SSH into the EC2 via MacOS or Linux you will need to set the permissions of the .pem file to be read only by the owner. Navigate to the directory where the .pem file is located and run:
$ chmod 400 ~/.secret/my_key.pem  # gives the owner (you) read-only access
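If you want to double-check that the permissions took, ls -l on the key should now show it as readable by the owner only (the path below assumes the location I used above; yours may differ):

$ ls -l ~/.secret/my_key.pem   # permissions should read -r--------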

Creating and Launching your EC2 Instance

Great. Now that your .pem file is securely stored you are ready to create an EC2 instance. Bring back the EC2 Dashboard and click on the blue button to launch an instance:

EC2 Dashboard

Step 1.

Now you will be presented with many options for selecting an AMI (Amazon Machine Image). An AMI is like a template virtual computer. It contains an operating system, applications, and other necessary software. It also includes permissions and security rules/roles, as well as the ability to map additional devices for storage (e.g. EBS volumes, which act somewhat like an additional hard drive you might plug into the SATA port of your physical computer if you wanted more space).

As this article is geared towards setting up an EC2 for deep learning, I searched for “deep learning” under the community AMIs. You can’t go wrong with either the Ubuntu or the Amazon Linux flavor of the Deep Learning AMI (version 10.0 at the time of writing).

  • Amazon Linux is maintained/built by AWS, so it probably has better all-around support and comes with a lot of preinstalled AWS management tools. Big bonus!
  • Ubuntu (based on Debian Linux) has a richer package/software repository and a larger user base. Depending on what specific software you need, Ubuntu could be the better choice.

Once you select an AMI (step 1) you will be able to choose the hardware.

List of Deep Learning AMIs

Step 2.

If your goal is just to test out AWS, you can start with the free instance type: t2.micro. I will continue with GPU-ready hardware to make use of the deep learning AMI we chose previously. The simplest option (p2.xlarge) has an Nvidia K80 and is priced around $0.90 per hour; at that rate, a 24-hour training run comes to roughly $21.60. This is a really powerful and fast machine! The other options start to climb in $$$ and include multiple GPUs, allowing you to parallelize your deep learning models. Fancy! Amazon pricing has been “sort of mystical”, but they do provide an online calculator to help you get an idea of how much you might be looking to spend.

Choosing your EC2 Compute Specs

Step 3.

The next step is to configure general info about your instance. For now you shouldn’t need to mess around with these settings. The important ones to keep an eye on are:

  • subnet: Make sure you launch in the same region where your S3 bucket lives (S3 buckets are regional), and pick a subnet/availability zone within that region.
  • Request spot instance pricing can save you money on EC2s. You are essentially bidding on spare EC2 capacity, and if you are outbid your machine can be pulled offline temporarily, so this can become a gamble. Remember to checkpoint your models… Matthew Powers has a nice write-up on spot pricing (though it is a few years old now).
  • Finally, make sure your shutdown behavior is set correctly. I have mine set to stop instead of terminate. As much as I enjoyed Terminator I & II, I do not want my EC2 terminated… unless I am doing the terminating. >:D

Step 4.

Here you can increase the storage of your root drive, as well as add additional EBS volumes. I typically add another EBS drive for data and so on, because you can detach these volumes and reattach/mount them to another EC2 instance. Very useful! General Purpose SSD is a perfectly reasonable choice for now.

Adding Additional Volumes

Step 5.

Not pictured, but in this step you can simply add tags to your EC2 instance, as well as your volumes. The AWS documentation has a great write-up on tagging strategy. For now it’s really of minimal concern, but be sure to give the AWS docs a look over!

Step 6.

Security. Security. Security. First and foremost, I will reiterate this: do not make your .pem file publicly available (GitHub etc.), nor your secret access keys… In this section you set the access rules for your EC2. You can restrict access to your IP address, or leave it open so you can easily log in from anywhere. I typically VPN into my home network, so that is an option as well. For now I say keep it simple; you want to lower as many barriers as possible when first starting out. But for production-level environments and real work, you will want to look into setting up your EC2 properly/securely.

Security Rules for your EC2

Step 7.

Finally you will need to choose a key (.pem file) to associate with this EC2 instance. This key is required to ssh into the instance.

Click Launch Instance and congratulations, you have an AWS EC2 instance starting up. While you wait, navigate to the directory where your .pem file is located. Then, once your instance is running (green light), click the connect button.

Ready to use EC2

This opens up a modal where you can copy paste the ssh command that will allow you to log into the EC2.

Run the ssh command from the connect screen. This command is unique to each EC2 instance and will look slightly different from the one shown above. Once you run the command you may be asked yes/no; type “yes” and press enter.
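It will look something like the sketch below; the key path and public DNS name are placeholders, and the user is ubuntu for Ubuntu AMIs or ec2-user for Amazon Linux:

$ ssh -i ~/.secret/my_key.pem ubuntu@ec2-xx-xx-xx-xx.compute-1.amazonaws.com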

Boom. You should be in! Congratulations.
Now that you are in your EC2 you are the master of your domain.

Setting up your EC2 Instance

First you should update and upgrade your EC2 software. Depending on whether you are using Amazon Linux or Ubuntu, this will be different (e.g. apt vs yum). The Amazon docs have a good write-up on performing this on their Linux AMI. If you are using Ubuntu you can do the following:

$ sudo apt update && sudo apt upgrade
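On Amazon Linux, a minimal sketch of the equivalent (using yum, the package manager on that AMI) would be:

$ sudo yum update -y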

Now that you are in, you should be able to run lsblk to list the block level devices:

$ lsblk
    NAME    MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    xvda    202:0    0   8G  0 disk
    └─xvda1 202:1    0   8G  0 part / #root volume (OS is here)
    xvdb    202:16   0  10G  0 disk   #Additional EBS volume
$ sudo file -s /dev/xvdb #Your vol might have different lettering!
    /dev/xvdb: data

Now you need to create a filesystem and mount the extra EBS drive you attached to the EC2 when you launched it.

# ----> WARNING if you run this command on a drive with data already on it, the data will be wiped...
# create an ext4 filesystem on the EBS drive
$ sudo mkfs -t ext4 /dev/xvdb
    mke2fs 1.42.12 (29-Aug-2014)
    Creating filesystem with 2621440 4k blocks and 655360 inodes
    Filesystem UUID: 32dc04b7-5788-4796-95c5-ebcaa3ab4757
    Superblock backups stored on blocks:
     32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632
    Allocating group tables: done
    Writing inode tables: done
    Creating journal (32768 blocks): done
    Writing superblocks and filesystem accounting information: done
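If you run the file command again, the device should now report an ext4 filesystem rather than plain data (your UUID and the exact wording will differ):

$ sudo file -s /dev/xvdb
    /dev/xvdb: Linux rev 1.0 ext4 filesystem data, UUID=32dc04b7-5788-4796-95c5-ebcaa3ab4757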

Now it is time to set the mount point for your EBS volume.

# Create a mount point, basically an empty directory
$ sudo mkdir /<name-of-your-dir> #I will call mine /test-ebs
# Mount the new volume!
$ sudo mount /dev/xvdb /<name-of-your-dir>
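To confirm the mount took, df -h should now show the device at your mount point (I use /test-ebs below to match the directory name I chose above):

$ df -h /test-ebs   # should list /dev/xvdb mounted on /test-ebs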

Now we are going to make some changes to your fstab file so the EBS will be auto mounted if you happen to shut down/restart this EC2:

# back up your fstab file....
$ sudo cp /etc/fstab /etc/fstab.bak
# Open fstab and make an entry (edit /etc/fstab itself, not the backup)
$ sudo nano /etc/fstab
# Add the following to the bottom of your fstab file
  /dev/xvdb /test-ebs ext4 defaults,nofail 0 0
$ cat /etc/fstab
    #
    LABEL=/     /           ext4    defaults,noatime  1   1
    tmpfs       /dev/shm    tmpfs   defaults        0   0
    devpts      /dev/pts    devpts  gid=5,mode=620  0   0
    sysfs       /sys        sysfs   defaults        0   0
    proc        /proc       proc    defaults        0   0
    /dev/xvdb   /test-ebs   ext4    defaults,nofail 0   0
# Should see nothing from this command, means fstab changes are good. mount -a causes all filesystems in fstab to be mounted as indicated.
$ sudo mount -a

We also want to change the owner of your EBS drive from root to your user.

# find out your user name, typically ec2-user or ubuntu
$ whoami
$ sudo chown -R <username>:<username> /<name-of-your-dir>

You can now set up your EBS volumes as you like. For a project, I typically need to git clone a repo and install PostgreSQL or some other database software, along with other tools like Apache Spark and so on. Now would be a good time to check your conda environments:

# should see something similar to this...
$ conda env list
    # conda environments:
    #
    base                  *  /home/ubuntu/miniconda3
    py2                      /home/ubuntu/miniconda3/envs/py2
    py3                      /home/ubuntu/miniconda3/envs/py3
    pytorch                  /home/ubuntu/miniconda3/envs/pytorch
    ...

If you are using a deep learning AMI you should see several environments, one for each framework (e.g. TensorFlow, Caffe2, PyTorch, and so on). From here I typically git clone the repositories I need and install any additional software through conda or apt.


You can also run this command to make sure the Nvidia driver sees your GPU (your deep learning frameworks will then access it via CUDA and cuDNN):

$ nvidia-smi
Output of nvidia-smi
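As an extra sanity check, you can activate one of the framework environments and ask it whether it can see the GPU. A minimal sketch using the pytorch environment listed above (environment names vary by AMI):

$ source activate pytorch
$ python -c "import torch; print(torch.cuda.is_available())"   # should print True on a GPU instance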

Sending Data Between EC2 and S3

Finally, we are going to make sure you can send files between EC2 and S3. First we will set up the AWS CLI (Command Line Interface). You need to have your AWS Access Key ID and AWS Secret Access Key ready. These can be generated from the security dashboard (image below), or they are provided for you if your account is managed for you or you are operating under an IAM role (IAM roles are a large/important topic for another article; the AWS docs are a good place to read more about them).

Also please do not put these keys in your github repo and push them. You wouldn’t believe the number of stories I have heard…

Creating your Access Keys, press the blue button…
# you might need to install the aws cli depending on the AMI you choose. 
$ pip install awscli --upgrade --user
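After the install, confirm the CLI is reachable; with a --user install the scripts land in ~/.local/bin, which may not be on your PATH yet:

# if the shell cannot find aws, add ~/.local/bin to your PATH first
$ export PATH=~/.local/bin:$PATH
$ aws --version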

For more info on installing the AWS CLI, check out the relevant AWS docs. The AWS CLI has a lot of commands very similar to Linux ones, e.g. cp, sync, and so on. It is a very useful tool and I highly suggest you read up on it and learn how to use it; it is a one-stop shop for interacting with many AWS services. If you already have the AWS CLI you can continue. You need to add those secret keys onto your EC2 instance in the aws configure step:

# run this as your normal user (no sudo) so the credentials are saved to your ~/.aws
$ aws configure
AWS Access Key ID [None]: ENTER-IT
AWS Secret Access Key [None]: ENTER-IT
Default region name [None]: # press return key
Default output format [None]: # press return key
# Make sure these got saved....
$ aws configure list
        Name                    Value             Type    Location
       ----                    -----             ----    --------
    profile                <not set>             None    None
 access_key     ****************AAAA shared-credentials-file
 secret_key     ****************AAAA shared-credentials-file
     region                <not set>             None    None
# Now you should see your AWS bucket(s)...
$ aws s3 ls
    2018-01-04 22:05:57 my_bucket_1
    2018-01-04 22:07:26 my_bucket_2
    2018-01-04 22:06:21 my_bucket_3
# Let's test copying a file from S3 to our mounted EBS drive
$ aws s3 cp s3://my_bucket/LUNA16/csv-files/strings.xml /test-ebs/test-transfer/
# Check your EC2 volume to make sure the file is in there.
# If all is working then you are good to go!
# If you want to transfer over more than one file you should use sync
$ aws s3 sync s3://my_bucket/LUNA16/csv-files/ /test-ebs/test-transfer/

Awesome! Now you can transfer data from S3 to your EC2 instance and vice versa. With these tools you are ready to begin training a deep learning model in the cloud with AWS.
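The reverse direction uses the same syntax, local path first and bucket path second. For example, to push a trained model back up to your bucket (the file name and destination here are just placeholders):

$ aws s3 cp /test-ebs/models/my_model.h5 s3://my_bucket/models/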

Other software you should look into is a terminal multiplexer. I recommend Tmux. When you start tmux, it creates an interactive session on your EC2. You can attach and detach from the session as well, and it will still be there. Make sure to begin training your model while attached to a session. Why do this? Well if you don’t have an interactive session eventually your ssh pipe will break, you will be kicked out, and your model will fail to finish training. This can be a costly mistake.
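A minimal tmux workflow on the EC2 looks something like this (the session name is just an example):

# start a named session and launch your training run inside it
$ tmux new -s training
# detach with Ctrl-b then d; the session (and your training) keeps running
# reattach later, e.g. after a fresh ssh login
$ tmux attach -t training
# list running sessions if you forget the name
$ tmux ls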

Now you can begin training while in your favorite coffee shop, then head home and jump right back in!

Tmux is easy to install on Linux and pretty straightforward to use once you get the hang of it… like anything, I suppose. Alex Shnayder wrote a good intro article you should definitely check out!

One last piece of advice, and this will hopefully save you from a costly $$$ mistake: if you do not manually shut down your EC2 instance it will run and run and run. Like this little guy:

these commercials seemed like forever ago!

Amazon will gladly charge you for it, too. Don’t let that happen to you; you work hard for your money. One team of students from my master’s program ‘accidentally’ burned through $2000 because of this mistake. You can shut down (not terminate) your EC2 easily from the EC2 dashboard.
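If you prefer the command line, the AWS CLI can stop an instance as well; the instance ID below is a placeholder for the one shown on your dashboard:

$ aws ec2 stop-instances --instance-ids i-0123456789abcdef0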

Ok. I think we got through plenty. Hopefully this was helpful. Thanks for sticking with me!

Cheers,

Kyle