AWS: Launch Your First EC2 Instance and Learn Basic S3 Operations

Priyanka Biswas
12 min read · Mar 24, 2019

AMAZON WEB SERVICES (AWS)

SUMMARY: This document is a reference guide that answers the most commonly asked questions: how to create an AWS account, how to launch an EC2 instance and connect to it using PuTTY, and how to perform basic S3 operations.

Writer: Priyanka Biswas

Amazon Web Services (AWS) is a subsidiary of Amazon.com that provides on-demand cloud computing platforms. These include Amazon Elastic Compute Cloud, also known as “EC2”, and Amazon Simple Storage Service, also known as “S3”. As of 2017, AWS offered more than 100 services spanning compute, storage, networking, database, analytics, application services, deployment, management, mobile, developer tools, and tools for the Internet of Things. Amazon markets AWS as a way to obtain large computing capacity more quickly and cheaply than a client company could by building its own physical servers.

Follow the login instructions. On the payment page you have to enter your credit card information. Rs 2 will be charged to your card and refunded within 5 days.

Check the card details in the My Account section.

If you see a message like this, you have not completed the registration process. If needed, contact Amazon support: click Contact support and write your query.

LET’S GET STARTED:

The account is created and we are ready to launch an EC2 instance in the cloud.

WHAT IS AMAZON EC2?

Amazon Elastic Compute Cloud (Amazon EC2) provides scalable computing capacity in the Amazon Web Services (AWS) cloud. Using Amazon EC2 eliminates your need to invest in hardware up front, so you can develop and deploy applications faster. You can use Amazon EC2 to launch as many or as few virtual servers as you need, configure security and networking, and manage storage. Amazon EC2 enables you to scale up or down to handle changes in requirements or spikes in popularity, reducing your need to forecast traffic.

At the top left, click Services > Compute > EC2.

This takes you to another page; there, click Launch Instance.

STEP 1: CHOOSE AN AMAZON MACHINE IMAGE (AMI)

Launch Instance takes you to a page like this. Here we have to select the AMI we need. Several AMIs are available in the AWS Marketplace. But what is an AMI?

AMAZON MACHINE IMAGES (AMI)

An Amazon Machine Image (AMI) provides the information required to launch an instance, which is a virtual server in the cloud. You specify an AMI when you launch an instance, and you can launch as many instances from the AMI as you need. You can also launch instances from as many different AMIs as you need.

An AMI includes the following:

  • A template for the root volume for the instance (for example, an operating system, an application server, and applications)
  • Launch permissions that control which AWS accounts can use the AMI to launch instances
  • A block device mapping that specifies the volumes to attach to the instance when it’s launched.

Here I am selecting Amazon Linux AMI 2017.03.0 (HVM), SSD Volume Type — ami-c58c1dd3 which is an EBS-backed, AWS-supported image. The default image includes AWS command line tools, Python, Ruby, Perl, and Java. The repositories include Docker, PHP, MySQL, PostgreSQL, and other packages.

To avoid charges, launch only instances that are marked Free tier eligible.

STEP 2: CHOOSE AN INSTANCE TYPE

Amazon EC2 provides a wide selection of instance types optimized to fit different use cases. Instances are virtual servers that can run applications. They have varying combinations of CPU, memory, storage, and networking capacity, and give you the flexibility to choose the appropriate mix of resources for your applications.

Select t2.micro, which is free tier eligible, and click Next: Configure Instance Details.

STEP 3: CONFIGURE INSTANCE DETAILS

Change the field values according to your requirements, or keep the defaults, and click Next: Add Storage.

STEP 4: ADD STORAGE

Your instance will be launched with the following storage device settings. You can attach additional EBS volumes and instance store volumes to your instance, or edit the settings of the root volume.

By default, 8 GB of storage is provided. Click Next: Add Tags.

STEP 5: ADD TAGS

The instance currently has no tags. Add a key-value pair if you want one, then click Next: Configure Security Group.

STEP 6: CONFIGURE SECURITY GROUP

A security group is a set of firewall rules that control the traffic for your instance. On this page, you can add rules to allow specific traffic to reach your instance. For example, if you want to set up a web server and allow Internet traffic to reach your instance, add rules that allow unrestricted access to the HTTP and HTTPS ports.
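The same kind of rules can also be added from the AWS CLI with aws ec2 authorize-security-group-ingress. Below is a minimal sketch that only prints the commands instead of running them. The security group ID and the address 203.0.113.10/32 (standing in for your own IP address) are placeholders I made up for illustration.

```shell
# Print (not run) the AWS CLI command that opens one TCP port on a security group.
# The group ID and CIDR ranges used below are placeholders, not real values.
ingress_cmd() {
  group_id=$1; port=$2; cidr=$3
  printf 'aws ec2 authorize-security-group-ingress --group-id %s --protocol tcp --port %s --cidr %s\n' \
    "$group_id" "$port" "$cidr"
}

ingress_cmd sg-0123456789abcdef0 22 203.0.113.10/32   # SSH from a single address only
ingress_cmd sg-0123456789abcdef0 80 0.0.0.0/0         # HTTP from anywhere
ingress_cmd sg-0123456789abcdef0 443 0.0.0.0/0        # HTTPS from anywhere
```

Restricting port 22 to one address while leaving 80 and 443 open matches the web server example above: the site is public, but only you can log in.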

Click Review and Launch.

STEP 7: REVIEW INSTANCE LAUNCH

Review your AMI details, instance type, security groups, instance details, storage, and tags. You can go back and update anything that needs changing, or click Launch. A pop-up window appears.

For security, we have to create a key pair. From the dropdown, select Create a new key pair.

Give the key pair a name and click Download Key Pair, then click Launch Instances.

The key pair file (.pem) is saved to your downloads folder.

Do not click Launch Instances without downloading the key pair; otherwise you will not be able to connect to your instance.

Congratulations! You have launched your first EC2 instance.

You will see something like i-0d9******; that is your instance ID. Click on it.
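For reference, the choices made in steps 1 to 7 roughly correspond to one aws ec2 run-instances call in the AWS CLI. The sketch below only assembles and prints that command. The key pair name and security group ID are hypothetical, and the AMI ID is the one selected in step 1, which is region-specific and may no longer be available.

```shell
# Assemble (and print, not run) a rough AWS CLI equivalent of the launch wizard.
# All values below are placeholders for illustration.
ami_id="ami-c58c1dd3"          # AMI chosen in step 1; IDs differ per region
instance_type="t2.micro"       # free tier eligible type from step 2
key_name="my-keypair"          # hypothetical key pair name
sg_id="sg-0123456789abcdef0"   # hypothetical security group ID

launch_cmd() {
  printf 'aws ec2 run-instances --image-id %s --instance-type %s --key-name %s --security-group-ids %s --count 1\n' \
    "$ami_id" "$instance_type" "$key_name" "$sg_id"
}

launch_cmd
```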

CONNECTING TO YOUR LINUX INSTANCE FROM WINDOWS

USING PUTTY

Before you connect to your Linux instance using PuTTY, complete the following prerequisites:

• Install PuTTY

Download and install PuTTY. If you already have an older version of PuTTY installed, we recommend that you download the latest version. Be sure to install the entire suite.

• Get the ID of the instance

You can get the ID of your instance using the Amazon EC2 console (from the Instance ID column).

• Get the public DNS name of the instance

You can get the public DNS for your instance using the Amazon EC2 console (check the Public DNS (IPv4) column; if this column is hidden, choose the Show/Hide icon and select Public DNS (IPv4)).

• (IPv6 only) Get the IPv6 address of the instance

If you’ve assigned an IPv6 address to your instance, you can optionally connect to the instance using its IPv6 address instead of a public IPv4 address or public IPv4 DNS hostname. Your local computer must have an IPv6 address and must be configured to use IPv6. You can get the IPv6 address of your instance using the Amazon EC2 console (check the IPv6 IPs field).

• Locate the private key

Get the fully qualified path to the location on your computer of the .pem file for the key pair that you specified when you launched the instance.

• Enable inbound SSH traffic from your IP address to your instance

Ensure that the security group associated with your instance allows incoming SSH traffic from your IP address.

CONVERTING YOUR PRIVATE KEY USING PUTTYGEN

PuTTY does not natively support the private key format (.pem) generated by Amazon EC2. PuTTY has a tool named PuTTYgen, which can convert keys to the required PuTTY format (.ppk). You must convert your private key into this format (.ppk) before attempting to connect to your instance using PuTTY.

TO CONVERT YOUR PRIVATE KEY

1. Start PuTTYgen (for example, from the Start menu, choose All Programs > PuTTY > PuTTYgen).

2. Under Type of key to generate, choose RSA.

If you’re using an older version of PuTTYgen, choose SSH-2 RSA.

3. Choose Load. By default, PuTTYgen displays only files with the extension .ppk. To locate your .pem file, select the option to display files of all types.

4. Select your .pem file for the key pair that you specified when you launch your instance, and then choose Open. Choose OK to dismiss the confirmation dialog box.

5. Choose Save private key to save the key in the format that PuTTY can use. PuTTYgen displays a warning about saving the key without a passphrase. Choose Yes.

6. Specify the same name for the key that you used for the key pair (for example, my-keypair). PuTTY automatically adds the .ppk file extension.

Your private key is now in the correct format for use with PuTTY. You can now connect to your instance using PuTTY’s SSH client.

STARTING A PUTTY SESSION

Use the following procedure to connect to your Linux instance using PuTTY. You need the .ppk file that you created for your private key.

To start a PuTTY session

1. Start PuTTY (from the Start menu, choose All Programs > PuTTY > PuTTY).

2. In the Category pane, select Session and complete the following fields:

a. In the Host Name box, enter user_name@public_dns_name. Be sure to specify the appropriate user name for your AMI. For example:

  • For an Amazon Linux AMI, the user name is ec2-user.
  • For a RHEL AMI, the user name is ec2-user or root.
  • For an Ubuntu AMI, the user name is ubuntu or root.
  • For a CentOS AMI, the user name is centos.
  • For a Fedora AMI, the user name is ec2-user.
  • For SUSE, the user name is ec2-user or root.
  • Otherwise, if ec2-user and root don’t work, check with the AMI provider.

3. (IPv6 only) To connect using your instance’s IPv6 address, enter user_name@ipv6_address. Use the same user name for your AMI as listed in step 2.

4. Under Connection type, select SSH.

5. Ensure that Port is 22.

6. In the Category pane, expand Connection, expand SSH, and then select Auth. Complete the following:

a. Choose Browse.

b. Select the .ppk file that you generated for your key pair, and then choose Open.

c. (Optional) If you plan to start this session again later, you can save the session information for future use. Select Session in the Category tree, enter a name for the session in Saved Sessions, and then choose Save.

d. Choose Open to start the PuTTY session.

7. If this is the first time you have connected to this instance, PuTTY displays a security alert dialog box that asks whether you trust the host you are connecting to.

8. (Optional) If you obtained the instance’s fingerprint beforehand, verify that it matches the fingerprint in the security alert dialog box. If these fingerprints don’t match, someone might be attempting a “man-in-the-middle” attack. If they match, continue to the next step.

9. Choose Yes. A window opens and you are connected to your instance.
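The user name list from step 2 can be written as a small lookup that builds the value for the Host Name box. This is only a sketch: the family names passed as the first argument (amazon-linux, ubuntu, and so on) are labels I chose, and the host name in the example is a placeholder.

```shell
# Map an AMI family to its usual default login name (values from the list in step 2).
default_user() {
  case "$1" in
    amazon-linux|rhel|fedora|suse) echo "ec2-user" ;;
    ubuntu) echo "ubuntu" ;;
    centos) echo "centos" ;;
    *) echo "ec2-user" ;;   # first guess only; check with the AMI provider if it fails
  esac
}

# Build the user_name@public_dns_name string for PuTTY's Host Name box.
putty_host() {
  printf '%s@%s\n' "$(default_user "$1")" "$2"
}

putty_host ubuntu ec2-198-51-100-1.compute-1.amazonaws.com
```

The last line prints ubuntu@ec2-198-51-100-1.compute-1.amazonaws.com, which is the form PuTTY expects.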

AWS COMMAND LINE INTERFACE

APPLY ALL UPDATES

$ sudo yum update

This applies all available package updates.

CONNECT WITH PYTHON:

$ python

This opens the Python interactive shell.

AWS CONFIGURE

Configure AWS CLI options. If this command is run with no arguments, you will be prompted for configuration values such as your AWS Access Key ID and your AWS Secret Access Key. You can configure a named profile using the --profile argument. If your config file does not exist (the default location is ~/.aws/config), the AWS CLI will create it for you. To keep an existing value, press Enter when prompted for the value. When you are prompted for information, the current value is displayed in [brackets]. If the config item has no value, it is displayed as [None]. Note that the configure command only works with values from the config file. It does not use any configuration values from environment variables or the IAM role.

Note: the values you provide for the AWS Access Key ID and the AWS Secret Access Key will be written to the shared credentials file (~/.aws/credentials).

aws configure [--profile profile-name]

To create a new configuration:

$ aws configure --profile user1

  • AWS Access Key ID [None]: accesskey
  • AWS Secret Access Key [None]: secretkey
  • Default region name [None]: us-west-2
  • Default output format [None]:
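To make the result concrete, here is roughly what the two files contain after answering the prompts above for the profile user1. The sketch writes them into a scratch directory (created with mktemp) rather than ~/.aws, so it does not touch a real configuration. The key values are the dummy ones from the prompts.

```shell
# Write sample files into a scratch directory so nothing in ~/.aws is touched.
demo_dir=$(mktemp -d)

# Shared credentials file: the section header is the bare profile name.
cat > "$demo_dir/credentials" <<'EOF'
[user1]
aws_access_key_id = accesskey
aws_secret_access_key = secretkey
EOF

# Config file: named profiles use a "profile " prefix in the section header.
cat > "$demo_dir/config" <<'EOF'
[profile user1]
region = us-west-2
EOF

cat "$demo_dir/credentials" "$demo_dir/config"
```

Note the different section headers: [user1] in the credentials file but [profile user1] in the config file.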

S3 BUCKET

LIST FILES IN S3 BUCKET

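This lists the objects and folders at the top level of the bucket. To list a subfolder, append its path to the bucket URL (for example, s3://mybucket/subfolder/):

$ aws s3 ls s3://mybucket/ --profile profilename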

COPYING FROM S3 TO HDFS

We will copy the scene_list.gz file from a public S3 bucket called landsat-pds to HDFS:

1. First, let’s check if the scene_list.gz file that we are trying to copy exists in the S3 bucket:

hadoop fs -ls s3a://landsat-pds/scene_list.gz

2. You should see something similar to:

-rw-rw-rw- 1 cloudbreak 33410181 2016-11-18 17:16 s3a://landsat-pds/scene_list.gz

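3. Now let’s copy scene_list.gz to HDFS using the following command. The target . is resolved against your HDFS home directory (/user/cloudbreak in this example), not the local file system: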

hadoop distcp s3a://landsat-pds/scene_list.gz .

4. You should see something similar to:

________________________________________________________

[cloudbreak@ip-10-0-1-208 ~]$ hadoop distcp s3a://landsat-pds/scene_list.gz .

16/11/18 22:00:50 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, overwrite=false, skipCRC=false, blocking=true, numListstatusThreads=0, maxMaps=20, mapBandwidth=100, sslConfigurationFile='null', copyStrategy='uniformsize', preserveStatus=[], preserveRawXattrs=false, atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[s3a://landsat-pds/scene_list.gz], targetPath=, targetPathExists=true, filtersFile='null'}…

.

.

5. Now let’s check if the file that we copied is present in the cloudbreak directory:

hadoop fs -ls

6. You should see something similar to:

-rw-r--r-- 3 cloudbreak hdfs 33410181 2016-11-18 21:30 scene_list.gz

Congratulations! You’ve successfully copied the file from an S3 bucket to HDFS!

CREATING AN S3 BUCKET

In this step, we will copy the scene_list.gz file from the cloudbreak directory to an S3 bucket. But before that, we need to create a new S3 bucket.

1. In your browser, navigate to the S3 Dashboard https://console.aws.amazon.com/s3/home.

2. Click on Create Bucket and create a bucket:

For example, here I am creating a bucket called “domitest”. Since my cluster and source data are in the Oregon region, I am creating this bucket in that region.

3. Next, navigate to the bucket, and create a folder:

For example, here I am creating a folder called “demo”.

4. Now, from our cluster node, let’s check if the bucket and folder that we just created exist:

hadoop fs -ls s3a://domitest/

5. You should see something similar to:

Found 1 items

drwxrwxrwx - cloudbreak 0 2016-11-18 22:17 s3a://domitest/demo

Congratulations! You’ve successfully created an Amazon S3 bucket.

COPYING FROM HDFS TO S3

1. Now let’s copy the scene_list.gz file from HDFS to this newly created bucket:

hadoop distcp /user/cloudbreak/scene_list.gz s3a://domitest/demo

2. You should see something similar to:

______________________

[cloudbreak@ip-10-0-1-208 ~]$ hadoop distcp /user/cloudbreak/scene_list.gz s3a://domitest/demo

16/11/18 22:20:32 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, overwrite=false, skipCRC=false, blocking=true, numListstatusThreads=0…….

.

.

3. Next, let’s check if the file that we copied is present in the demo folder of the bucket:

hadoop fs -ls s3a://domitest/demo

4. You should see something similar to:

Found 1 items

-rw-rw-rw- 1 cloudbreak 33410181 2016-11-18 22:20 s3a://domitest/demo/scene_list.gz

5. You will also see the file on the S3 Dashboard:

Congratulations! You’ve successfully copied the file from HDFS to the S3 bucket!

NEXT STEPS

1. Try creating another bucket. Using similar syntax, you can try copying files between two S3 buckets that you created.

2. If you want to copy more files, try adding -D fs.s3a.fast.upload=true and see how this accelerates the transfer.

3. Try running more hadoop fs commands.
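As a starting point for the first two suggestions, here is a small sketch that prints the corresponding distcp commands without running them. The second bucket name, domitest2, is hypothetical; the other paths are the ones used earlier in this walkthrough.

```shell
# Print a distcp command; pass "fast" as the third argument to add the
# fs.s3a.fast.upload option mentioned in step 2.
distcp_cmd() {
  src=$1; dst=$2; mode=${3:-}
  if [ "$mode" = "fast" ]; then
    printf 'hadoop distcp -D fs.s3a.fast.upload=true %s %s\n' "$src" "$dst"
  else
    printf 'hadoop distcp %s %s\n' "$src" "$dst"
  fi
}

# Bucket-to-bucket copy; "domitest2" is a hypothetical second bucket.
distcp_cmd s3a://domitest/demo/scene_list.gz s3a://domitest2/demo
# HDFS-to-S3 copy with fast upload enabled.
distcp_cmd /user/cloudbreak/scene_list.gz s3a://domitest/demo fast
```

Copy a printed line into the cluster node’s shell to actually run it.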

CLEANING UP

Any files stored on S3 or in HDFS add to your charges, so it’s good to get into the habit of getting rid of the files.

1. To delete the scene_list.gz file from HDFS, run:

hadoop fs -rm -skipTrash /user/cloudbreak/scene_list.gz

2. To delete the scene_list.gz file from the S3 bucket, run:

hadoop fs -rm -skipTrash s3a://domitest/demo/scene_list.gz
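To confirm that both deletes worked, you can check whether the paths still exist. This is a minimal sketch built on hadoop fs -test -e, which exits with status 0 when the path exists; the helper name check_removed is my own.

```shell
# Report whether a path still exists, based on the exit status of
# "hadoop fs -test -e" (0 when the path exists, non-zero otherwise).
check_removed() {
  if hadoop fs -test -e "$1"; then
    echo "still present: $1"
  else
    echo "removed: $1"
  fi
}

# Example calls (run on the cluster node after the two delete commands):
#   check_removed /user/cloudbreak/scene_list.gz
#   check_removed s3a://domitest/demo/scene_list.gz
```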
