Part two: Continuously collect LIVE data from a REST API using AWS Cloud9 and Lambda

Taylor Bickell
Published in Analytics Vidhya
6 min read · Jan 30, 2020

Reminder: This is part of a two-part series. Should you need to reference it, part one can be found here.

Serverless is an appealing option and the best direction for some projects. Seamless scalability, less management, and a pay-for-what-you-consume pricing model are a few reasons for its appeal. Amazon Web Services (AWS) has developed an attractive set of products for serverless computing, with offerings like Cloud9 and Lambda.

When it comes to collecting data continuously and automating that collection, AWS RDS, Cloud9, and Lambda prove to be an effective combination.

AWS Cloud9

Cloud9 is a cloud-based IDE that fits easily into serverless development. Prepackaged with essential tools for popular programming languages like Python, it’s an obvious place to start building projects.

How do you create a Cloud9 environment?

Step 1- If you’re not already in the AWS console, log in here. Find and select the Cloud9 service. You should end up on a page that looks like this.

Step 2- To get started creating an environment, click “Create environment.” Then provide a name for the environment. You can also add a description, though it’s optional. When you’re done, select “Next step.”

Step 3- Make sure you’ve configured the settings of the environment to match what is shown below. Click “Next step” once you’ve verified them.

Step 4- Make a final review of the environment settings, and select “Create environment” once you’ve confirmed.

Step 5- It may take a few minutes to create a Cloud9 environment. Once the environment has been successfully created, however, you should be guided to a page similar to what’s shown below. Welcome to your Cloud9 environment. 😁

AWS Cloud9 for AWS Lambda

Step 1- It’s now time to create a Lambda Function within Cloud9. You should see “Create Lambda Function…” located near the left center of your screen. It may be helpful to reference the image from the last step directly above. By clicking on “Create Lambda Function…”, you’ll be led to a screen that prompts you for a name for the Lambda Function. Click “Next” after entering one.

Step 2- The blueprint I’ll select for my function is “empty-python.” This could vary based on the language and function you’re trying to create. After selecting a blueprint, click “Next.”

Step 3- At this time, we won’t worry about a function trigger. Select “Next” to proceed.

Step 4- The default settings for memory and role are adequate for what’s being accomplished here, so click “Next.”

Step 5- Review the function settings one last time before selecting “Finish.”

Step 6- When you select “Finish,” you should see a screen that looks like this. The Lambda Function has been successfully created.

Lambda Function Example (Collecting OHLC data from the Cryptowatch API)

For the Lambda Function I’m using to collect data from the Cryptowatch API, there are packages that need to be installed prior to deployment. To do this properly, right-click the outermost folder. My folder is entitled “insertdata.”

Select “Open Terminal Here” located toward the bottom of the menu.

Close the terminal that’s open in the lower half of the page and drag the new terminal down to where the old terminal was. When you’ve done that, type the command source venv/bin/activate into the terminal and press “Enter” on the keyboard.

There are two packages in particular that need to be installed for this function to execute: psycopg2-binary and requests. I will install them both with pip (pip install psycopg2-binary and pip install requests).

After installing the above packages, I added the code for the Lambda Function to the lambda_function.py file. Once you’ve installed the necessary packages and your code is in the .py file, the Lambda Function is ready to be deployed.

The entire code for the Lambda Function I use can be found directly below.

https://gist.github.com/tcbic/1ba9b96f8d122f1725b1ac1066d6c7b0
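In case the gist doesn’t render, here’s a rough sketch of the general shape of such a function. It is not the exact code from the gist; the endpoint, the hourly period, the column names, and the conflict handling are assumptions to adjust against the table created in part one.

# Rough sketch only. The endpoint, the hourly period, and the column names
# are assumptions; adjust them to match the table created in part one.

import psycopg2 as ps
import requests

credentials = {"POSTGRES_ADDRESS": "example.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com",
               "POSTGRES_PORT": "5432",
               "POSTGRES_USERNAME": "test123",
               "POSTGRES_PASSWORD": "xxxxxxxxx",
               "POSTGRES_DBNAME": "example",
               "API_KEY": "xxxxxxxxxxxxxxxxxxxx"}


def lambda_handler(event, context):
    # Request hourly (3600-second) candles for ETH/BTC on Coinbase Pro.
    # (If your Cryptowatch plan requires the API key, consult their docs
    # for how to attach it to the request.)
    response = requests.get("https://api.cryptowat.ch/markets/coinbase-pro/ethbtc/ohlc",
                            params={"periods": "3600"})
    candles = response.json()["result"]["3600"]

    conn = ps.connect(host=credentials["POSTGRES_ADDRESS"],
                      port=credentials["POSTGRES_PORT"],
                      user=credentials["POSTGRES_USERNAME"],
                      password=credentials["POSTGRES_PASSWORD"],
                      database=credentials["POSTGRES_DBNAME"])
    cur = conn.cursor()

    # Each candle is [close_time, open, high, low, close, volume, quote_volume].
    # ON CONFLICT assumes a unique constraint on closing_time so repeated runs
    # don't insert duplicate rows.
    insert_query = """INSERT INTO example1.coinbase_pro_eth_btc
                      (closing_time, open, high, low, close, volume, quote_volume)
                      VALUES (%s, %s, %s, %s, %s, %s, %s)
                      ON CONFLICT (closing_time) DO NOTHING;"""
    for candle in candles:
        cur.execute(insert_query, candle)

    conn.commit()
    conn.close()
    return "Inserted {} candles.".format(len(candles))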

For additional detailed information on setting up a Cloud9 environment and deploying a Lambda Function, I suggest taking a look at this post here. It also covers configuring an event trigger, an important piece of automating data collection.
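That post covers the console workflow for the trigger. For a sense of what it amounts to, a scheduled rule can also be wired up with boto3 along the lines of the sketch below. The rule name, schedule, and function name are placeholders rather than values from this project, and the role running this code needs the relevant CloudWatch Events and Lambda permissions.

import boto3

events = boto3.client("events")
lambda_client = boto3.client("lambda")

# Placeholder names; substitute your own function and preferred schedule.
function_name = "insertdata"
rule_name = "collect-ohlc-hourly"

# Create (or update) a rule that fires once an hour.
rule_arn = events.put_rule(Name=rule_name,
                           ScheduleExpression="rate(1 hour)",
                           State="ENABLED")["RuleArn"]

# Allow CloudWatch Events to invoke the function.
function_arn = lambda_client.get_function(FunctionName=function_name)["Configuration"]["FunctionArn"]
lambda_client.add_permission(FunctionName=function_name,
                             StatementId="allow-cloudwatch-events",
                             Action="lambda:InvokeFunction",
                             Principal="events.amazonaws.com",
                             SourceArn=rule_arn)

# Point the rule at the function.
events.put_targets(Rule=rule_name, Targets=[{"Id": "1", "Arn": function_arn}])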

Verify data is being inserted into the PostgreSQL database

You can verify data is being inserted into the database directly within pgAdmin. However, below is an example query, using Python and SQL, that can be run in a Jupyter or Google Colab notebook to check and confirm the most recent database entries.

# Import statements
import psycopg2 as ps

# Define credentials.
credentials = {"POSTGRES_ADDRESS": "example.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com",
               "POSTGRES_PORT": "5432",
               "POSTGRES_USERNAME": "test123",
               "POSTGRES_PASSWORD": "xxxxxxxxx",
               "POSTGRES_DBNAME": "example",
               "API_KEY": "xxxxxxxxxxxxxxxxxxxx"}

# Establish a connection to the database.
conn = ps.connect(host=credentials['POSTGRES_ADDRESS'],
                  port=credentials['POSTGRES_PORT'],
                  user=credentials['POSTGRES_USERNAME'],
                  password=credentials['POSTGRES_PASSWORD'],
                  database=credentials['POSTGRES_DBNAME'])

# Create a cursor.
cur = conn.cursor()

# Create and execute a query to get the most recent 50 rows.
query = '''SELECT * FROM example1.coinbase_pro_eth_btc ORDER BY closing_time DESC LIMIT 50;'''

cur.execute(query)

recent_50_rows = cur.fetchall()

conn.commit()

for row in recent_50_rows:
    print(row)

conn.close()

Running the above code will print the most recent 50 candlesticks in the PostgreSQL database for the coinbase_pro_eth_btc table within the example1 schema. You’ll want to make sure the schema and table name match what you have entered within your database.

Note: Closing time, in this case, is in epoch time. To convert it into a more human-readable format, I recommend this website.
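A quick conversion can also be done right in the notebook with Python’s standard library; the value below is just an example epoch timestamp standing in for a closing_time entry.

from datetime import datetime, timezone

# Example epoch value standing in for a closing_time entry.
closing_time = 1580428800

# Convert to a human-readable UTC timestamp.
print(datetime.fromtimestamp(closing_time, tz=timezone.utc))
# 2020-01-31 00:00:00+00:00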

Primary Accomplishments 👏

· Create an AWS Cloud9 environment.

· Create a Lambda Function within Cloud9.

· Verify data is being inserted into the PostgreSQL database using Python and SQL code.

I’d love to connect! The best place to find me is on LinkedIn. :)
