Troubleshooting in AWS: Elastic Beanstalk and RDS

Aaron Watkins Jr · Published in The Startup · 5 min read · Oct 23, 2020


Potential steps to resolve the dreaded “Degraded” and “Severe” statuses

AWS offers some useful tools for API deployment, but they certainly don’t make a project immune to API-crashing bugs. This post walks through some issues you could face when managing APIs and databases through Elastic Beanstalk (EB) and RDS, and hopefully provides solutions for when your EB Health looks scary!

AWS Health status will display red for “Degraded” or “Severe” health

When your EB environment is running well and your front-end developers have no issues using your API, seeing the environment health change to one of these statuses can be panic-inducing. On inspection, though, it offers an opportunity to identify a problem with your app or deployment.

Assuming the steps to deploy the environment have already been satisfied, monitoring the environment (and, similarly, a database instance hosted on AWS RDS) through the AWS GUI dashboards provides insight into memory/CPU utilization, database connections, and more. The AWS docs suggest checking the EB console > Environments > Env-Name > Health page to learn more about the problem. That said, finding something like the image below on that page might not be very insightful.

However, what we can learn from this is that a memory issue exists in our API. Depending on the functions that exist for each endpoint, the problem could have something to do with:

  • Files in the codebase that are occupying too much memory
  • Predictive models that are occupying too much memory
  • If the API uses a database: the connection limit has been reached

If there have been any recent code pushes to the root repository of the project, it could be useful to inspect the additions. In my case, our team’s repo included three other subdirectories that contained `.ipynb`, `.csv`, and other files that consumed a lot of memory. The solution, in this case, was to add those subdirectories to the `.ebignore` file. If none of the files in a subdirectory are used by the app itself, then they can safely be added like so:
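
A minimal sketch of the file (the directory and file patterns here are hypothetical stand-ins for whatever your repo contains):

```
# .ebignore -- exclude anything the deployed app doesn't actually use
notebooks/
data/
exploration/
*.ipynb
*.csv
```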

Files can be added to .ebignore in a similar fashion as .gitignore

If using predictive models in the API, the ideal solution to a memory problem would be to pickle the model(s), so the trained artifact is loaded from disk rather than rebuilt in memory (a quick sketch follows). However, if this is not an option for some reason, or if you’re using dynamic modeling that involves database caching, then this might be a good time to turn attention toward the database.
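
Here is what the pickling route can look like, as a minimal sketch; scikit-learn’s `LinearRegression` is purely a hypothetical stand-in for whatever model the API serves:

```python
import pickle

from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for the model the API serves.
model = LinearRegression().fit([[0], [1], [2]], [0, 1, 2])

# Serialize once, at training time...
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# ...then load the artifact at API startup, instead of retraining
# (and holding training data) inside the running process.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)
```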

Runaway connections can cause “Severe” or “Degraded” health by making endpoints fail

The AWS RDS dashboard provides similar monitoring capabilities to the EB console and can be a useful way to identify database issues. When faced with high-connection problems, the first step should be to review all code in the app that opens a DB connection and make sure every connection is eventually closed via `.close()`. To that end, it may be useful to have a separate function handle database connections for you if you find yourself with multiple files touching the DB directly (a sketch of this follows the query below). If connections from the codebase aren’t the issue, then it may be necessary to inspect the DB directly via SQL queries. If using a PostgreSQL instance on AWS RDS, the following query will return all active connections to the DB:

```sql
SELECT *
FROM pg_stat_activity
WHERE datname = 'postgres';
```
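
As for the helper function mentioned above, a minimal sketch using psycopg2 might look like this; `DATABASE_URL` is an assumed environment variable holding the RDS connection string:

```python
import os
from contextlib import contextmanager

import psycopg2

@contextmanager
def db_connection():
    # DATABASE_URL is an assumed env var with the RDS connection string.
    conn = psycopg2.connect(os.environ["DATABASE_URL"])
    try:
        yield conn
        conn.commit()
    finally:
        conn.close()  # runs even if the caller's query raises

# Usage: every caller gets the same open/commit/close behavior.
with db_connection() as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT count(*) FROM pg_stat_activity;")
        print(cur.fetchone())
```

Routing every DB touch through one function like this makes a forgotten `.close()` impossible by construction.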

When inspecting the connections returned by that query, the idea is to find an IP address that holds several of them. Checking the “query” column can add an extra level of confidence, as deliberate, user-generated connections should have an associated query. Upon finding an IP address with several connections, none of which have associated queries, you can terminate those connections via a SQL command, or write a program to monitor them for you.

```python
# SQL query for terminating 'culprit addresses'
kill_switch = """
SELECT
  pg_terminate_backend(pid)
FROM
  pg_stat_get_activity(NULL::integer)
WHERE
  datid = (
    SELECT oid FROM pg_database WHERE datname = 'postgres'
  )
  AND client_addr = 'culprit_address';
"""
```
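
Wiring that kill switch into a small watchdog is one way to automate the cleanup. Below is a minimal sketch; `DATABASE_URL` is an assumed environment variable, `'culprit_address'` remains a placeholder for the offending IP, and the query is rewritten against the `pg_stat_activity` view for brevity:

```python
import os
import time

import psycopg2

# Same idea as kill_switch above, expressed against the
# pg_stat_activity view; 'culprit_address' is still a placeholder.
KILL_SWITCH = """
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'postgres'
  AND client_addr = 'culprit_address';
"""

def terminate_culprit():
    # DATABASE_URL is an assumed env var with the RDS connection string.
    conn = psycopg2.connect(os.environ["DATABASE_URL"])
    try:
        with conn.cursor() as cur:
            cur.execute(KILL_SWITCH)
        conn.commit()
    finally:
        conn.close()

if __name__ == "__main__":
    while True:
        terminate_culprit()
        time.sleep(300)  # re-check every five minutes
```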

In other cases, it is possible to add a dependency or file to an EB environment that causes discrepancies with previous versions. When this happens, deployment attempts may be met with a `Failed to deploy application` error.

One way to resolve this is to update the `.elasticbeanstalk` subdirectory that was added to the repo after running the `eb init` command. In particular, the `config.yml` found in this sub-dir is referenced by the environment during deployment, so resetting it can clear certain discrepancy issues preventing deployment. Deleting the `.elasticbeanstalk` sub-dir and then running `eb init` (or whichever variant you ran to initialize the app, e.g. `eb init -p docker YOUR-APP-NAME --region us-east-1` if using docker) will rebuild the `config.yml` file, and you should have something along these lines (again, if using docker):
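
A sketch of the rebuilt file, assuming the Docker platform in `us-east-1`; the application, environment, and profile names are placeholders:

```yaml
branch-defaults:
  main:
    environment: YOUR-APP-ENV
global:
  application_name: YOUR-APP-NAME
  default_platform: Docker
  default_region: us-east-1
  profile: eb-cli
  sc: git
```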

The steps above should help resolve issues with EB deployment and degraded Health status, but if for some reason you need to create an entirely new EB environment, it may be helpful to do so through the AWS GUI. Particularly if the API will be used by a front-end dev team, adding SSL configuration may become necessary for certain endpoints, and this would involve the use of a subdomain alias. The key step is specifying the Load Balancer Type as `Classic Load Balancer`, which should allow you to “add a listener” to the EB env and configure the appropriate options.
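
For reference, listener settings like the ones configured in the GUI can also be expressed in an `.ebextensions` config file. The sketch below assumes a Classic Load Balancer terminating HTTPS on port 443 and forwarding to port 80; the filename and certificate ARN are placeholders:

```yaml
# .ebextensions/https-listener.config (hypothetical filename)
option_settings:
  aws:elb:listener:443:
    ListenerProtocol: HTTPS
    InstancePort: 80
    SSLCertificateId: arn:aws:acm:us-east-1:123456789012:certificate/EXAMPLE
```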

Hopefully, the steps above can help save you time in any EB / RDS debugging, and AWS offers some useful resources as well!

Aaron Watkins Jr

I am a Data Scientist and Software Engineer, particularly interested in predictive modeling, sports and cinema.