The unofficial guide to upgrading Sentry on-premise 9 to Sentry 10

Thomas Wunderlich
Dec 21, 2020

This is the guide to upgrading Sentry On-Premise from Sentry 9 to Sentry 10 that I wish I’d had. While Sentry has made it possible to run their software yourself (major major props to the Sentry team for this), running Sentry on-premise isn’t officially supported, so the upgrade is underdocumented, especially considering the major increase in complexity. The good news is that the upgrade unlocks major new functionality, including APM. The process is complex enough that, if it is possible, I would recommend migrating to sentry.io, their hosted version.

Overview of Sentry 10 architecture

While Sentry 9 was a standard Django app in terms of architecture and easy to support, Sentry 10 is a microservice architecture and an order of magnitude more complex. The best resource is Sentry’s architecture page, which has their High Level Overview and the Event ingestion pipeline. The biggest change is the introduction of a preprocessing proxy service called Relay. This service is responsible for rate limiting, dropping traffic based on filters (either pre-built rules, such as dropping traffic from known crawlers, or custom rules you’ve created, such as dropping errors stemming from Chrome extensions), and PII scrubbing, all while responding faster to the client SDKs than Sentry could. This service is backed by Kafka and Zookeeper, both enterprisey tools. The other big change is the addition of Snuba, Sentry’s new storage and query service for event data, backed by ClickHouse, an open-source columnar database. This service is the main backend for the new APM/Sentry Performance functionality.

For anyone wondering “How do I run all these new services and databases?”, the Sentry team has been kind enough to open source a way to run everything on one server using docker-compose, including a script to install and update the containers to the latest release. This has the benefit of being simple to run, without requiring expertise in container orchestration (no need to deal with various versions of Kubernetes). The tradeoff is that your self-hosted Sentry is NOT highly available.

Server Setup

To run Sentry 10, you are going to need a larger, more powerful server with plenty of disk space, especially if you’re planning to enable performance data (remember, the server is now responsible for running 3x the services that Sentry 9 required). While I ran Sentry 8 and 9 on a t2.small for 3 years with 50GB of disk space plus a database with 50GB of disk space, I had to upgrade the server to at least a large instance with 500GB of disk space once we enabled performance data.

OS Setup

Install docker and docker-compose on the server. Preferably install git, and then clone the sentry onpremise repo. This repo should be owned by a user with docker group permissions since docker will be running the containers. That’s all that’s required.

Config Files

For Sentry on-prem, there are several config files that you’ll want to update. The best way to handle this is to take the example config files from the repo and then update them with the changes you’re making. The first set of config files is for the Sentry application and is located in the onpremise/sentry directory. The best practice here is to templatize the example config files using your configuration management system, and then write them to the host without "example" in the name (i.e. sentry.conf.example.py should be renamed sentry.conf.py). You most likely already have some custom changes made to these for your existing application, so you’ll want to port those over.

Sentry App config files

sentry.conf.py

This file contains Python config code. You’ll mostly want to just grab the example file and then update the connection info for any database that you’re not planning on running inside a container (i.e. if you’re using RDS, you’ll need to update the host, password and user for Postgres). Next, uncomment the SSL/TLS options if you’re using a reverse proxy to enable HTTPS. Some integrations, such as GSuite SSO, have information that needs to be added to this file.

##########################
# GSuite SSO Integration #
##########################

GOOGLE_CLIENT_ID = "{{sentry_gsuite_client_id}}"
GOOGLE_CLIENT_SECRET = "{{sentry_gsuite_client_secret}}"
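
For the database and SSL/TLS changes described above, here is a minimal sketch of what the relevant part of sentry.conf.py might look like when Postgres runs outside the containers. The environment variable names (SENTRY_DB_HOST and so on) are hypothetical placeholders, not something Sentry defines, and the setting names should be checked against the example file in your checkout before relying on them.

```python
import os

# Hypothetical environment variable names; substitute whatever your
# configuration management system actually provides.
DATABASES = {
    "default": {
        "ENGINE": "sentry.db.postgres",
        "NAME": os.environ.get("SENTRY_DB_NAME", "postgres"),
        "USER": os.environ.get("SENTRY_DB_USER", "postgres"),
        "PASSWORD": os.environ.get("SENTRY_DB_PASSWORD", ""),
        # Point HOST at the external server (e.g. an RDS endpoint)
        # instead of the dockerized postgres service.
        "HOST": os.environ.get("SENTRY_DB_HOST", "postgres"),
        "PORT": os.environ.get("SENTRY_DB_PORT", "5432"),
    }
}

# Options to enable when a reverse proxy terminates HTTPS in front of Sentry.
SECURE_PROXY_SSL_HEADER = ("HTTP_X_FORWARDED_PROTO", "https")
SESSION_COOKIE_SECURE = True
CSRF_COOKIE_SECURE = True
```

If you template the file with a configuration management system instead, the os.environ lookups would simply be replaced by your template variables.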

Finally, if you had previously updated ALLOWED_HOSTS in your config file, you’ll need to either change it to a wildcard or do some testing.

############
# Security #
############
# Sentry assumes that ALLOWED_HOSTS is a wildcard, even though that's against security best practice. If you want to create a specific allowlist, be sure to allow localhost, the host for each dockerized service run by docker-compose (kafka, redis, memcache, zookeeper, relay, etc.) as well as the hostname that your Sentry instance is registered to.
ALLOWED_HOSTS = ["*"]
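
If you do want an explicit allowlist rather than the wildcard, one way to build it is sketched below. The service names are copied from the comment above and are assumptions; the real names are whatever your docker-compose.yml declares, and sentry.example.com is a placeholder for your own hostname. Expect to test this, since a missing host will cause requests to be rejected.

```python
# Service host names as listed in the comment above; these are assumptions
# and must be checked against the docker-compose.yml in your checkout.
COMPOSE_SERVICE_HOSTS = ["kafka", "redis", "memcache", "zookeeper", "relay"]


def build_allowed_hosts(public_hostname, service_hosts=COMPOSE_SERVICE_HOSTS):
    """Return an ordered, de-duplicated allowlist for ALLOWED_HOSTS."""
    hosts = ["localhost", "127.0.0.1", *service_hosts, public_hostname]
    # dict.fromkeys keeps the first occurrence of each host and preserves order.
    return list(dict.fromkeys(hosts))


# "sentry.example.com" is a placeholder for the hostname your instance uses.
ALLOWED_HOSTS = build_allowed_hosts("sentry.example.com")
```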

config.yml

This YAML file contains some database connection information again, and lets you set configs for email as well as some integrations. The current example file is up to date; you’ll typically want to update the Slack config, set up the mail config and maybe set up the GitHub integration.

Relay config files

Next up, you’ll want to update the config files for Relay, especially if you’re planning to turn on Sentry APM functionality. The config.yml file lives in onpremise/relay. The example file needs some TLC, since it’s missing some recommended options. There are two sets of documentation that are useful to read: the first is the operating guidelines, https://docs.sentry.io/product/relay/operating-guidelines/, and the second is the full list of config options, https://docs.sentry.io/product/relay/options/. These pages are so useful that I normally add the following comment to the top of the config file:

# Please see the relevant documentation:
# Performance Tuning: https://docs.sentry.io/product/relay/operating-guidelines/
# All config options: https://docs.sentry.io/product/relay/options/

The first thing to do is to update the troubleshooting configuration. Step one, let Relay submit errors to Sentry (I can troubleshoot Sentry internal errors in Sentry? Yes Please!).

sentry:
  enabled: true
  # This configures relay to send errors to the Sentry internal project
  dsn: '{{ sentry_internal_project_dsn }}'

Next up, update the logging level and format for Relay. Note that if JSON logs don’t add value (i.e. you’re not shipping and parsing logs via a centralized system like CloudWatch or Elasticsearch), leave the format off to get human-readable logs.

logging:
  level: info
  format: json

Next, you can enable statsd metric monitoring if you already have a statsd collector (many metrics solutions, including Datadog, have statsd-compatible interfaces). There are several more config options available per the documentation.

metrics:
  statsd: {{ statsd_hostname }}:{{ statsd_port }}
  prefix: relay
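
Before pointing Relay at a statsd endpoint, it can help to confirm that the listener is reachable from the Sentry host. The sketch below sends a single counter in the plain statsd line format over UDP. The metric name smoke_test is made up for illustration, and since UDP is fire-and-forget, a successful send only proves the packet left the host; you still need to look for the metric on the collector side.

```python
import socket


def send_statsd_counter(host, port, name, value=1, prefix="relay"):
    """Send one statsd counter over UDP and return the bytes that were sent."""
    # Plain statsd counter format: <metric name>:<value>|c
    payload = f"{prefix}.{name}:{value}|c".encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload
```

For example, send_statsd_counter("127.0.0.1", 8125, "smoke_test") would emit relay.smoke_test:1|c to a collector listening locally on port 8125 (the host and port here are placeholders for your own).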

Finally, you want to update the performance tuning configs for Relay. When updating max_concurrent_requests, ensure that you’ve also raised your Linux server’s maximum number of file descriptors and connections (I recommend reading through the answers to https://stackoverflow.com/questions/410616/increasing-the-maximum-number-of-tcp-ip-connections-in-linux for advice here).

cache:
  event_buffer_size: 20000 # This is the length of the queue. Default is 1000
  project_grace_period: 30 # default project_grace_period is 0
limits:
  max_concurrent_requests: 10000 # This is queue throughput max. Default is 100.
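
As a rough sanity check on the file descriptor side of this, the sketch below compares the soft open-file limit of the current process with the max_concurrent_requests value you plan to set. The headroom of 1024 is an arbitrary assumption, and the script only reports the limit of the environment it runs in: a container can have different ulimits than the host shell, so run it wherever Relay actually runs.

```python
import resource


def fd_limit_ok(max_concurrent_requests, headroom=1024):
    """Return True if the soft open-file limit covers the request limit plus headroom."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return True
    return soft >= max_concurrent_requests + headroom


def describe_fd_limits():
    """Return a one-line summary of the soft and hard open-file limits."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return f"open files: soft={soft} hard={hard}"
```

Calling fd_limit_ok(10000) returning False would suggest raising the limit before using the value shown in the config above.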

Testing and running the Sentry upgrade

You will want to do a dry-run of the upgrade, especially if you’ve only recently upgraded to Sentry 9.1.2. The Sentry team have fixed most of the known bugs, but they do NOT guarantee backwards compatibility with 9.1.2 or earlier versions and they do not have enough sample data to be able to test the upgrade path easily. As such, there’s a decent chance that you may run into errors while running the database migrations. Once you’re ready to do the full upgrade, you will need to schedule maintenance, as the upgrade requires that the old Sentry app be disabled.

  1. Take a database backup of your current Sentry 9.1.2.
  2. Restore the database to a new database server and update the sentry config files to point to this database.
  3. If you are running this as the live upgrade, turn off the old Sentry app.
  4. ssh to the new server that you’re running sentry on
  5. `cd path/to/onpremise`
  6. `time bash install.sh`. This will likely take 1–2 hours. Keep an eye on the logs for errors and warnings.
  7. You’ll be prompted to run `docker-compose up -d` once it finishes.
  8. Check that all the containers come up correctly after 2–3 minutes by running `docker-compose ps`. All containers should show as having been up for a minute or so. If any containers are restarting, that indicates that there is a problem. Next, check the logs by running `docker-compose logs` and look for any warnings or errors.
  9. Next check that the health checks all pass. If you haven’t set up automated monitoring you can do this manually by running `wget 127.0.0.1/_health` and checking the results. It should return a 200.
  10. Check that the Sentry webapp has come up. Try logging in, especially if you have SSO enabled to ensure that the SSO integration works.
  11. If you have a test or staging environment using the Sentry SDK, update the host to point to the new Sentry.
  12. If everything has come up properly, update the DNS to point to the new Sentry instance.
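
The manual health check in step 9 can also be scripted so that it keeps retrying while the containers are still starting. This is a minimal sketch using only the Python standard library; the attempt count and delay are arbitrary assumptions, and the URL you pass in should be whatever address your own instance answers on.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(url, attempts=30, delay=10.0, timeout=5.0):
    """Poll url until it returns HTTP 200. Return True on success, False otherwise."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Not up yet: connection refused, timeout, or an HTTP error status.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, wait_for_health("http://127.0.0.1/_health") mirrors the wget command from step 9, and a False result is the cue to go back to `docker-compose logs`.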

Monitoring Sentry

You definitely want to add monitoring to Sentry, as there are now a lot more moving parts that could go wrong.

Health Checks

For the Sentry app itself, add an HTTP health check against {{sentry_host}}/_health. This should return 200, and will let you know if the Sentry web app is up or not. This particular health check endpoint is undocumented, but you can find the relevant code in Sentry’s middleware. If you’re using an ALB/ELB, this is what your health check should be against. Relay has two health checks, one for checking that Relay is up and running, the other to confirm that it can communicate with the upstream, both documented here: https://docs.sentry.io/product/relay/monitoring/#health-checks.
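
If you don’t have a monitoring system that can run HTTP checks, the same checks can be sketched in a few lines of Python and run from cron. The hosts and ports in CHECKS are placeholders, and the two Relay paths are written as they appear in the Relay health check docs linked above to the best of my knowledge, so verify them there before alerting on them.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for a GET to url, or None if the request failed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # no answer at all (DNS failure, refused, timeout)


def failing_checks(checks):
    """Given {name: url}, return {name: status} for every check that is not a 200."""
    results = {name: http_status(url) for name, url in checks.items()}
    return {name: status for name, status in results.items() if status != 200}


# Placeholder hosts and ports; replace them with addresses reachable from
# wherever this script runs.
CHECKS = {
    "sentry-web": "http://sentry.example.com/_health",
    "relay-live": "http://relay.example.internal:3000/api/relay/healthcheck/live/",
    "relay-ready": "http://relay.example.internal:3000/api/relay/healthcheck/ready/",
}
```

An empty dict from failing_checks(CHECKS) means everything answered with a 200; anything else tells you which service to look at first.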

Standard infrastructure metrics

Make sure that you set up monitoring for standard infrastructure metrics, including disk space free percentage, CPU utilization percentage and memory utilization percentage. Sentry 10 uses significantly more of each, and you will likely need to spend some time monitoring and right-sizing as you turn on new features in Sentry and as your own applications continue to grow in size and usage.
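
If you have no metrics agent on the box yet, disk space is the one to cover first. The following is a small stopgap sketch using only the standard library; the 20% threshold is an arbitrary assumption, and the path should be whichever volume holds your Docker data.

```python
import shutil


def disk_free_percent(path="/"):
    """Return the percentage of free space on the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def disk_alert(path="/", threshold_percent=20.0):
    """Return True when free space on path has dropped below threshold_percent."""
    return disk_free_percent(path) < threshold_percent
```

A cron job that calls disk_alert() and notifies you on True is crude, but it is better than finding out from a full disk.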

Sentry Error Monitoring

Sentry comes with an internal project already enabled. By default, Sentry uses this to report any errors with the webapp. You can also configure Relay (and probably all the other containerized services) to report any errors to Sentry. I highly recommend that you turn on alerting for any new error messages that come in.

Integrations

Integrations with self-hosted Sentry typically don’t work out of the box, and usually require multiple additional steps and admin permissions. For example, the Jira integration requires that you switch your Jira instance into development mode 😱. One other thing to be aware of is that you’ll frequently need to whitelist access for each integration so that it’s not blocked by your firewall or security proxy.

Below are the best guides for the integrations I’ve come across:

For IP/CIDR whitelisting

Keeping Sentry on-prem updated

The Sentry team has made it easy to keep your sentry on-prem updated. To update your sentry instance manually:

  1. ssh into the server
  2. cd path/to/onpremise
  3. docker-compose down
  4. bash install.sh
  5. docker-compose up -d

I would recommend automating this to run weekly after hours from your CI platform of choice.
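
As a starting point for that automation, here is a sketch of a small wrapper that a CI job could call. It runs exactly the three commands from the list above, in order, and stops at the first failure; the dry_run flag is an addition for illustration so the job can be tested without touching the containers.

```python
import subprocess

# The update steps from the list above, in the order they must run.
UPDATE_COMMANDS = [
    ["docker-compose", "down"],
    ["bash", "install.sh"],
    ["docker-compose", "up", "-d"],
]


def run_update(repo_dir, dry_run=False):
    """Run the update steps inside repo_dir and return the commands processed.

    With dry_run=True nothing is executed. Otherwise a failing command raises
    subprocess.CalledProcessError, so later steps never run after a failure.
    """
    processed = []
    for command in UPDATE_COMMANDS:
        if not dry_run:
            subprocess.run(command, cwd=repo_dir, check=True)
        processed.append(command)
    return processed
```

Calling run_update("path/to/onpremise") from the CI job, followed by a health check, covers the manual steps; whether install.sh runs unattended in your setup is worth confirming with a manual run first.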
