drew
Apr 9, 2017 · 38 min read

Disclaimer: During the day, I work for Twitter on computers and infrastructure.

Today it looked like this outside, so I decided to stay in:

Sailboat.

Idea: Send me a specific person’s Tweets by Email either daily or weekly.

Purpose: Whether we like it or not, Email is here to stay. I personally like to digest specific content in this manner, on my schedule, and it is fairly flexible. In addition, I would like to give @redsoprano23 and @ftdoctor a more familiar way to engage with content from Twitter.

No patience? End product here: simplebirdmail.com.

On 2017–07–09 I decided to disable the site since it costs time/money and I don’t have the cycles to maintain it further at this time.

Below is a fairly thorough walk-through of how I went from an idea to deployed in < 24hrs. It is not every button click but it is nearly every screen view, and every piece of code and configuration. This assumes a moderate level of comfort with all of the systems involved. The finished code is published to GitHub here in addition to the inline Gists.


Name selection

Come up with a name and domain name. I don’t like to start on a project until the domain name is purchased (then there is pressure to make it real).

[X] Completed: Simple Bird Mail

[X] Domain Registered: simplebirdmail.com

Rough platform and architecture

Decide on the platform and rough architecture. I am familiar with building infrastructure on several platforms and am always looking for something new and likely challenging. For a 24hr project, working on a less familiar platform is a little risky given that you won’t know the edges, limitations, et cetera.

Google Cloud Platform’s updated Free Tier looks pretty cool, so I will give that a go. I will likely go a little bit over the Free Tier but as a start, it looks good.

[X] Platform Decided: Google Cloud Platform (primarily App Engine)

For rough architecture, I find it helpful to put together a very simple, high-level list of services and what the flow will look like.

This is very rough but this was the initial intent.

[X] Architecture Decided

Account setup

For this project, I needed to set up accounts with the following companies to start:

  • Google Cloud
  • MailChimp
  • Twitter

All are pretty easy to set up with some minor gotchas. Here are some of those details.

Google Cloud Account

Head over to https://cloud.google.com and sign in with any ol’ Google account. Once signed up, you probably want to bookmark the Cloud Console at https://console.cloud.google.com/ since cloud.google.com will just put you into their docs/pricing/information pages.

Create a new project and new billing agreement if you would like (for various API limit increases and Service usages). In my case, I had various projects and different accounts which I wanted to keep separate so I created a new billing agreement.

Under Billing > Budget and alerts, you probably want to set up an alert for a Budget amount in case your project starts to spend more money than you expect. I chose $100 as an initial Budget level.

Under App Engine > Settings, you probably want to set up a Daily spending limit for the same reason. I chose to set it to $100 as well.

Initially, your dashboard for App Engine will show that the limit is $9,223,272.03, which scared my pants off. After about an hour and some refreshing, this goes away and it is set to your Daily spending limit.

You will want to click around the Console to the Compute > App Engine tab and choose your region and set up your first project. You can also wait and do this on the first deploy via the CLI, as it will prompt you for this same information.

Note: There are two types of App Engine environments, described in detail here, called Flexible and Standard. The Flexible environment runs in Docker containers and takes minutes to start up instances, so I will choose the Standard environment for the frontend, which does not use Docker containers and takes milliseconds to start up instances. Note: I will use the Flexible environment later on; both are used in this walkthrough.

In addition, we will be using Google’s KMS service to encrypt and decrypt secrets so you will want to enable that as well: IAM & Admin > Encryption Keys.

MailChimp Account

Account setup at https://mailchimp.com was pretty easy. The only weird thing is that, due to laws and such, you have to give them your address even without a credit card. We will likely need to dig this out of templates further down the line (or futz with it a little so it looks real but is not).

To create an API key after the confirmation mail, go to: Extras > API keys > Create A Key.

Make sure to give the key a Label to make it easier to identify later.

It should look roughly like this once you are all set (copy the API key for securely storing later).

MailChimp > Extras > API keys > Create A Key

Twitter

Account setup at https://twitter.com was pretty easy. After creating a user under a new Email address (and confirming), I headed over to https://apps.twitter.com to create an API key. Registering a new application required a name, description, and URL. On the Keys and Access Tokens tab, scroll down to the bottom and Create my access token.

While you are on this page, you may as well change the access to Read only since we won’t be writing anything. Click the modify app permissions link next to Access Level, change to Read only, and Update Settings at the bottom. Note: You may get an error message on initial creation but a page refresh clears it; unsure why.

It should look roughly like this once you are all set (copy the Consumer key + secret, and Access token + secret down for securely storing later).

Twitter App Registration

Configuring development environment

I am going to choose Python, which I am fairly familiar with, to keep the number of additional complexities to a minimum since I already chose to use an unfamiliar platform.

Follow the guides from Google Cloud here and here to get started.

The first gotcha: there are *two* SDKs involved here, which initially confused me. You need both the Google Cloud SDK with Python components and the App Engine SDK for Python.

On OS X I chose to install the Cloud SDK in /usr/local/bin with a couple of symlinks. The App Engine SDK you need to install and launch, choosing to create symlinks.

PyDev (webapp framework) w/Eclipse

If you choose the Eclipse route, it will not work out of the box with the initial project templates because of this little note in the documentation:

Note: The PyDev starter projects are out of date and use the Python 2.5 webapp module.

To fix this you need to change the runtime in your app.yaml to python27 from python. While you are at it, you should also add threadsafe: true which is required for python 2.7 projects.

In addition, the webapp example will work but the webapp2 framework has been out for a bit so you probably want to switch to it instead (or go the Flask route below, which we will eventually do anyway). If you choose the webapp2 route, you will want an app.yaml and main.py that looks something like this to get started:
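A minimal starting point might look like this (handler and script names here are placeholders):

```yaml
# app.yaml — Standard environment, Python 2.7, threadsafe
runtime: python27
api_version: 1
threadsafe: true

handlers:
- url: /.*
  script: main.app
```

```python
# main.py — a minimal webapp2 handler to match
import webapp2


class MainPage(webapp2.RequestHandler):
    def get(self):
        self.response.headers['Content-Type'] = 'text/plain'
        self.response.write('Hello, World!')


app = webapp2.WSGIApplication([('/', MainPage)], debug=True)
```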

You can test this locally by configuring a local run configuration in Eclipse.

Right-click your project > Run As > Run Configurations…

Right-click PyDev > New

Name it Run Local (or whatever), set the Main Module to wherever dev_appserver.py is installed (in my case: /usr/local/google_appengine/dev_appserver.py). In the Arguments tab, set the Program arguments to “${project_loc}”.

This should look something like this:

Main tab
Arguments tab

Choose Run in the bottom right and in the Console you should see the startup messages letting you know that http://localhost:8080 is setup so check it out!

Flask w/o Eclipse

If you choose the Flask route w/o Eclipse you will want to follow this guide or you can simply clone the repo here and take a look at the appengine directory under flexible or standard. See the note above linking to the documentation on the differences.

For starting (and continuing forward in this tutorial) I will choose the appengine/standard/flask/tutorial project which is a reasonable starting point.

To start, I will create a separate private repository on Google Cloud (since it is Free in Beta at the moment). Go to Tools > Development > Get started. Choose a creative name of course.

I like to copy any tutorial into the base of my new repository to get started.

Clone the empty repo to your local source tree somewhere:
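Something like this should do it (the repo and directory names are whatever you chose above):

```sh
# Clone the empty Cloud Source Repository locally
gcloud source repos clone your-repo-name simplebirdmail
cd simplebirdmail
```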

Copy the tutorial project into this new repo:
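Assuming python-docs-samples is cloned as a sibling directory:

```sh
cp -r ../python-docs-samples/appengine/standard/flask/tutorial .
```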

You may want to rename the root folder from tutorial to something more descriptive; I chose frontend.

Similar to how we configured Eclipse for being able to run locally, we can setup a local run configuration.

First, install the requirements into a lib directory as mentioned in the docs:
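The standard environment vendors third-party packages into a lib directory:

```sh
cd frontend
pip install -r requirements.txt -t lib
```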

Note: As mentioned in the docs, you are going to have trouble if you use a Homebrew version of python. I would recommend switching away from that to pyenv, which works fairly well.

Start up the local appserver using dev_appserver.py, which should be in your path if you installed both SDKs and let the gcloud tools update your shell’s configuration file(s) (or do it manually if you so choose).
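For example, from the repo root:

```sh
dev_appserver.py frontend/app.yaml
```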

The app should now be available locally at http://localhost:8080/form.

This would be a good time to commit the initial tutorial code to your code repository before we make any changes, so let’s do that.
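Something like:

```sh
git add .
git commit -m "Initial Flask tutorial code"
git push origin master
```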

Let’s deploy this application to App Engine so that we get the hang of that too in addition to our local environment.

There are two commands you will need to use, as documented, gcloud app deploy and gcloud app browse. The first deploys your application and the second launches your web browser at the project URL.
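For example:

```sh
gcloud app deploy frontend/app.yaml   # deploy the application
gcloud app browse                     # open the deployed URL in a browser
```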

You should be able to browse to /form at the URL and view the same form that you saw locally.

If you have gotten here, take a break and enjoy some soylent. Come back in a couple minutes.

Yummy!

Recap

We have created accounts with Google Cloud, MailChimp, and Twitter. We have a local development environment setup and deployed a Flask tutorial to Google App Engine. We have a code repository setup and have committed the tutorial code. So what is next? We have two paths we can start on- the frontend or backend. Let’s start with the frontend.

Frontend Design Setup

We need to transform our tutorial form into an actual frontend that enables the functionality we desire for users. For what I would like to do, we need to capture a couple pieces of information: the Twitter User that the person would like to receive Emails about, and their Email address. That is it (sorta).

For all quick projects, I like to get just the basic UI elements together and work on polishing it with fancy California Style Sheets later (Jenn Schiffer gets credit for that verbiage, which is amazing).

This is truly all we need to get started with a Minimum Viable Puppy* (MVP):

Graph paper, bad drawing (sorry).

*May or may not stand for Puppy. It could be “Product” but that is less fun.

We have 90% of this already and we didn’t even have to do anything!

Modifying the form.html a little bit, we can get this pretty close to the super fancy design.

Let’s see where we are at:

form.html
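The actual file is in the repo; a rough sketch of the unstyled form at this stage (field names are placeholders):

```html
<!-- form.html — bare-bones version of the design above -->
<form method="POST">
  <label>Twitter Handle</label>
  <input type="text" name="handle" placeholder="@handle">
  <label>Frequency</label>
  <select name="frequency">
    <option value="daily">Daily</option>
    <option value="weekly">Weekly</option>
  </select>
  <label>Email Address</label>
  <input type="email" name="email">
  <button type="submit">Subscribe</button>
</form>
```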

From the perspective of, “are all the things there”, the answer is pretty close to yes.

Now, this may be contentious but I would probably go a little crazy if this didn’t look a little nicer at this point. The Bootstrap framework should be able to help out here; let’s stick with the tried-and-true v3 while v4 is still in Alpha.

Since this will be hosted on the Public Interwebs and there isn’t a great reason to bundle dependencies at this time (in my opinion) we will point to MaxCDN for the include (and other CDNs as well).

Making a couple modifications to fit the Bootstrap template and features, we are looking quite a bit cleaner:

form.html
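Roughly like this (a sketch; the ids and classes are assumptions):

```html
<!-- form.html — the same form with Bootstrap v3 classes and the MaxCDN include -->
<link rel="stylesheet"
      href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css">
<div class="container">
  <form method="POST">
    <div class="form-group">
      <label for="handle">Twitter Handle</label>
      <input type="text" class="form-control" id="handle" placeholder="@handle">
    </div>
    <div class="form-group">
      <label for="frequency">Frequency</label>
      <select class="form-control" id="frequency">
        <option value="daily">Daily</option>
        <option value="weekly">Weekly</option>
      </select>
    </div>
    <div class="form-group">
      <label for="email">Email Address</label>
      <input type="email" class="form-control" id="email">
    </div>
    <button type="submit" class="btn btn-primary">Subscribe</button>
  </form>
</div>
```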

The frontend design is livable for the moment. Now to add functionality.

Frontend Functionality Setup

You may have noticed that the form that was designed is pretty close to a MailChimp sign-up form. Although this could likely be done completely with a MailChimp form and no separate App Engine-hosted frontend, that would be nowhere near as interesting, fun, or as much of a learning experience. In addition, limits will start to be hit pretty quickly functionality-wise after the MVP ships (think: additional options, external data/user storage, typeahead, et cetera), and it is a lot of eggs in one basket if at some point we would like to move to another platform (although lock-in is entirely inevitable and it is a total illusion if you think otherwise).

Note: Since it wasn’t stated above, using a platform to handle Email is generally a good practice. It is painful and there are tons of rules and situations that you don’t want to have to spend time on.

We will create a MailChimp form to capture the majority of the data and embed it into the site, replacing most of our own form.

Start by creating a List in MailChimp.

You will want some sort-of contact@ address. Most <good> domain registrars (<cough> Gandi! </cough>) give you some number of mail forwarders for free, so you can set up a contact address.

Use @gandibar, they are fantastic.

To be able to send mail from a particular address you will need to perform the verification process.

We now need to make a form for sign-up that mirrors most of the functionality of the form we created.

We create a form with the Email Address, Frequency, and a hidden Twitter Handle field. We don’t want to expose the Handle input because we will be adding an auto-complete to the current field and injecting the value into the form instead.

Head back to the Embed Forms page and grab the HTML. We need to slice this up a bit to make the form match our styling, keeping the Twitter Handle initial field separate.

The HTML, CSS, and updated form:

form.html
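A rough sketch of the sliced-up embed (the u= and id= values come from your own embed code; field names are assumptions):

```html
<!-- form.html — MailChimp embed re-styled to match our Bootstrap form -->
<form action="https://YOURDC.list-manage.com/subscribe/post?u=XXXX&amp;id=YYYY"
      method="post" id="mc-embedded-subscribe-form" target="_blank">
  <input type="email" name="EMAIL" class="form-control" placeholder="Email Address">
  <select name="FREQUENCY" class="form-control">
    <option value="Daily">Daily</option>
    <option value="Weekly">Weekly</option>
  </select>
  <button type="submit" class="btn btn-primary">Subscribe</button>
</form>
```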

It is close enough that we can move on to implementing the typeahead using typeahead.js. Looking at how the library presents the Twitter handle typeahead, they have a proxy server set up on Heroku to make the requests to Twitter. We will need to stand this up ourselves, which extends our architecture a little bit by adding this requirement.

The quickest and lightest way to do this will be to use the twitter-proxy (it is Node, but let’s ignore that for now). It would *not* be okay to point at the one mentioned above even though it appears to be completely open with no limitations (except rate limiting upstream I have to assume/hope).

The package.json and server.js respectively:
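The actual files are in the repo; a generic sketch of what such a proxy looks like (an Express pass-through signing with OAuth 1.0a, not the module’s exact code):

```json
{
  "name": "twitter-typeahead-proxy",
  "scripts": { "start": "node server.js" },
  "dependencies": {
    "express": "^4.15.0",
    "request": "^2.81.0"
  }
}
```

```js
// server.js — forwards /1.1/* to api.twitter.com, signing each request
// with OAuth 1.0a credentials read from credentials.json (not checked in).
const express = require('express');
const request = require('request');
const creds = require('./credentials.json');

const app = express();

app.get('/1.1/*', (req, res) => {
  request({
    url: 'https://api.twitter.com' + req.url,
    oauth: {
      consumer_key: creds.consumerKey,
      consumer_secret: creds.consumerSecret,
      token: creds.accessToken,
      token_secret: creds.accessTokenSecret,
    },
  }).pipe(res); // stream Twitter's response straight back to the client
});

app.listen(process.env.PORT || 5000);
```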

To run this on Google App Engine as well, we will want to create a separate Twitter Application (and associated credentials), a separate service name to house the App Engine application, and put together an app.yaml configuration. Both steps were previously documented, so I will continue with those assumed complete.

An app.yaml and credentials.json respectively:
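Something like this (the service name is a placeholder; the credential values are your proxy application’s keys):

```yaml
# app.yaml — Flexible environment, Node.js runtime, separate service
runtime: nodejs
env: flex
service: twitter-proxy
```

```json
{
  "consumerKey": "...",
  "consumerSecret": "...",
  "accessToken": "...",
  "accessTokenSecret": "..."
}
```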

And you can probably now realize you need a .gitignore file:
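At a minimum:

```
credentials.json
node_modules/
```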

With all of these steps completed, we can now test locally and then deploy if all is successful. Run npm start locally:

Hop over to a browser window and put in: http://localhost:5000/1.1/users/search.json?q=mediocrity

This should show a 200 in your console:

This should download a JSON file that contains content for a number of users:

Now that this is working, we will upload it to App Engine with gcloud app deploy. The first deploy will take a few minutes but the same URL as above, save for the base, should work. Try it out at: https://<your_app>.appspot.com/1.1/users/search.json?q=mediocrity

A quick note about security and such... This is one of those, “Do as I say, Not as I do” sort-of moments. For a 24-hr project a lot of corners are being cut here. One of those corners is launching a Publicly Accessible proxy for another API. This isn’t cool. App Engine does *not* offer any sort of controls for restricting access to your service. Ideally, this should be launched in such a way that it could be restricted to only your frontend application. For the sake of time and simplicity, we will move on with this noted. If time allows, we will revisit. By not addressing this now, you risk rate limiting by the API, unexpected cost, et cetera.

Wiring The Typeahead

We have an API proxy setup and typeahead.js added to our form. Now we need to wire it up to the first input box on our form.

Before we do this, close the blinds because of the glare.

Sunset.

We can copy most of the code from the front page of the typeahead.js site and make our modifications as needed. Starting with the JS file (main.js), we are going to keep most of it but remove the handlebars.js include and point it to our host:
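A sketch (the proxy host is a placeholder; -dot- routes to a non-default service on App Engine):

```js
// main.js — adapted from the typeahead.js demo; prefetch and the sync
// function are removed, and the remote URL points at our own proxy.
var users = new Bloodhound({
  datumTokenizer: Bloodhound.tokenizers.obj.whitespace('screen_name'),
  queryTokenizer: Bloodhound.tokenizers.whitespace,
  remote: {
    url: 'https://twitter-proxy-dot-YOUR_PROJECT.appspot.com/1.1/users/search.json?q=%QUERY',
    wildcard: '%QUERY'
  }
});

$('#handle .typeahead').typeahead(null, {
  name: 'users',
  display: 'screen_name',
  source: users
});
```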

In addition to removing handlebars.js, I removed the prefetch and sync function because our quickly installed proxy doesn’t provide much value there. The query path changes a little bit as well.

On the HTML front, it is pretty much a copy/paste as well:

There are also several additions at the form.html update step including jQuery and swapping to use the corejs-typeahead library instead of typeahead.js (the former is maintained). The full HTML file now looks something like this:

As for the CSS front (css/main.css), there is a lot of copy/paste; just copy the entire file starting at the first .Typeahead declaration.

Back in our running server, we have an autocomplete form with Twitter search typeahead now functioning:

Let’s re-deploy to make sure everything still works when deployed…

.. dramatic pause ..

And it does, fantastic!

The last step with frontend functionality will be to send the selected (or typed) Twitter handle to MailChimp as the hidden merge field that we set up originally. To do this, we are going to attach a handler that updates the form action for MailChimp. This is described here.

Since we have jQuery installed at this point we will find and modify the element whenever the Typeahead value changes. In our main.js, we will add the following:
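A sketch (the TWITTER merge tag name is an assumption; the form’s action URL ends with &TWITTER=, shown just below):

```js
// First attempt: swap the handle into the form's action URL whenever the
// typeahead value changes.
var baseAction = $('#mc-embedded-subscribe-form').attr('action');
$('#handle .typeahead').on('typeahead:select typeahead:change', function () {
  $('#mc-embedded-subscribe-form').attr(
      'action', baseAction + encodeURIComponent($(this).typeahead('val')));
});
```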

And we need to add the field to the action attribute:
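Along these lines:

```html
<form action="https://YOURDC.list-manage.com/subscribe/post?u=XXXX&amp;id=YYYY&amp;TWITTER="
      method="post" id="mc-embedded-subscribe-form" target="_blank">
```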

Now we will capture the input handle, which is a required field, on form submission to MailChimp.

Testing

Let’s redeploy and start to test how this works before we build the backend…

.. dramatic pause ..

It works!

But you immediately notice the user experience is pretty bad on form submit: there is some green text, the Handle isn’t cleared, the confirmation of success isn’t super clear, et cetera.

To fix this, we will add a click handler for the form submission:
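A sketch (element ids match the rows below):

```js
// Hide the inputs and show the success message on submit.
$('#mc-embedded-subscribe-form').on('submit', function () {
  $('#presubmit').hide();
  $('#success').show();
});
```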

A success row in our HTML:
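For example:

```html
<div class="row" id="success" style="display: none;">
  <div class="col-md-12">
    <h3>Thanks! Check your inbox to confirm your subscription.</h3>
  </div>
</div>
```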

And wrap the previous rows in a div with id presubmit:
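Roughly:

```html
<div id="presubmit">
  <!-- ... existing form rows ... -->
</div>
```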

Now it looks fancy, check it out!

Video

Let’s check the data on MailChimp. Browse to the List > Subscribers (you need to accept at least one invite). Huh, the Twitter Handle is missing. It looks like the good ol’ replace-the-variable-in-the-Form-Action approach isn’t quite working.

Let’s add a hidden input field and update the change handler:
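A sketch (the TWITTER merge tag name is an assumption):

```html
<!-- hidden merge field that carries the handle -->
<input type="hidden" name="TWITTER" id="mce-TWITTER" value="">
```

```js
// Updated handler: write the handle into the hidden input instead of
// rewriting the form action.
$('#handle .typeahead').on('typeahead:select typeahead:change', function () {
  $('#mce-TWITTER').val($(this).typeahead('val'));
});
```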

And now re-deploy…

And re-signup…

.. dramatic pause ..

And, it works!

I am not quite sure why the Form Action swap doesn’t work but looking at the Parameters passed in the Dev Console, it was just a blank value.


Custom domains

Since everything on the frontend is working, we should start the process of getting our domain setup to point to the App Engine site. To do this, you have to add the site under App Engine > Settings > Custom domains.

Depending on where your domain is hosted, you will likely need to log in and add a TXT record. The Google tooling walks you through this. This process can take anywhere from a couple of minutes to a few hours.

Once ready, you will be able to proceed by adding mappings.

And then finally getting to update your zone file on your DNS provider.

Once that is completed, you will need to wait up to 24hrs for DNS to propagate.

SSL and HTTPS serving

While this is propagating, you can start to set up an SSL certificate. I generally like to use Let’s Encrypt these days given the capabilities, community involvement, et cetera. Not to mention it is free (but you should really donate)!

I set up an SSL certificate and, in addition, wanted to enable HTTPS-only such that users will be redirected if they try to access HTTP. This is actually quite easy, but we have to set up the certificate first.

The easiest way to set up a certificate is with certbot (offered by the EFF for free too; you should really donate)!

On your local machine, issue a command like the following (obviously filling in your domain{s}):
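For example (manual mode lets us serve the challenge from our own app):

```sh
certbot certonly --manual --preferred-challenges http \
  -d simplebirdmail.com -d www.simplebirdmail.com
```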

Now we need to set up the serving of this file for verification. The easiest way to do this is to modify our app.yaml to add a handler.

Add the handler like so:
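Something like this (the token path and filename come from certbot’s output):

```yaml
- url: /.well-known/acme-challenge/FIRST_TOKEN
  static_files: acme01.txt
  upload: acme01.txt
```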

And write an acme01.txt and acme02.txt file with the content from your console. After deploy, you should be able to see the contents of this file at that URL.

As in the example case, we have two domains, so we will need to do this for a second file as well. After you click Enter the first time, you will notice a new path and contents that you need to write. Add an additional handler, text file, and re-deploy. After doing so, you should be able to click Enter to continue.

If you did that correctly, you should see certbot generate a key and csr locally:

Before uploading the files to App Engine, we need to run one more command: converting the key to RSA format. In addition, let’s copy the full chain to our current directory.
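For example:

```sh
# Convert the private key to RSA format and grab the full chain.
sudo openssl rsa \
  -in /etc/letsencrypt/live/simplebirdmail.com/privkey.pem \
  -out privkey_rsa.pem
sudo cp /etc/letsencrypt/live/simplebirdmail.com/fullchain.pem .
```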

We will now upload (and then delete local copies) these two files in the Settings > SSL Certificates > Add a new SSL certificate dialog. Give the certificate a name, and choose (or paste) the fullchain.pem in the first box, and private key (RSA format) in the second box.

Once that is done, check the boxes next to your domain name(s) to enable the certificate and choose Save.

If that all went to plan, you can now view your site over HTTPS.

Zone apex
WWW

Now that we have HTTPS enabled, let’s turn off non-HTTPS and redirect users, because a safe web is a happy web. This is incredibly easy with App Engine: simply add secure: always to your static and root handlers in app.yaml as such:
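```yaml
handlers:
- url: /static
  static_dir: static
  secure: always
- url: /.*
  script: main.app
  secure: always
```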

This (as documented here) handles the redirection for you. After you deploy, you should notice that if you browse to non-HTTPS, you get 302'd to HTTPS:

Dev console

Recap

At this point we have a working frontend and proxy service running on App Engine. We are HTTPS-enabled and redirecting users. It is time to put down the frontend and start to work on the backend. The backend in our case will be a Scheduled Job (e.g. Cron) on App Engine (doc here).

Just before we switch to working on this, we should probably update our initial rough architecture diagram. We ended up not implementing KMS yet but we plan to use it on the backend to store/fetch our Twitter and MailChimp API keys. Related to keys, it is important to add a backlog item to switch from storing the Proxy API credentials for Twitter in a non-checked-in JSON file to something better (in addition to restricting the API). We also ended up having to host an open Twitter API proxy. Here is an updated diagram:

Current backlog list at this point:

  • [proxy] Secure the API proxy
  • [proxy] Move credentials in API proxy to KMS
  • [frontend] More UI cleanup / work
  • [frontend] Minify / cleanup CSS/JS to limit request size
  • [frontend] Favicon route + file
  • [both] Add logging/monitoring/metrics

Backend

To start, let’s clone the sample project (here) and take a look around inside the gae directory. It is a webapp2 project with a cron configuration that calls a URL. We have several moving pieces already, so we probably want to stick to Flask but we can certainly adapt this project.

Let’s copy the gae directory into a new directory in our repo for these scheduled jobs.

We don’t have a use for the PubSub feature at this time, so remove that. We should add a service name to the app.yaml. The current functionality is quite simplistic so this should be pretty easy to convert to a Flask project. We will be adding the login: admin setting in the events handler (documented here).

Note: the admin login restriction is also satisfied for internal requests for which App Engine sets appropriate X-Appengine special headers.

Note: for some reason, the first service deployed to a project *must* be named default on GAE. I hate this, but when you create a new project and try to use a non-default name for the service, it errors. So if you are following this tutorial, your frontend will always be called default and there is nothing you can do about it at this time.

We will revisit the details of this (specifically the headers) at a later time.

Setting up a couple test endpoints and converting to a Flask application is pretty quick:
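A minimal sketch of the converted service (endpoints are placeholders at this stage):

```python
# main.py — the sample's webapp2 handlers converted to a minimal Flask app
from flask import Flask

app = Flask(__name__)


@app.route('/')
def index():
    return 'Simple Bird Mail scheduled jobs'


@app.route('/run')
def run():
    # The cron entrypoint; the real work is filled in below.
    return 'OK'
```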

A note about the cron.yaml: you also have to specify a target parameter, as discussed here, if you don’t want it to use the default application. This is our case and will likely be common for others with multiple applications on GAE.

The updated cron.yaml is now:
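Something like this (the cron service name is a placeholder):

```yaml
cron:
- description: mailer job
  url: /run
  schedule: every 24 hours
  target: cron
```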

We could have added the configuration and endpoints to the application we are calling the frontend but since in this case there is no shared code between the two, it doesn’t really make sense and isn’t needed. I also wanted to provide a more practical example because organizations that do use GAE will likely have hundreds of applications and will need to specify targets.

Let’s get started by making sure we can test locally with dev_appserver.py app.yaml.

After launching, you will need to check the box to login as an admin (remember we set login: admin in app.yaml). And after that we should see our simple header line.

Great! Before we get super deep, we need a high-level diagram with pseudo-code to understand what we need to do here. What we will build will look something like this:

Rough Flow of Backend

The bottom right is TBD and is likely pretty involved but we can get all of the data fetching and parsing setup.

First interface: MailChimp API. The documentation at http://developer.mailchimp.com/ is very well put together and it makes browsing and finding what you need fantastic.

Let’s start with getting our API key inserted and fetch’able from Google KMS and making/printing a single API call to MailChimp. The documentation for KMS is located here but I will walk through the basics.

We are going to create a KeyRing (a grouping of CryptoKeys), a CryptoKey, and set up a Service Account to decrypt from our application. Although a separation of duties is recommended (see here), I am one person and protecting myself from myself doesn’t make much sense in this context.
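Roughly (names are placeholders; depending on your SDK version these may live under gcloud beta kms):

```sh
# KeyRing and CryptoKey
gcloud kms keyrings create simplebirdmail --location global
gcloud kms keys create mailchimp --location global \
    --keyring simplebirdmail --purpose encryption
```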

And a service account:
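For example:

```sh
gcloud iam service-accounts create simplebirdmail \
    --display-name "Simple Bird Mail"
```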

To grant this service account access to encrypt/decrypt, we need to perform the following:
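Something like this (substitute your project ID):

```sh
# Grant the service account encrypt/decrypt on the CryptoKey.
gcloud kms keys add-iam-policy-binding mailchimp \
    --location global --keyring simplebirdmail \
    --member serviceAccount:simplebirdmail@YOUR_PROJECT.iam.gserviceaccount.com \
    --role roles/cloudkms.cryptoKeyEncrypterDecrypter
```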

Location is set to global in our case. We could restrict it further but there just isn’t a great need to do so at this time for our fun app.

With a service account set up and appropriate access granted, we need to create a service account key so that we can utilize it outside of Google Cloud Platform (to encrypt).
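For example:

```sh
gcloud iam service-accounts keys create key.json \
    --iam-account simplebirdmail@YOUR_PROJECT.iam.gserviceaccount.com
```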

Let’s add the MailChimp API key to KMS using the Service Account we created. In the python-docs-samples repo (previously cloned) there is a small KMS client that can be used for this purpose: doc.

Create a small shell script as in the aforementioned doc that looks something like this:
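A sketch (the snippets.py argument order follows the sample client in python-docs-samples; adjust names to your own):

```sh
#!/bin/sh
# Encrypt the value passed in, then decrypt it again to verify round-trip.
echo -n "$1" > /tmp/test_file.txt
python snippets.py encrypt YOUR_PROJECT global simplebirdmail mailchimp \
    /tmp/test_file.txt /tmp/test_file.encrypted
python snippets.py decrypt YOUR_PROJECT global simplebirdmail mailchimp \
    /tmp/test_file.encrypted /dev/stdout
rm /tmp/test_file.txt
```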

Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of your key.json file you previously created:
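```sh
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
```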

And run it (with your MailChimp API key echo’d)!

It should echo out the key you echo’d in. Now we can include this secret (/tmp/test_file.encrypted) in our project, store it in Google Cloud Storage (GCS), or do whatever we would like with it. Let’s store the file in GCS.

We will be switching from the gcloud to gsutil command-line tool for interacting with GCS which you should have installed if you previously installed the SDKs.

Note: It seems like every Public Cloud initially has this problem of separate command line tools. Granted, AWS was far worse, with one for every API call (exaggeration), but it still keeps occurring.

Create a bucket, copy the file, and move it to the desired location as follows:
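For example (the bucket name is a placeholder):

```sh
gsutil mb gs://simplebirdmail
gsutil cp /tmp/test_file.encrypted gs://simplebirdmail/
gsutil mv gs://simplebirdmail/test_file.encrypted \
    gs://simplebirdmail/static/mailchimp_api.encrypted
```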

In this case I am showing the path static/mailchimp_api.encrypted but it can be wherever you had your small script write the file (/tmp/test_file.encrypted if using the example).

For reading from GCS we have a couple options but the google-cloud python library looks pretty good (here). We are going to read the file we uploaded, which is very small, and would like it back as a string. This library has a download_as_string() function, fantastic. The blobs docs are located here.

Here is what we will need to do:
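A sketch (the bucket name is a placeholder):

```python
from google.cloud import storage

BUCKET = 'simplebirdmail'


def get_encrypted_secret(filename):
    """Fetch an encrypted secret from GCS and return it as a string."""
    client = storage.Client()
    bucket = client.get_bucket(BUCKET)
    return bucket.blob(filename).download_as_string()
```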

Adding the from google.cloud import storage at the top and the BUCKET and filename, and we should be good to go.

We also need to update our requirements.txt:
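Adding the storage client (the exact package name may vary by SDK era):

```
flask
google-cloud-storage
```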

Deploy and wahhhhla! You can see the encrypted content by visiting your /test endpoint confirming that you can successfully access your object from GCS as a string.

Another gotcha with Scheduled Jobs on GAE is that you have to specify the cron.yaml file on deploy. The job will not get scheduled if you do not. Just add it at the end of your gcloud app deploy so it becomes gcloud app deploy cron.yaml, as documented here.

Let’s get that string decrypted so that it can be used to call the MailChimp API. We will be adding the google-api-python-client requirement to get access to the discovery packages (doc). In addition, we will add the mailchimp3 import which provides a nice wrapper around their API.

We will need to grant access for the App Engine Service Account to access our keys as follows:
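The App Engine default service account is PROJECT_ID@appspot.gserviceaccount.com, so something like:

```sh
gcloud kms keys add-iam-policy-binding mailchimp \
    --location global --keyring simplebirdmail \
    --member serviceAccount:YOUR_PROJECT@appspot.gserviceaccount.com \
    --role roles/cloudkms.cryptoKeyEncrypterDecrypter
```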

Updating our main.py to decrypt and make an initial API call:
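A sketch (project/key names are placeholders; mailchimp3 takes a username plus the API key):

```python
import base64

from googleapiclient import discovery
from mailchimp3 import MailChimp

KEY_NAME = ('projects/YOUR_PROJECT/locations/global/'
            'keyRings/simplebirdmail/cryptoKeys/mailchimp')


def decrypt_secret(ciphertext):
    """Decrypt a KMS-encrypted blob and return the plaintext."""
    kms = discovery.build('cloudkms', 'v1')
    body = {'ciphertext': base64.b64encode(ciphertext).decode('utf-8')}
    response = kms.projects().locations().keyRings().cryptoKeys().decrypt(
        name=KEY_NAME, body=body).execute()
    return base64.b64decode(response['plaintext'])


@app.route('/test')
def test():
    api_key = decrypt_secret(
        get_encrypted_secret('static/mailchimp_api.encrypted'))
    client = MailChimp('anystring', api_key)
    return str(client.lists.all(get_all=True, fields='lists.name,lists.id'))
```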

Your browser should now show a string (that is from the JSON that was returned):

The next step is to start to return the Tweets for all of the subscribers. This will be a request to our API Proxy for each handle. We should also parse and just return the Tweet ID strings since that is probably all we will need. This is from the /statuses/user_timeline.json endpoint (doc here).

First, adding requests to our requirements.txt:

And main.py:
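A sketch of the new pieces (the proxy host and the get_handles helper are hypothetical; the handles would come from the MailChimp subscriber list):

```python
import requests
from flask import jsonify

PROXY = 'https://twitter-proxy-dot-YOUR_PROJECT.appspot.com'
TWEET_COUNT = 50


def get_tweets(handle):
    """Fetch recent Tweets for a handle via the API proxy."""
    resp = requests.get(
        PROXY + '/1.1/statuses/user_timeline.json',
        params={'screen_name': handle, 'count': TWEET_COUNT})
    return resp.json()


def parse_tweets(tweets):
    """Return just the Tweet ID strings for now."""
    return [tweet['id_str'] for tweet in tweets]


@app.route('/run')
def run():
    # get_handles() (hypothetical) would pull handles from the MailChimp list.
    results = {h: parse_tweets(get_tweets(h)) for h in get_handles()}
    return jsonify(results)
```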

Several additions have now taken place. There are a couple new functions that call the API Proxy, there is a bunch of iterating, parsing, and jsonify has been added to make the responses easier to view while we are still developing.

If you re-deploy and go to your cron service’s /run in your browser, you will get something like the following:

Response

These are the past 50 Tweet IDs (see the constant in main.py) by screen_name for each handle set by each subscriber; pretty much everything we need at the moment to start to put together email templates in MailChimp.


Email Templates

Email is tricky. Specifically, Email clients and templates are tricky. There are entire companies built around making Emails look good and templat’ing them with various value-added services. The only client that I will focus on for shipping the MVP is the Google Mail Web UI in a reasonably wide format (> 800px). If I can get that to work reasonably, I will call that good enough.

Unfortunately, MailChimp doesn’t have a simple social merge tag to include any random Tweet (they do offer some tags, but not for this purpose). This missing feature alone (the ability to embed a Tweet via a tag/URL in an email) will take nearly the remainder of the tutorial.

Pretty much the only way to use the MailChimp integration to send these mails will be to use the Mandrill API (docs). You could of course abandon MailChimp at this point and set up your own form, or use SendGrid, Mailgun, et cetera. That wouldn’t be quite as ideal for analytics tracking or for keeping things simple for my use case (obviously debatable), and generally I haven’t used the MailChimp or Mandrill APIs, whereas I have used SendGrid and Amazon SES and would like to learn something new.

To add the free trial of Mandrill to your MailChimp account, follow the guide here. Head over to Profile > Account > Transactional > Add Mandrill > Start Trial. With the Free Trial you get 2,000 emails, which should be plenty for learning about it and deciding if you want to stick with it (and pay) or switch to a cheaper solution.

Once you enable the trial, you will need to set up your domain by clicking on Launch Mandrill. Add your domain by entering it and selecting + Add:

Setup Domain

Start with the View details under Verified Domain to enter an Email address and receive a verification:

Verify Domain

Once you click the link from the Email, you should see that it is now verified:

Verified Domain

Next you will need to setup DKIM so that Mandrill is authorized to send mail for you. Open the dialog and copy this TXT record into your DNS provider’s settings:

DKIM Record

In addition, you will also want to add/modify a/your SPF record by following the instructions under Show DKIM/SPF Setup Instructions.

If both are added and have propagated after some time (typically minutes to hours, depending on TTLs), you should be able to look up both records:

Verification

To use Templates in Mandrill, you will likely want to choose Handlebars templat’ing (doc here). It is fairly standard and will provide the easiest and most flexible template structure.

Under Sending Defaults in Mandrill, we will want to set HTML only, no Inline CSS (this is critical for success with the GMail WebUI later), and to Add Unsubscribe Footer. We can change all of this later but it will be a good start:

Sending Defaults

Under SMTP & API Info, create an API Key for Mandrill:

Add API Key

Under the Outbound menu bar on the left, head over to Templates to create our template. Give it a name, from address, subject, et cetera. All can be changed later.

The big question is, what do we put in the template? We can’t use the common Twitter embed options like described on https://publish.twitter.com because they require loading of widgets.js, and Email + Javascript do not go well together. We can’t use the standard Tweet embed code (Tweet > Embed Tweet) because it also relies on Javascript. We can’t use the oEmbed API (link) either because it too requires Javascript.

To follow the brand guidelines (here), if we can’t use all of the aforementioned options, we should probably just look at the source of one of the Emails that Twitter already sends. If you look at the source of one of the mails, it may appear a bit daunting but it actually isn’t:

Email Source

If you strip away the ancient encoding, remove the plaintext version at the top, and reformat the HTML and CSS a bit, it is actually pretty usable. What we need to do is come up with a template and format so that we can use it in Mandrill.

I won’t bore this walkthrough with the process by which this was completed partially because it just isn’t that interesting and partially because I don’t know what I would do other than video record a sped-up version of restructuring text in a non-understandable way. It took a little over an hour and it was terribly boring. The steps were roughly: remove encoding, cleanup and properly space the HTML, add newlines to see what was actually happening, and simplify the styling where possible. The full template is published on GitHub here.

There are three parts to the Email Template that we will build:

  1. User (the Twitter user)
  2. Tweet Content (the Tweet)
  3. Actions (Reply, Retweet, Favorite)

Kindly, whoever built this Email left comment tags throughout that already help delineate these pieces.

Several new fields will need to be parsed in our parse_tweets function in main.py. We will need the following to start:

  • Profile image (user.profile_image_url_https)
  • Retweet count (retweet_count)
  • Favorite count (favorite_count), not to be confused with user.favourites_count, which is a User’s like count. For some reason this is not documented in statuses/user_timeline (doc).
  • Verified (user.verified)
  • Name (user.name)

Our current structure is a dictionary of handle arrays. We will need a little bit more structure instead of an array and will store each Tweet ID inside of another dictionary along with the above data.

The function parse_tweets will turn into this for now:
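A sketch:

```python
def parse_tweets(tweets):
    """Return a dict keyed by Tweet ID with the fields the template needs."""
    parsed = {}
    for tweet in tweets:
        parsed[tweet['id_str']] = {
            'text': tweet['text'],
            'name': tweet['user']['name'],
            'verified': tweet['user']['verified'],
            'profile_image': tweet['user']['profile_image_url_https'],
            'retweet_count': tweet['retweet_count'],
            'favorite_count': tweet['favorite_count'],
        }
    return parsed
```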

This gives us a data structure that looks like this:

Data Structure

Now that we have the data that we need, we need to use Handlebars templat’ing to put together a template based on the aforementioned Email Template process (doc here).

We did just create a super nice new data structure but we will need to change it a bit more to be able to call the Mandrill API. We will be using merge_vars which are per-recipient values that we can set (doc here). For each rcpt (recipient Email address) we will pass a vars array in with name and content for each variable we would like to pass to the template.

It looks like content at least supports nested dictionaries, so there is that. In our Email template, you can imagine a structure roughly as follows:

Handlebars Template Structure (rough).

To prove out the idea, we will start with a rough template that just shows we can set and pass the appropriate values. We will worry about styling (discussed a little previously) with the base from an actual received Email as phase two.

To test a template, we should setup a small test script that we can use to our delight until we are ready to add it to our cron task in GAE. To keep it extra simple, this will just be a small script that we can execute locally and not have to keep re-running/deploying on GAE. This will allow us fast iteration on a test template and the data that we will be passing as merge_vars once we are ready.

Quick iteration is absolutely critical for a 24hr project. If you can find a way to iterate quickly and not wait on infrastructure shenanigans, do it! :)

The test template that I put together looks like this:
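Roughly along these lines (variable names mirror the data structure above):

```handlebars
{{#each tweets}}
  <p>
    <img src="{{profile_image}}" width="24"> <strong>{{name}}</strong>
    {{#if verified}}(verified){{/if}}<br>
    {{text}}<br>
    RT: {{retweet_count}} Fav: {{favorite_count}}
  </p>
{{/each}}
```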

Publish it in Mandrill like so:

Mandrill Test Template

To send this template to a test user (you), you can use a small test script as follows:
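A sketch (API key, template name, and addresses are placeholders):

```python
# mailme.py — a small local test script for iterating on the template.
import mandrill

MANDRILL_API_KEY = 'YOUR_KEY'
TO_EMAIL = 'you@example.com'

client = mandrill.Mandrill(MANDRILL_API_KEY)

message = {
    'to': [{'email': TO_EMAIL}],
    'merge_language': 'handlebars',
    'global_merge_vars': [
        {'name': 'frequency', 'content': 'daily'},
    ],
    'merge_vars': [{
        'rcpt': TO_EMAIL,
        'vars': [{
            'name': 'tweets',
            # Empty strings act as falsey values in Handlebars conditionals.
            'content': [{'name': 'drew', 'text': 'hello world',
                         'retweet_count': 1, 'favorite_count': 2,
                         'profile_image': '', 'verified': ''}],
        }],
    }],
}

result = client.messages.send_template(
    template_name='simplebirdmail-test', template_content=[], message=message)
print(result)
```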

You need to install the mandrill package for Python locally (you could of course also use a virtualenv):
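```sh
pip install mandrill
```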

And fill in the couple variables at the top of the small test script. You can run it with just a python mailme.py and it should send you an email that looks like this:

Email Screenshot

Notice a couple things about the template and the script. We have some variables as global_merge_vars and some as merge_vars and the empty strings are used as false’y with Handlebars.

Now we know (roughly) the data structure we will use so we will move our parse_tweets function to return data in this format. While we are making this change, we will also discard any Tweets that were created greater than 7 days ago since we will have a daily and weekly frequency mailer and there is no need to have those stored. We will finesse this later.

We will be adding one requirement at this phase on python-dateutil since it will give us the magic parser function for dates.
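A sketch of the recency check (Twitter’s created_at format parses cleanly with dateutil):

```python
from datetime import datetime, timedelta

from dateutil import parser
from dateutil.tz import tzutc


def is_recent(created_at, days=7):
    """True if the Tweet (e.g. 'Wed Aug 27 13:08:45 +0000 2008') is
    newer than `days` days."""
    return datetime.now(tzutc()) - parser.parse(created_at) < timedelta(days=days)
```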

This is a great time to note that you must update your requirements.txt when you add a dependency. Otherwise on application startup, you will see something like this (for example, if you forgot to add python-dateutil):

Debugging

We are going to take a slight detour into debugging GAE applications since, at this point, our application has grown to a couple hundred lines and you will likely start to hit (as I did) the occasional An internal error occurred. after a deploy.

GAE has some pretty easy to use debugging tools. The foremost is the Stackdriver request and log viewer. From App Engine > Services select your service under Diagnose > Tools > Logs:

Diagnose through Logs

The logs are nicely icon’d with Info (blue), Any (no color), and Error (orange). You can expand each and use the Show More or Show All buttons to view more of what was logged:

Log view

Anything that you print() will be visible in these logs, which can be quite helpful. It can be useful to add a print() statement before the point where your application is having a problem to see what is happening and where. It usually takes ~5 seconds after making a request for the log to be visible.

In addition to viewing logs through the console, you can also view them on the command line, which in many cases is preferred and absolutely fantastic! This is done through the gcloud beta logging command. There is documentation available here, but here are some examples below, since the tab completion and help text for the command appear a bit broken at the moment.

To figure out your logName run the list command first:
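```sh
gcloud beta logging logs list
```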

The part that is confusing with these commands (at least to me) is that some are prefixed with the “logs” group and some commands are not (such as “read”).

To quickly read your failure logs that you want to debug you can use the jq utility which is available from their site or from brew (brew install jq).

Adding a couple parameters, you can look at just the 500s and just the log lines:
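For example (the logName is a placeholder; note the URL-encoded slash):

```sh
# 500s only, app log lines only
gcloud beta logging read \
  'logName="projects/YOUR_PROJECT/logs/appengine.googleapis.com%2Frequest_log"
   AND protoPayload.status=500' \
  --limit 10 --format json | jq '.[].protoPayload.line'
```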

In addition to the log diving, you also have a live debugger available under the Tools > Debug menu. For our application, this will not work at this stage for various reasons but could be made to work in the future.

In addition there is the Stackdriver Error Reporting console, which gives a more holistic, dive-in view. I have not found it as useful as pulling logs, but I can imagine it might be helpful if you have a large application:

Stackdriver Error Reporting

From this view you can see at a glance each application (or all) on GAE, status codes, and error groups, and you can even turn on notifications, mute a particular alert, or link to an issue.

Once you dive into a particular alert, you get pretty much all of the log details, just in a more visual page:

Stackdriver Error Reporting

Any Scheduled Jobs (cron) that you have setup also have a special view under App Engine > Task queues > Cron Jobs. You get a high-level view of all your jobs:

Cron Jobs

And you also get one-click access to logs via Log > View:

Log View

Email Templates

Back to email templates and data formatting after our slight digression.

One of the more interesting formatting bits is that you must be aware of encoding, given that Twitter allows emoji / unicode characters in names and text (not in handles or URLs, it appears). We must be aware of this especially during string manipulation. I have purposely chosen a user with a special character in their name to show that this works.

Here is what we ended up with for making our data a bit more ready for the Mandrill API:
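A sketch (the structure mirrors the test script above):

```python
def format_data_for_mail(subscribers):
    """Build the recipient list and per-recipient merge_vars that the
    Mandrill API expects, from a dict of email -> parsed tweets."""
    recipients, merge_vars = [], []
    for email, tweets in subscribers.items():
        recipients.append({'email': email})
        merge_vars.append({
            'rcpt': email,
            'vars': [{'name': 'tweets', 'content': list(tweets.values())}],
        })
    return recipients, merge_vars
```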

We added a new format_data_for_mail function along with some changes to the parse_tweets function. We are creating the two structures we will need for the Mandrill API and the output looks as expected:

Email Template Formatted Output

With the data in a good and reasonable format, we can start to button-up a couple necessary items to make this functional. The first is to swap the profile image URL to the one we will use in the mailer. Instead of one with _normal in it, we will switch to the _reasonably_small variant with the replace string method.

Instead of this image:

https://pbs.twimg.com/profile_images/3756620250/89414fa14149981c265064e6c1084348_normal.png

We will use this image:

https://pbs.twimg.com/profile_images/3756620250/89414fa14149981c265064e6c1084348_reasonably_small.png

This will be done with the following:
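In parse_tweets, something like:

```python
profile_image = tweet['user']['profile_image_url_https'].replace(
    '_normal', '_reasonably_small')
```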

Next, we are a little tired of seeing the 404 status code for the favicon.ico file so we will fix that by adding a handler in our app.yaml and a placeholder image:
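For example:

```yaml
- url: /favicon.ico
  static_files: favicon.ico
  upload: favicon.ico
```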

Place a favicon.ico in the root of the cron directory and re-deploy. It can just be a single color square since it won’t be viewed by anyone (you can use something like http://www.favicon.cc/ if you wish).

Next, we would like to be able to display images (and later videos, et cetera) that are included with a Tweet inline in the mails. Reviewing the previously mentioned statuses/user_timeline documentation (here) you will notice that we don’t have any of the entities we might expect in the Tweet objects such as a media_url.

Although the aforementioned documentation on statuses/user_timeline doesn’t mention the include_entities parameter, if you pass include_entities=true you do appear to get the entities!

What this means is that you will get a media object that you can then use to expand and include in your eventual Email:
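The shape is roughly:

```json
"entities": {
  "media": [
    {
      "type": "photo",
      "media_url_https": "https://pbs.twimg.com/media/XXXXXXXX.jpg"
    }
  ]
}
```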

For right now we will start with the media_url_https and use that. This is under entities > media > media_url_https. We should also be careful with parsing since obviously only some number of Tweets will have this object.

We will add the following to the parse_tweets function:
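Something like:

```python
# Only some Tweets carry media; guard the lookup.
media = tweet.get('entities', {}).get('media')
if media:
    parsed[tweet['id_str']]['media_url'] = media[0]['media_url_https']
```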

Finally update our template in Mandrill to look something closer to the mail we would like to see. As mentioned previously, I am not going to do a walkthrough of creating this as it was extremely boring and it was a bunch of trial-and-error:

After copy/pasting that into a Mandrill template and updating the small mailme.py script you have been using to test with something similar to the following:

You should be able to generate a pretty good looking Email as follows:

Email Template Test

Wow, this looks pretty good! There are plenty of rough edges but for the most part we have it filling in all of the data, we have correct text and hyperlinks, and the correct retweet and favorite counts. This looks to follow the display guidelines pretty well too and far better than a number of sites that I have seen.

So, what is left? Let’s look at our previous backlog:

  • [proxy] Secure the API proxy
  • [proxy] Move credentials in API proxy to KMS
  • [frontend] More UI cleanup / work
  • [frontend] Minify / cleanup CSS/JS to limit request size
  • [frontend] Favicon route + file
  • [both] Add logging/monitoring/metrics

Let’s simplify this to the following launch criteria:

  • [proxy] Secure the API proxy: We can secure the backend proxy used for the mailers but there is essentially not much we can do for the typeahead since it must be publicly exposed. We could add some referrer checking and endpoint filtering but both will require forking the npm module and creating more work than we wanted to take on with this layer. It is quite important that we set the budget and daily limits previously so that if it ever gets hammered, we won’t get charged and it will just stop accepting requests. Maybe we will just lower these further prior to launch.
  • [proxy] Move credentials in API proxy to KMS: There is nothing publicly exposed and the credentials are separate from the ones used by the mailer (for rate-limiting and the general good practice of separation). There is a JS library, so it wouldn’t take too much to implement later. It would still be good to do at some point but will be passed on at this time.
  • We are going to forego the rest of the frontend tasks for now as those are super easy to revisit later. We already took care of the favicon annoyance!
  • We have a reasonable start to logging. We will forego monitoring and metrics at this time and defer to the built-ins.

This is a 24hr project, we can’t take on too much!

Cron Scheduling & Frequency

Now that we have a somewhat usable template and data to fill it we need to schedule the cron to match the desired frequencies and update our application to support the frequency paths. We will allow Daily and Weekly upon sign-up. We need to schedule for this and make our application understand which is being run.

We will first update our cron.yaml and then thread this through our small application:
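Something like this (schedule times and the cron service name are placeholders):

```yaml
cron:
- description: daily mailer
  url: /run/daily
  schedule: every day 14:00
  target: cron
- description: weekly mailer
  url: /run/weekly
  schedule: every friday 14:00
  target: cron
```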

Next, we will update our various functions to understand frequency.

Preface: I have wired this through in a very quick manner. This is neither clean nor optimal. If you are looking for beautiful code, look elsewhere please, seriously.
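The two endpoints end up looking roughly like this (run_mailer is a stand-in for the shared code path):

```python
@app.route('/run/daily')
def run_daily():
    return run_mailer('daily')


@app.route('/run/weekly')
def run_weekly():
    return run_mailer('weekly')
```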

Notice we now have two endpoints that tell us which mailer is which along with several frequency variables and arguments being passed around.

The new endpoints should render properly only with Tweets from the frequency and only with the Subscribers who have selected the particular frequency.

Mandrill Connection

The last piece of work before we call this somewhat “done” is to wire up Mandrill, starting with our starter code in the mailme.py script, and build that into our Scheduled Job (main.py).

First we will add our Mandrill API key and file to KMS by following the original steps back up in the Backend section: creating a new cryptokey (mandrill) and uploading the encrypted file to GCS. We will also grant the same IAM roles (giving encrypt/decrypt access to the Service Account). With this, we will update our credentials-fetching function for KMS to be more generic, as follows:
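A sketch (decrypt_secret is updated to take the key name as its first parameter):

```python
def get_secret(cryptokey, filename):
    """Fetch an encrypted file from GCS and decrypt it with the named
    CryptoKey (project/keyring names are placeholders)."""
    key_name = ('projects/YOUR_PROJECT/locations/global/'
                'keyRings/simplebirdmail/cryptoKeys/%s' % cryptokey)
    return decrypt_secret(key_name, get_encrypted_secret(filename))
```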

And update the calling functions to pass in the two parameters.

Our mail function will be fairly simple since we have already sliced the data as we required. It will look something like this:
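A sketch (the template name is a placeholder):

```python
def send_mail(frequency, recipients, merge_vars):
    """Send the templated mail via Mandrill for a given frequency."""
    client = mandrill.Mandrill(
        get_secret('mandrill', 'static/mandrill_api.encrypted'))
    message = {
        'to': recipients,
        'merge_vars': merge_vars,
        'global_merge_vars': [{'name': 'frequency', 'content': frequency}],
        'merge_language': 'handlebars',
    }
    return client.messages.send_template(
        template_name='simplebirdmail', template_content=[], message=message)
```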

And that is pretty much it, honestly.

If anyone wants the full repo with all sample files, it is on GitHub here. Note this needs minor changes to actually run. The locations are fairly easy to update.

Final Architecture

We ended up with a picture that looks something like this:

Architecture Diagram

Limitations

There are some pretty serious limitations to the scale of the current design given how quickly it was put together. Luckily, most of the pricing tiers of the services being used will be hit first:

  • Mandrill: 2,000 emails (free trial)
  • MailChimp: 2,000 subscribers (forever free)
  • Twitter: 1500 requests/15 minutes (not sure the process to raise)
  • Google Cloud Platform: I set a $100/day limit on GAE (and $100/month budget on GCP)

Architecturally there are several problems with scaling this; primarily with the setup of the Scheduled Job. It does all operations synchronously and therefore will not scale given the fetching and passing of various API requests. These would all need to be split up and threaded; likely some sort of constant polling and updating of a cache to store the data from the various APIs so it wouldn’t need to call them live. Re-writing it likely would make the most sense since it is also pretty messy at this point.

I am not worried about any of this at this time as I could quickly shut the whole thing off (it is one button in GAE, Disable Application, already tested).

Wrap-up

Well, this was pretty cool: in ~24hrs I learned a bunch of stuff on Google Cloud Platform: Google App Engine, Stackdriver, Google KMS, gcloud and gsutil, and various SDKs. In addition, we wired up the MailChimp, Twitter, and Mandrill APIs. This was not anything revolutionary, but it was a fun project to build and to attempt to document.

Feedback

I would genuinely be surprised if anyone made it this far but if you have I hope this was interesting, inspiring, and/or helpful in some way. Feel free to reach out to me on Twitter (@mediocrity) and let me know what you think.

Google Cloud Platform - Community

A collection of technical articles published or curated by Google Cloud Platform Developer Advocates. The views expressed are those of the authors and don't necessarily reflect those of Google.
