Searching the College de France — Part 3

Timothé Faudot
Sep 13, 2017 · 9 min read

This is the third part of a series about building a way to search audio transcripts of the College de France. If you haven’t already, you might want to start reading from part 1.

This part is about the frontend. You can check out the live result here:

I must admit I feel more confident talking about backend technologies than frontend ones, as I spend most of my time at work building backends: APIs, servers, batch jobs, pipelines, etc.

Frontend is not my territory, but then again the whole point of this exercise is to learn. So here is what I built: a website, hosted on GCP in the same cluster that the rest of our jobs run in, built with Angular 2, TypeScript and Golang, with a domain name bought for it and traffic encrypted using Let’s Encrypt.

Let’s go through all of that in this post!

Building the frontend in Angular 2 + TypeScript + Golang

I chose Angular over React for the simple reason that I had already built a few internal frontends at work using Angular 1 and I wanted to see how Angular 2 changed… and I was not disappointed! To say the least, it changed a lot, and overall for the better.

I started with a regular download of NodeJS, followed by the Angular CLI, and then I was left with an app that I could hack on fairly quickly. The official tutorial is one of the best tutorials I’ve seen online: it is fairly comprehensive and leads you towards good patterns that stay in your head after you’ve finished it. Once I was done with the tutorial I was fairly confident I could build the app I wanted, which is usually a good sign.

To get started, just follow these simple steps:

# Download the angular command line interface
npm install -g @angular/cli
# Create a new app
ng new my-app
# Switch to the newly created directory and serve the app on localhost:4200
cd my-app
npm start

TypeScript

I wasn’t fully up to date with the latest ECMAScript 2015 and 2016, but it turns out I didn’t really have to care, because by default Angular supports TypeScript as a language! After doing the official Angular tutorial I had the basics in my head, and I didn’t find it hard at all to build the rest of the app; it was a breeze, in fact. This language makes a lot more sense than plain Javascript. It does require a compiler (targeting ES2016 in my current settings), but the compilation overhead (~10s on my MacBook Pro initially, then ~1 second for small incremental changes) is more than compensated by the time usually lost in Javascript debugging why an undefined/null/typo error happened at some point. In fact, while I maybe spent a few hours building this frontend, I didn’t get a single undefined error during the whole time, which I find just amazing.

Proxying API to a Golang server

The JS part of the frontend talks to a server written in Golang which in turn talks to the Google cloud services like datastore, logging, etc.

By default, the dev server that npm start (or ng serve) launches is not suitable for production, so we need a proper server. The Golang standard HTTP server fits a production environment with a few tweaks and is consistent with the rest of my cluster jobs, so I went with it. To run the frontend and the Golang server I simply do:

term1> npm start
term2> go run server/server.go --address=127.0.0.1:8080

We also need the calls the frontend makes to the /api/ endpoints to be proxied to that server in our dev stack. We can do this via a proxy file named proxy.config.json, created at the root of our frontend, that contains:

{
  "/api/*": {
    "target": "http://localhost:8080",
    "secure": false,
    "logLevel": "debug"
  }
}

Then in our package.json we can change the start command to take this proxy into account:

"scripts": {
  ...
  "start": "ng serve --proxy proxy.config.json",
  ...
}

Now when we start the dev server via npm start, all the calls made to /api/ are proxied to our Golang server on localhost:8080, without us having to care about setting up CORS or anything like that!

> ng serve --proxy proxy.config.json
** NG Live Development Server is listening on localhost:4200, open your browser on http://localhost:4200 **
10% building modules 3/3 modules 0 active
[HPM] Proxy created: /api -> http://localhost:8080
[HPM] Subscribed to http-proxy events: [ 'error', 'close' ]
Hash: e7572b59f604d12c9028
Time: 20830ms
chunk {0} polyfills.bundle.js, polyfills.bundle.js.map (polyfills) 178 kB {4} [initial] [rendered]
chunk {1} main.bundle.js, main.bundle.js.map (main) 31.1 kB {3} [initial] [rendered]
chunk {2} styles.bundle.js, styles.bundle.js.map (styles) 44.9 kB {4} [initial] [rendered]
chunk {3} vendor.bundle.js, vendor.bundle.js.map (vendor) 4.14 MB [initial] [rendered]
chunk {4} inline.bundle.js, inline.bundle.js.map (inline) 0 bytes [entry] [rendered]
webpack: Compiled successfully.
[HPM] GET /api/lessons?cursor=&filter= -> http://localhost:8080

You can see on the last line of this sample terminal log how the API requests are logged when they go through the proxy we set up.

Alright, now we have all the setup we need, and all that’s left is actually coding the frontend :-) I am done with the MVP, which supports browsing the lessons and searching for terms using the simple Elasticsearch query syntax. There are still a bunch of open issues that I will be working on next, but for now I believe it is important to put something out first, then iterate.

Buying the domain name and setting up DNS records

The next thing I did was buy a domain name. I already had another domain bought at Gandi, so I just went there and bought college-audio.science for 1,000 yen (~10 USD/EUR) for 2 years, which was a pretty good deal because they had a sale on the .science domains at that time.

To set up a DNS record, we need a static IP to point it to. Fortunately, when you set up a load balancer in GCP, you can tell your ingress to use a specific static IP; this is done by first reserving one:

gcloud compute addresses create fe-static-ip --global
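
Once reserved, you can print the address itself, which is handy later when creating the DNS A record:

```shell
# Print just the reserved IP address.
gcloud compute addresses describe fe-static-ip --global --format='value(address)'
```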

GCP only charges you if you leave your static IPs unused, so we’ll use ours right away in our ingress config:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: fe-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: fe-static-ip
...

Now that we have a static IP our frontend is served from, we can point our DNS at it directly in Gandi, or use Google’s own highly replicated DNS servers, which is what I did. The cost is really low for such a small project (a few cents per month), updates propagate significantly faster, and the records are replicated all around the world. Here is the dig result of my setup:

> dig college-audio.science

; <<>> DiG 9.8.3-P1 <<>> college-audio.science
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22543
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;college-audio.science. IN A
;; ANSWER SECTION:
college-audio.science. 300 IN A 35.186.210.248
;; Query time: 88 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Wed Sep 13 15:14:51 2017
;; MSG SIZE rcvd: 55

And on the Gandi side, I told it to use Google’s DNS servers via an NS config that points to these servers:

ns-cloud-c1.googledomains.com.
ns-cloud-c2.googledomains.com.
ns-cloud-c3.googledomains.com.
ns-cloud-c4.googledomains.com.
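
To double-check that the delegation is in place, a quick dig for the NS records should come back with Google’s servers:

```shell
# +short prints only the record values.
dig NS college-audio.science +short
```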

Setting up encryption using Let’s Encrypt

The last part is obviously to set up encryption: we’re in 2017, and with Let’s Encrypt there is absolutely no reason not to. We’re using Kubernetes, as always, to take care of our cluster, so I used the amazing kube-lego to set it up.

All it took was to add a new deployment, a config map and a few annotations to my frontend ingress and I was done!

You can check out the full frontend deployment config

Here are the important bits:

Make sure to use the staging API of Let’s Encrypt first, by setting up your config map like this:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-lego
data:
  # Modify this to specify your address.
  # lego.email: "myemailaddress@myprovider.com"
  # This is the staging API.
  lego.url: "https://acme-staging.api.letsencrypt.org/directory"

I am using the GCP L7 load balancer for my ingress, not a custom nginx server in front of it, because I don’t need yet another layer. This is supported by kube-lego out of the box; just specify that you want the GCE ingresses only, via these environment variables in the kube-lego deployment:

- name: LEGO_SUPPORTED_INGRESS_CLASS
  value: gce
- name: LEGO_SUPPORTED_INGRESS_PROVIDER
  value: gce

Then we need our ingress to be marked as one that needs certs fetched via kube-lego:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: fe-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: fe-static-ip
    # Enable kube-lego automatic renewal of ssl certs via let's encrypt.
    kubernetes.io/tls-acme: "true"
    kubernetes.io/ingress.class: "gce"
spec:
  tls:
  - hosts:
    - college-audio.science
    secretName: fe-tls
  rules:
  - host: college-audio.science
    http:
      paths:
      - path: /*
        backend:
          serviceName: fe
          servicePort: 80

Once you apply the deployment for the first time, it will create the kube-lego service, create the necessary account on Let’s Encrypt, fetch a cert for the services that need it, and store it in a Kubernetes secret.

You can turn on the debug logs at first, when using the staging API, via the log level environment variable:

- name: LEGO_LOG_LEVEL
  value: debug

to verify that everything works as expected; then you can switch to the production API of Let’s Encrypt. When you do so, you must first delete the existing secrets that kube-lego created! Otherwise you’ll get obscure errors and lose time on Stack Overflow, or reading GitHub issues from people who made the same mistake, trying to understand what’s going on…
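
With the secret names from my config, that cleanup looks like this (adjust the names to your own setup):

```shell
# Remove the staging certificate and ACME account; kube-lego will
# recreate both against the production API on its next run.
kubectl delete secret fe-tls kube-lego-account
```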

If you’ve done everything right, you should now have a kube-lego deployment:

> kubectl get deployments kube-lego
NAME        DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kube-lego   1         1         1            1           13d

the corresponding secrets that contain the certificates and account information:

> kubectl get secrets
NAME                TYPE                DATA      AGE
...
fe-tls              kubernetes.io/tls   2         13d
kube-lego-account   Opaque              2         13d

the kube-lego GCE service:

> kubectl get services
NAME            CLUSTER-IP    EXTERNAL-IP   PORT(S)          AGE
...
kube-lego-gce   10.43.245.5   <nodes>       8080:30492/TCP   13d

and finally our ingress should now support TLS via port 443:

> kubectl get ingress
NAME         HOSTS                   ADDRESS          PORTS     AGE
fe-ingress   college-audio.science   35.186.210.248   80, 443   19d

All of that done for us by kube-lego, thank you Jetstack!

Tuning the Golang server

Finally, let’s see what we need to do to the Golang server to make it able to serve the Angular app. First things first, we need a proper router; we’ll use gorilla’s mux here and tell it to serve the files built via the command

ng build -prod

that end up in a /dist directory, using an http.FileServer:

r.Handle(
    "/{[a-z0-9.]+.(js|html|css)}",
    http.FileServer(http.Dir("dist"))).Methods("GET")

We also need to handle the URLs owned by the Angular router when they are requested before the application is actually loaded, e.g. when someone copy&pastes a URL.

The official docs say to redirect all such URLs to /index.html and let the Angular router pick them up, which is done via this router config:

appHandler := func(w http.ResponseWriter, r *http.Request) {
    http.ServeFile(w, r, "dist/index.html")
}
for _, route := range []string{"/search", "/lesson{*}", "/about", "/"} {
    r.Handle(route, http.HandlerFunc(appHandler)).Methods("GET")
}

It feels a bit brittle, as I have to keep my frontend URL paths in sync with the ones here, but because my app is so small right now it doesn’t really matter. For bigger apps, we’d want a proper catch-all that sends everything unknown to dist/index.html.

With that done, we can configure our server with proper request and response timeouts, as Golang doesn’t set any by default! We also use middlewares to log the requests and responses, and to gzip-compress that monstrous vendor.js blob generated by Angular…

srv := &http.Server{
    Handler: handlers.CombinedLoggingHandler(os.Stderr, handlers.CompressHandler(hsts.NewHandler(r))),
    Addr:    *hostPort,
    // Good practice: enforce timeouts.
    WriteTimeout: 15 * time.Second,
    ReadTimeout:  15 * time.Second,
}
log.Fatal(srv.ListenAndServe())
log.Fatal(srv.ListenAndServe())

The hsts mentioned in this snippet is a package I created to add HTTP Strict Transport Security: a way to tell a browser that this website supports HTTPS, so that even if it didn’t reach it over HTTPS this time, next time it definitely should (and it will; all modern browsers support it). This gives us a nice A+ score on SSL Labs :)

Screenshot of result from https://www.ssllabs.com/ssltest/analyze.html?d=college-audio.science

We could disable a few weak server-supported ciphers to push the score even slightly higher (TLS_RSA_WITH_3DES_EDE_CBC_SHA (0xa) is reported as WEAK), but that will be for next time. Right now, let’s just enjoy our encrypted frontend!

Screenshot of chrome developer tools Security tab. All green :)

The Chrome audit page tells me I still have a lot of things to tune to make my webapp progressive, but the rest is pretty satisfying so far:

Screenshot of chrome developer tool audits (lighthouse) overall results.

As always, the code of the frontend is available on Github:

In the next installment of this series, I’ll finally talk about the economics of running this website and how I plan to make it scale. Stay tuned!
