Adventures in Shibboleth and Nginx (Part 1 of 2)

How the Engineering Hub room booking system’s login system works, and how you, too, can build something cool with UCL authentication!

Chris Hammond
UCL API
Dec 30, 2016

Update: Part 2 is now available with full instructions on how to achieve all of this.

The white on black text is weirdly mesmerising after a while…

If you’ve been following along with TechSoc recently, you will have seen that we launched the Engineering Hub Room booking system over at enghub.io. If you’re a UCL Engineering student who has not seen this site yet, feel free to head over and take a look, as it might just save your group project by helping you to book a room so you can work together.

One of the key technical challenges with building this system was the authentication with UCL. As you may know, UCL uses a system called Shibboleth to handle Single Sign-On (SSO). Shibboleth is used by universities and businesses all around the world to give applications a common interface for securely authenticating users against either a central user database, or the user databases of a collection of connected Identity Providers (called a federation).

Shibboleth can be an absolute pain in the <insert your favourite swear here> <insert your favourite word for rear end here>. It has been labelled the “least fun thing in [his] professional career” by a DevOps professional. Yes, it really is that irritating.

The idea is that a developer can create an application (known as a Service Provider, or SP for short) that users at UCL (or at another participating institution with which encryption keys and metadata have been exchanged) can log into via the Shibboleth Identity Provider (IdP). The institution handles authentication, so students, staff and other employees can use their Active Directory/network logins to access resources without having to create a brand new account for each application. Additionally, the IdP can provide the application with extra information, known as attributes. In our case, we receive a list of the groups a member is part of, so that we can deduce whether or not a student is an undergraduate member of the Engineering Faculty, along with the user’s full name including any given names. I have provided my data as an example, because I don’t really mind you, as the reader, knowing my name. Be aware, however, that since personally identifiable information is being sent to the application, it must be stored and processed securely. The full list of attributes we receive is in the image below:

Almost everything at UCL uses Shibboleth, from the UCL Library’s online access to UCLU’s website. If you’ve bought membership for a society, rented a book online or viewed your personal timetable, you have used Shibboleth whether you knew it or not. It looks a bit like this…

UCL’s Shibboleth IdP Single Sign-On Service. Possibly the most common pages you’ll see whilst at UCL…

As I’m sure you can imagine, to get an old and convoluted system based entirely on page redirects to work with (our lovingly nicknamed) Roomie McRoomface, a documentation-first, fully RESTful, API-backed system proxied via Nginx, we had to jump through a number of hoops. Since I helped to jump through a lot of them, it seemed fitting that I should give back to the world some insight into how we got this done, along with some information about how you can do something similar.

This pair of posts will serve two purposes:

  1. Explain how our implementation works around the many pitfalls of Shibboleth with a “pretty” diagram or two
  2. Show you how you can set up your very own Nginx + Shibboleth setup (using only free and open source software, just to make the Stallman fans happy!)

Firstly, Wat?

A key limitation in Shibboleth is that there is one fairly fixed way of authenticating. I say fairly because there are options on how this process encodes or sends data, but the flow is pretty much static.

  • The user clicks login in the application they want to log into.
  • The application generates some sort of temporary token or state data.
  • This token / state data is somehow encoded into the callback URL.
  • The user is sent by the application to Shibboleth IdP (hosted by the authentication provider, e.g. UCL or your university/company).
  • The user logs into IdP.
  • If authentication is successful, the user will be sent back to the calling application using the callback URL provided by the application.
  • The application continues the login process from the state information encoded into the callback URL. User data is stored in HTTP headers which can be read by the server; these contain information such as the user’s username, full name, email address and any other fields the IdP elects to send on. This is set up on a per-application basis.
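To make the role of that temporary token a little more concrete, here is a minimal sketch of the round trip in Python. It is not taken from any real Shibboleth library: the function names, the `state` parameter name, the in-memory set and the example URLs are all assumptions for illustration. Only the `target` parameter name mirrors the login URL shown later in this post.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical in-memory store of logins that have started but not finished.
PENDING_STATES = set()


def begin_login(callback_base, sp_login_url):
    """Create a one-time state token and build the URL the user is sent to."""
    state = secrets.token_urlsafe(16)
    PENDING_STATES.add(state)
    # The state travels inside the callback URL...
    callback_url = f"{callback_base}?{urlencode({'state': state})}"
    # ...and the callback URL travels, URL-encoded, inside the login URL.
    return f"{sp_login_url}?{urlencode({'target': callback_url})}"


def finish_login(callback_url):
    """Recover the state from the callback URL and check we issued it."""
    query = parse_qs(urlsplit(callback_url).query)
    state = query.get("state", [""])[0]
    if state not in PENDING_STATES:
        raise ValueError("unknown or already used state")
    PENDING_STATES.discard(state)  # each state may only be used once
    return state
```

The point of the check in `finish_login` is that the application only continues a login it actually started, and only once.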

This login workflow is perfect if, for example, you write your application in PHP. This is because in PHP you have your frontend and backend coupled together and it’s easy (and completely okay, even from a UX point of view) to redirect your user around and store data behind the scenes for security. As PHP is losing traction, we need a better way of doing things that champions the backend and frontend separation whilst also supporting Shibboleth’s antics.

We opted to be super cool and use Django for a new-fangled RESTful API; however, this does not make using Shibboleth any easier. In fact, using Django and React together (for maximal disruption) meant we created an entirely decoupled frontend and backend. This is great, but it means we could not have used the traditional method of redirecting to/from a coupled PHP frontend and backend. Our solution to this is nifty, but takes a bit of head-scratching, hence a diagram!

Our Shibboleth Workflow

Yes, this does look like a mess, but I promise it’s sensible and more than one brain cell went into the creation of this monstrosity!

To explain a little more about what is going on here:

1. The user says they want to log in by clicking on a button on the React frontend:
Engineering Hub Login button

2. The frontend calls our user.login.getToken endpoint in the background, which generates a random sid starting with shib. (The name sid comes from a previous plan where this ID would constitute a session or token ID. This idea was later changed, but we kept the name because we all knew what the ‘sid’ was.)

3. The frontend is sent a JSON response like the following:

{
"stream_sub_url": "https://enghub.io/api/v1/push.subscribe/shibrwcybnywqzgcudjaqpthssqwfrxyxionjevxoibzgwtdomzrqhgoelvsqexi",
"stream_sub_lp_url": "https://enghub.io/api/v1/push.subscribe_longpoll/shibrwcybnywqzgcudjaqpthssqwfrxyxionjevxoibzgwtdomzrqhgoelvsqexi",
"loginUrl": "https://enghub.io/Shibboleth.sso/Login?target=https%3A%2F%2Fenghub.io%2Fapi%2Fv1%2Fuser.login.callback%3Fsid%3Dshibrwcybnywqzgcudjaqpthssqwfrxyxionjevxoibzgwtdomzrqhgoelvsqexi",
"callbackUrl": "https://enghub.io/api/v1/user.login.callback?sid=shibrwcybnywqzgcudjaqpthssqwfrxyxionjevxoibzgwtdomzrqhgoelvsqexi",
"sid": "shibrwcybnywqzgcudjaqpthssqwfrxyxionjevxoibzgwtdomzrqhgoelvsqexi"
}
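For the curious, a response of this shape could be put together with something like the following Python sketch. This is not our actual Django view: the function names are made up for illustration, and the choice of 60 random lowercase letters after the shib prefix is simply an assumption that matches the length of the example above. The URL layout is copied from the example response.

```python
import secrets
import string
from urllib.parse import urlencode

BASE_URL = "https://enghub.io"  # host taken from the example response
API_ROOT = BASE_URL + "/api/v1"


def new_sid(length=60):
    """Return 'shib' plus random lowercase letters (the length is an assumption)."""
    letters = "".join(secrets.choice(string.ascii_lowercase) for _ in range(length))
    return "shib" + letters


def login_payload(sid):
    """Build the dictionary that would be serialised as the JSON response."""
    callback_url = f"{API_ROOT}/user.login.callback?{urlencode({'sid': sid})}"
    # urlencode percent-encodes the whole callback URL so that it can sit
    # safely inside the target parameter of the login URL.
    login_url = f"{BASE_URL}/Shibboleth.sso/Login?{urlencode({'target': callback_url})}"
    return {
        "stream_sub_url": f"{API_ROOT}/push.subscribe/{sid}",
        "stream_sub_lp_url": f"{API_ROOT}/push.subscribe_longpoll/{sid}",
        "loginUrl": login_url,
        "callbackUrl": callback_url,
        "sid": sid,
    }
```

Calling `login_payload(new_sid())` gives a fresh set of URLs that all share the same sid, which is what ties the login tab and the push channel together.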

4. The frontend now does two things. Firstly, it opens up a new tab to the given loginUrl. Secondly, it creates a long polling connection to the URL given by stream_sub_lp_url. These sound like completely disparate operations, and indeed they are, but both are key. What we wanted to achieve was an environment where the user can log into UCL without ever leaving the application. We also wanted to think to the future: what if we create a native mobile app? Or perhaps a third party may wish to create a frontend better than ours? Both are possible scenarios, and they make us wonder whether it would still be okay to redirect the user around web pages. We think not. So what happens instead is that once the user logs into Shibboleth, a push stream message is sent out to the long polling connection. The frontend actually becomes a live stream listener, where the channel name is the sid. It’s basically a short-lived, one-way chatroom! Oh, and as you’ll see later, we do not have to pay for Pusher or a similar service to achieve this…
Why long polls? They’re dead simple to implement, but the library we used to make this work supports HTML5 WebSockets, too.
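To show just how simple a long poll is, here is a rough client-side sketch. Our real frontend is React, so this Python version is closer to what a third-party or command-line client might do, and the function names, timeout and retry count are all assumptions. The idea is only this: ask, wait, and ask again until something arrives.

```python
import json
import urllib.request


def fetch_once(url, timeout=30.0):
    """Make one long-poll request; return the body text, or None if nothing came back."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8")
    except OSError:
        # Timeouts and HTTP/network errors are all treated as "no message yet".
        return None
    return body or None


def wait_for_login(stream_url, fetch=fetch_once, max_attempts=20):
    """Re-issue the long poll until a message arrives on the sid's channel."""
    for _ in range(max_attempts):
        body = fetch(stream_url)
        if body is not None:
            return json.loads(body)
    raise TimeoutError("no login message received")
```

The `fetch` argument is only there so the loop can be exercised without a network connection; a real client would pass the stream_sub_lp_url from the JSON response and leave the default in place.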

5. As soon as the user completes the login in the Shibboleth tab, the browser is redirected to the user.login.callback endpoint (this callback address is URL-encoded into the target GET parameter of loginUrl). Note that this is happening in the secondary browser tab, and the main application window is still waiting on the long polling connection for authentication data. As soon as this callback endpoint is reached, a response like the following is received almost immediately by the frontend on the sid’s streaming channel:

{"id":1,"channel":"shibrwcybnywqzgcudjaqpthssqwfrxyxionjevxoibzgwtdomzrqhgoelvsqexi","text":"eyJlbWFpbCI6ICJ0ZXN0eHl6QHVjbC5hYy51ayIsICJ0b2tlbiI6ICJ3OXpzNHFmZWdzamNzYms0ZnZ5Y3pycWFxNXZtNGRocXE0N3Nta3ZzIiwgInJlc3VsdCI6ICJzdWNjZXNzIiwgIm1lc3NhZ2UiOiAiTG9naW4gc3VjY2Vzc2Z1bCIsICJncm91cHMiOiBbXSwgInNvY2lldGllcyI6IFtbIlRlY2ggU29jaWV0eSJdXSwgInF1b3RhX2xlZnQiOiAxMjF9"}

See that text data? That’s just a base64-encoded response sent from the server when triggered by the callback URL. When we base64-decode this (which is done by the frontend in the browser), we get data that looks like the following:

{"email": "testxyz@ucl.ac.uk", "token": "w9zs4qfegsjcsbk4fvyczrqaq5vm4dhqq47smkvs", "result": "success", "message": "Login successful", "groups": [], "societies": [["Tech Society"]], "quota_left": 121}
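Our frontend does this decoding in JavaScript in the browser, but the same step is easy to reproduce anywhere. The Python sketch below (the function name is just for illustration) takes the push message shown above and unpacks its text field:

```python
import base64
import json


def decode_push_message(envelope):
    """Return the JSON object that is base64-encoded in the message's text field."""
    raw = base64.b64decode(envelope["text"])
    return json.loads(raw.decode("utf-8"))


# The push message from the example above, split over lines for readability.
MESSAGE = {
    "id": 1,
    "channel": "shibrwcybnywqzgcudjaqpthssqwfrxyxionjevxoibzgwtdomzrqhgoelvsqexi",
    "text": (
        "eyJlbWFpbCI6ICJ0ZXN0eHl6QHVjbC5hYy51ayIsICJ0b2tlbiI6ICJ3OXpzNHFmZWdzamNz"
        "Yms0ZnZ5Y3pycWFxNXZtNGRocXE0N3Nta3ZzIiwgInJlc3VsdCI6ICJzdWNjZXNzIiwgIm1l"
        "c3NhZ2UiOiAiTG9naW4gc3VjY2Vzc2Z1bCIsICJncm91cHMiOiBbXSwgInNvY2lldGllcyI6"
        "IFtbIlRlY2ggU29jaWV0eSJdXSwgInF1b3RhX2xlZnQiOiAxMjF9"
    ),
}

login = decode_push_message(MESSAGE)
print(login["result"], login["email"])  # prints: success testxyz@ucl.ac.uk
```

Note that base64 is only an encoding, not encryption, so anyone who can listen on the channel can read the token.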

At this point, the login process is actually complete! The token can be used in all future requests completely RESTfully, and Shibboleth need never be touched again. Just for final effect, once the redirection to our callback URL has been performed and the data has been pushed to the React app in the browser, the callback tab is closed via JavaScript and React carries on as it did before. Using this method the user never has to close off the main application, and the main app never has to read any data from the popup window directly. The advantage here is that a mobile app could just pop up a small WebView for Shibboleth, and then that WebView can be killed as soon as the Shibboleth response is sent back via push.

Behind the scenes, however, that callback URL does a lot more than meets the eye. As Shibboleth is redirecting to it, it also attaches a bunch of extra information to the HTTP headers, including the username, email address and some signature data. These are the same attributes that were mentioned earlier in this article. The data is received by the Django backend and stored as necessary in the Django database. This is used to create an account for the user if one does not already exist, and the user is assigned an authentication token so that the React frontend can make future requests on the user’s behalf. This way we never see the user’s password, and the user never has to type their password into our app. They also never have to manually sign up or log in to an existing account; our Django code handles all of this based on the cn (essentially a fixed username / UCL email address in the seven letter format; mine is zcabcah).
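Stripped of all the Django machinery, the job of the callback can be sketched in plain Python roughly as follows. This is a simplified illustration and not our production code: the header names are hypothetical (the real ones depend on how the SP maps attributes), the users dictionary stands in for the Django database, the publish callable stands in for the push stream, and the token format is an assumption.

```python
import base64
import json
import secrets

# Hypothetical header names; the real ones depend on the SP's attribute mapping.
HEADER_USERNAME = "cn"
HEADER_EMAIL = "mail"


def handle_callback(sid, headers, users, publish):
    """Create or update the user, issue a token and push it to the sid's channel."""
    cn = headers.get(HEADER_USERNAME)
    if not cn:
        raise PermissionError("no Shibboleth attributes on this request")

    # Create the account on first login, reuse it on later logins.
    user = users.setdefault(cn, {"cn": cn})
    user["email"] = headers.get(HEADER_EMAIL, "")
    user["token"] = secrets.token_hex(20)  # token format is an assumption

    payload = {
        "email": user["email"],
        "token": user["token"],
        "result": "success",
        "message": "Login successful",
    }
    # The payload is base64-encoded before being pushed to the waiting frontend.
    text = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
    publish(sid, text)
    return user
```

Because the account is keyed on the cn, a returning user simply gets a fresh token rather than a second account.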

What you saw above is fairly unorthodox, and is not usually how we would recommend writing a login procedure. The restriction here, however, is that we need to authenticate twice. The user authenticates with the Django backend via Shibboleth, which in turn authorises the frontend using a token. To make things more complicated, Shibboleth must do a redirect which passes HTTP headers with raw user data. You cannot store keys and then use them to fetch more data later. We needed a way for the frontend to request a login from the backend, which in turn would request a login from UCL. The user must be present for both stages of this, as it’s the user that logs into UCL, and also the user’s frontend that logs into the backend with a token.

Hopefully this has provided some useful and/or interesting insight into why and how our implementation of Shibboleth works (and why on Earth we did it like that!). In the second and final part of this series I’ll show how this is all set up on a clean Linux box, and exactly how Nginx is configured to make this all work!

Thanks for reading, and do let me know if you want to learn any more, or if I made any mistakes!

Chris :)

P.S.: sincere thanks go out to Alfie Duffen, Wilhelm Klopp and Jaromir Latal for proofreading and guidance!

Computer Science student at UCL. Lunatic by day, developer all night. Outspoken, but all views are my own. InfoSec is fun. No Node.js on desktop please thanks.