Making network authentication simple in a Bring Your Own Device environment
Here at ViaRezo, our job is to offer a high-speed, affordable and reliable Internet connection to the students of CentraleSupélec at Paris-Saclay. We are a non-profit student organization providing Internet access to ~2000 people living on campus. We manage all the active network equipment, and users simply have access to RJ45 wall sockets and shared Wi-Fi access points. Last year, as we were leaving our historic campus and moving into a freshly built one at Paris-Saclay, we set out to build a more modern and robust network infrastructure for our users.
In this article, we explore in depth the challenges we faced regarding compatibility, security, and user experience, and the solutions we came up with. We explain how we combined 802.1X authentication (wired & wireless) and per-subscriber VLANs to offer our users a quality Internet experience.
Our main objectives were:
- reliable authentication: only people with a currently valid subscription should have Internet access
- optimal user experience: we want to make the initial connection process as simple and transparent as possible for end users
- traceability: we must have a reliable way of matching public IPs to users so that we can respond appropriately to possible law enforcement requests
Unlike many enterprise networks, we have absolutely no control over the machines that are plugged into the network. That means that we do not provision those devices, and our users expect that their laptop, their connected light bulb or their game console will work with our wired and wireless networks. Therefore, our authentication system must be compatible with all possible devices and OS versions, with as little manual configuration as possible.
Our research on the best ways to authenticate users on both a wired and wireless network turned up numerous documentation pages, yet remarkably few accounts of recent deployment experiences. This is our humble contribution.
The old-fashioned solution to wired authentication
Twenty years ago, authenticating users on a wired network was a very different task from today’s. At that time, there was little we could do other than identifying our subscribers by their MAC address. So here is a brief recap of how wired authentication used to work at ViaRezo.
Our predecessors had created a website where people could enter their MAC address. A script would then run periodically on the DHCP server to sync its whitelisted MAC addresses with those of the website, and the DHCP server would provide an IP address only to users who had entered their MAC address online. To prevent clever users from getting free Internet access by manually setting their IP address, the ISP would activate a switch feature generally referred to as IP source guard, which drops traffic from users whose IP address does not match one provided by the DHCP server.
This infrastructure works. Yet it has many shortcomings, the most notable ones being:
- poor user experience: it is painful for a regular user to have to find his MAC address and add it to a website, most likely via mobile networks as he ironically does not have Internet to access the website
- sync delay: duplicating the information between the DHCP and the website requires syncs, typically every 5 minutes or so. Thus, after adding their MAC on the website, users will not be able to connect immediately, which is frustrating
This certainly was an acceptable solution at the beginning of the twenty-first century. Yet in 2017, we thought we could do better.
Our first attempt at wired authentication
As we moved into our new campus, we started engineering a more effective solution. We wanted to avoid the use of DHCP MAC filtering altogether, so we decided to use 802.1X authentication for wired connections. (802.1X is a protocol commonly used for enterprise wireless networks — but its use for wired networks is a lot less frequent.)
In practice, we imagined that when a user plugged in a new wired device for the first time, his OS would prompt him for his username and password. If he entered them correctly, the switch would activate his port, the DHCP server would provide him with an IP address and he would be connected. It would just work. Or so we thought.
Wired 802.1X in brief
In case you are not familiar with the way 802.1X works, or if you’d like a refresher, here is what happens when someone plugs an RJ45 cable into his computer:
- the switch and the computer exchange EAP packets, until the computer sends a packet containing username and password
- the switch sends the username and password to a RADIUS authentication server
- the RADIUS server tells the switch whether or not to authorize the connection
- if the connection is authorized, the switch allows traffic through the port and the computer sends the usual DHCPDISCOVER message
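The exchange above can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: the `RadiusServer` and `Switch` classes, their methods and the sample accounts are hypothetical, and the real EAP exchange involves several round trips that are collapsed here into a single call.

```python
from dataclasses import dataclass, field


@dataclass
class RadiusServer:
    """Toy RADIUS server holding a username -> password table (illustration only)."""
    accounts: dict

    def access_request(self, username: str, password: str) -> bool:
        # A real server would verify a hash, never compare clear-text passwords.
        return self.accounts.get(username) == password


@dataclass
class Switch:
    """Toy switch: a port stays closed until RADIUS accepts the credentials."""
    radius: RadiusServer
    open_ports: set = field(default_factory=set)

    def plug_in(self, port: int, username: str, password: str) -> bool:
        # Credentials collected over EAP are forwarded to the RADIUS server,
        # which answers with an accept or a reject.
        accepted = self.radius.access_request(username, password)
        if accepted:
            # Traffic is now allowed, starting with the device's DHCPDISCOVER.
            self.open_ports.add(port)
        return accepted

    def forwards_traffic(self, port: int) -> bool:
        return port in self.open_ports


switch = Switch(RadiusServer({"bob": "correct horse"}))
print(switch.plug_in(1, "bob", "correct horse"))  # valid credentials
print(switch.plug_in(2, "eve", "guess"))          # unknown user
```

The point to notice is that the port itself is the enforcement point: until the RADIUS server accepts, not even DHCP traffic gets through.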
The first issue that arises with this solution pertains to the traceability constraint: how can we match users with the IPs given by the DHCP? Indeed, the DHCP does not know the username associated with each computer. It merely knows their MAC address, which isn’t of much help in identifying someone by name.
The case for per-subscriber VLANs
There are various ways of solving this problem. The solution we opted for was to use a different VLAN for each user. Indeed, compared to having shared VLANs for multiple subscribers, a per-user VLAN architecture has advantages unrelated to authentication, such as home-sharing features, reduced broadcast traffic, and no broadcast-based attacks. It offers users a home-like experience: even when they are at the other end of the campus, working in a public area where we provide Wi-Fi, they are still on their own virtual network. They can therefore send a document to their connected printer, or stream music from iTunes using Home Sharing, while being hundreds of meters away from their apartment.
NB: Per the spec, the maximum number of VLANs on a network is 4094. In our case, it is not problematic, as our subscriber count should not exceed this limit in the foreseeable future. However, if we were to exceed it, we would need to imagine different solutions.
How dynamic VLAN assignment works
When the RADIUS server replies to the switch, it can include attributes carrying a VLAN ID (the standard ones, described in RFC 3580, are Tunnel-Type, Tunnel-Medium-Type and Tunnel-Private-Group-ID). On receiving this information, the switch dynamically assigns that VLAN to the port, untagged.
As we obviously do not want to run one DHCP server per subscriber, we enable a service called DHCP relay agent on all subscriber VLANs. This instructs the gateway of each subscriber VLAN to forward DHCP packets to our DHCP server, located on a separate VLAN. Then, to decide from which subnet to make an IP address offer, the DHCP server examines the packet’s giaddr field. It is set dynamically by the DHCP relay agent and indicates the IP address of the gateway for the subnet the user is using. For example:
- a user, Bob, plugs his cable and enters his credentials
- the RADIUS authorizes the connection and assigns him VLAN 42
- the network configuration is such that VLAN 42 corresponds to subnet 10.0.42.0/24, and the gateway IP address of this subnet is 10.0.42.1
- the user’s computer broadcasts a DHCPDISCOVER packet on the network
- the gateway (10.0.42.1) receives it, and the DHCP relay agent forwards it to the DHCP server after adding giaddr: 10.0.42.1
- the DHCP server looks in its (completely static) configuration for a subnet that includes 10.0.42.1
- it then assigns any available IP address in the 10.0.42.0/24 address range
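The subnet selection performed by the DHCP server in the steps above can be sketched in a few lines of Python. This is not our DHCP server's actual logic, only a minimal model of it: the `SUBNETS` list, `select_subnet` and `first_free_address` are hypothetical names, and the one-subnet-per-VLAN layout simply mirrors the Bob example.

```python
import ipaddress

# Hypothetical static configuration: one /24 subnet per subscriber VLAN.
SUBNETS = [ipaddress.ip_network(f"10.0.{vlan}.0/24") for vlan in range(1, 255)]


def select_subnet(giaddr: str):
    """Return the configured subnet that contains the relay agent address."""
    relay = ipaddress.ip_address(giaddr)
    for subnet in SUBNETS:
        if relay in subnet:
            return subnet
    raise LookupError(f"no subnet configured for giaddr {giaddr}")


def first_free_address(subnet, leased, gateway):
    """Pick the first host address that is neither the gateway nor already leased."""
    for host in subnet.hosts():
        if host != gateway and host not in leased:
            return host
    raise LookupError(f"subnet {subnet} is exhausted")


gateway = ipaddress.ip_address("10.0.42.1")
subnet = select_subnet("10.0.42.1")          # giaddr added by the relay agent
offer = first_free_address(subnet, leased=set(), gateway=gateway)
print(subnet, offer)
```

The DHCP server never needs to know who Bob is: the giaddr value alone tells it which subscriber subnet to draw from.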
The traceability issue is therefore solved: we now know that all IPs in the 10.0.42.0/24 range are Bob’s IPs. Obviously, the key point here is to always assign the same VLAN to each user, and to have a mathematical formula that translates IPs into VLAN IDs and vice versa.
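As an illustration, here is one possible formula, written in Python. It is an assumption made for this sketch, not necessarily the mapping we deployed: the VLAN ID is spread over the second and third octets of a 10.0.0.0/8 address, which agrees with the VLAN 42 / 10.0.42.0/24 example and covers all 4094 usable VLAN IDs. The function names are hypothetical.

```python
import ipaddress

MAX_VLAN = 4094  # highest usable 802.1Q VLAN ID


def vlan_to_subnet(vlan: int):
    """Map a VLAN ID to its /24 subnet, e.g. 42 -> 10.0.42.0/24."""
    if not 1 <= vlan <= MAX_VLAN:
        raise ValueError(f"VLAN ID out of range: {vlan}")
    return ipaddress.ip_network(f"10.{vlan // 256}.{vlan % 256}.0/24")


def ip_to_vlan(ip: str) -> int:
    """Inverse mapping: recover the VLAN ID (hence the subscriber) from an IP."""
    octets = ipaddress.ip_address(ip).packed  # the four address bytes
    if octets[0] != 10:
        raise ValueError(f"{ip} is not in the subscriber address space")
    vlan = octets[1] * 256 + octets[2]
    if not 1 <= vlan <= MAX_VLAN:
        raise ValueError(f"{ip} does not map to a valid VLAN ID")
    return vlan


print(vlan_to_subnet(42))        # Bob's subnet
print(ip_to_vlan("10.0.42.17"))  # back to Bob's VLAN
```

Because the mapping is purely arithmetic, no lookup table has to be kept in sync between the RADIUS server, the DHCP server and the logs.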
Interfacing our RADIUS with our subscription system
On our FreeRADIUS server, we use the rlm_python module to delegate most of the authentication logic to a custom Python script. This improves readability, as we prefer Python to Unlang (FreeRADIUS’ processing language), and it allows us to use absolutely any authentication back-end.
Using a REST API, our Python script communicates with our subscription management system to check that both the supplied credentials and the corresponding subscription are valid.
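A minimal sketch of such a check is shown below. Everything specific in it is an assumption made for illustration: the `API_URL` endpoint, the `credentials_valid` and `subscription_active` response fields, and the function names are hypothetical and do not describe our real API.

```python
import json
import urllib.request

# Hypothetical endpoint; the real API is not described in this article.
API_URL = "https://subscriptions.example.org/api/check"


def post_json(url: str, payload: dict) -> dict:
    """POST a JSON document and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)


def is_allowed(username: str, password: str, transport=post_json) -> bool:
    """Accept only if the credentials AND the subscription are both valid."""
    try:
        answer = transport(API_URL, {"username": username, "password": password})
    except OSError:
        # Network errors (including urllib's URLError) fail closed.
        return False
    return bool(answer.get("credentials_valid")) and bool(answer.get("subscription_active"))


def fake_transport(url, payload):
    """Stand-in back-end used to exercise the logic without any network call."""
    return {"credentials_valid": payload["password"] == "s3cret",
            "subscription_active": True}


print(is_allowed("bob", "s3cret", transport=fake_transport))
print(is_allowed("bob", "oops", transport=fake_transport))
```

Passing the transport as a parameter keeps the decision logic testable on its own, and failing closed means an unreachable back-end never grants access.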
The compatibility issue
At this stage, our solution seemed to work on paper. And actually, it worked more than just on paper: we tested it successfully!
Yet one detail is of interest: our successful testing was done using Linux and macOS computers. Our subscribers, however, are far more likely to use Windows. And suddenly, matters get more complicated, as Windows is rather choosy about the authentication protocols it supports.
While 802.1X for wired connections works out of the box on macOS, it is unnecessarily painful to activate by hand on Windows devices. Even on Windows 10, it requires more than fifteen (!) clicks in obscure areas of the network manager. So much for the “plug-and-play experience” we wanted to offer our users.
But this is just the tip of the iceberg. We would soon find out that Windows 7 only supports two EAP authentication protocols, neither of which we planned to implement.
Brief recap on EAP protocols
Before digging further, here is a short recap of EAP, the authentication framework on which 802.1X is based. There are several variants of EAP, and each one defines a different way to exchange credentials between a device and a RADIUS server, with the switch relaying the messages. The most common ones are:
- EAP-MD5: requires the server to store passwords in plain text. No thanks.
- EAP-MSCHAPv2: requires the server to store passwords as MD4 hashes (sometimes called NT hash), even though MD4 has been broken since 1995. Today, generating a collision for MD4 can take as little as a few microseconds, and a partial preimage attack on MD4 was published a decade ago. So we certainly will not store our subscribers’ passwords as MD4 hashes, as it is possible to get our users’ original passwords from those hashes, and many of those passwords are likely used for other online accounts. The MSCHAPv2 protocol also has obvious flaws — just taking a quick look at the RFC shows that anyone gaining access to the hashes can impersonate whoever he wants on the network. Indeed, hashes are generated on the computer before being sent to the switch and then to the RADIUS server, so an attacker does not even need to find a collision for a user’s hash: he can simply pass the hash.
- EAP-GTC and PAP: passwords can be intercepted in transit, but these protocols work with all types of hashes stored on the RADIUS side
- EAP-TLS: uses certificates instead of usernames and passwords. Secure, but every end-user device must be provisioned with its own certificate, which is cumbersome outside of an enterprise context.
So is there no secure protocol using username/password credentials? Actually, there are two! EAP-TTLS and PEAP are tunneling protocols: they wrap the credential exchange in a TLS tunnel, much as HTTPS does for web traffic. Given all this, we decided to support EAP-TTLS or PEAP as outer tunnels, with EAP-GTC or PAP as inner methods. This allows us to store passwords with any hash algorithm we deem appropriate, like bcrypt, while still protecting passwords in transit.
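To see why an inner PAP or GTC exchange leaves the choice of hash open, consider what the server receives once the tunnel is decrypted: the clear-text password, which it can feed into whatever hash function was used at registration. The sketch below illustrates this; PBKDF2 stands in for bcrypt only because it ships with Python's standard library, and the function names and iteration count are assumptions for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password: str, salt: bytes = None) -> tuple:
    """Hash a password at registration time; returns (salt, digest)."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_inner_pap(cleartext: str, salt: bytes, stored: bytes) -> bool:
    """Check the password carried by PAP once the TLS tunnel is decrypted."""
    candidate = hashlib.pbkdf2_hmac("sha256", cleartext.encode("utf-8"), salt, ITERATIONS)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, stored)


salt, stored = hash_password("correct horse battery staple")
print(verify_inner_pap("correct horse battery staple", salt, stored))
print(verify_inner_pap("wrong guess", salt, stored))
```

A challenge-response protocol such as MSCHAPv2 cannot work this way: the server never sees the password, so it must hold the exact hash the protocol was designed around.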
We thus support four reasonably secure authentication protocols: EAP-TTLS/PAP, EAP-TTLS/GTC, PEAP/PAP and PEAP/GTC. The good news is that macOS, iOS and Android (for Wi-Fi authentication) all support at least one of these protocols! Yet we still have to deal with Windows.
For Windows 7, Microsoft chose to support only EAP-TLS and (P)EAP-MSCHAPv2 as authentication protocols. That is unfortunate: it leaves us to choose between a terrible user experience and a terribly weak hash algorithm. Microsoft eventually added support for EAP-TTLS/PAP in Windows 8/10, but we are hearing reports from users that the feature occasionally breaks with Windows upgrades.
Our second attempt
As we still have too many subscribers running Windows 7, we needed a solution for devices not supporting the EAP protocols we had chosen. Rather than implementing EAP-TLS or EAP-MSCHAPv2, which are both full of compromises, we opted for a third solution: falling back to MAC authentication in case a device cannot connect using 802.1X.
This possibility is generally referred to as MAC Authentication Bypass (MAB) or MAC-RADIUS authentication, and the connection flow is as follows:
- the user plugs in his cable
- the switch sends EAP requests to the device
- if the device is still not connected after a (configurable) delay, 60 seconds for example, it probably means that the device does not support the same EAP protocols as the RADIUS server, or does not support 802.1X at all. After this delay, the switch therefore sends the MAC address of the device directly to the RADIUS server, which may or may not authorize the connection
On the RADIUS side, detecting that MAC authentication is being used is straightforward: we just have to check whether the Service-Type attribute in the RADIUS request is set to Call-Check (at least that is true for Juniper and Cisco equipment). A typical FreeRADIUS configuration for the outer tunnel (for the packets not encapsulated in EAP-TTLS or PEAP) can therefore branch on this attribute.
If we detect a MAC authentication request, we call our custom Python script using the rlm_python module. (If the usual login/password method is used, the eap module is called to handle the decryption, and the Python script will be called in the inner tunnel.)
We also adapted the authenticate Python script itself so that it handles MAC addresses.
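The following is a hedged sketch of what MAC handling in such a script might look like, not our production code. Several things are assumptions: the return codes are local stand-ins for the constants that FreeRADIUS exposes through its `radiusd` module, the MAC address is assumed to arrive in Calling-Station-Id, and the `REGISTERED_MACS` table and `normalize_mac` helper are hypothetical.

```python
# Stand-ins for the return codes FreeRADIUS exposes via its `radiusd` module;
# they are redefined here (as assumptions) so the sketch runs on its own.
RLM_MODULE_REJECT = 0
RLM_MODULE_OK = 2
RLM_MODULE_NOOP = 7

# Hypothetical registry: normalized MAC address -> subscriber VLAN.
REGISTERED_MACS = {"001122aabbcc": 42}


def normalize_mac(raw: str) -> str:
    """Drop separators and case so 00-11-22-AA-BB-CC matches 00:11:22:aa:bb:cc."""
    return "".join(c for c in raw.lower() if c in "0123456789abcdef")


def authenticate(p):
    """p is a tuple of (attribute, value) pairs describing the request."""
    request = dict(p)
    if request.get("Service-Type") != "Call-Check":
        # Not a MAC authentication request: nothing to do in this branch.
        return RLM_MODULE_NOOP
    mac = normalize_mac(request.get("Calling-Station-Id", ""))
    vlan = REGISTERED_MACS.get(mac)
    if vlan is None:
        return RLM_MODULE_REJECT
    # Reply attributes asking the switch to place the port in the user's VLAN.
    reply = (
        ("Tunnel-Type", "VLAN"),
        ("Tunnel-Medium-Type", "IEEE-802"),
        ("Tunnel-Private-Group-Id", str(vlan)),
    )
    return RLM_MODULE_OK, reply, ()


mab_request = (("Service-Type", "Call-Check"),
               ("Calling-Station-Id", "00-11-22-AA-BB-CC"))
print(authenticate(mab_request))
```

Normalizing the address before the lookup matters in practice, since switch vendors format MAC addresses differently.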
So we deployed this solution, and it worked… sometimes. But it had major issues:
- UX: even the good folks running Windows 7 do not want to wait 60 seconds each time they plug in an RJ45 cable
- some devices actually stop sending DHCPDISCOVER packets after a delay, so when the switch finally opens the port after 60 seconds, the device does not get an IP address
- after another configurable delay (typically 60 minutes), the device will lose the connection while the switch waits for 802.1X authentication, until it falls back to MAC authentication again
Wait a minute, why not just lower the delay to a more acceptable value, like 5 seconds? Simply because once the switch has started MAB, it no longer allows 802.1X authentication. So if a user with an 802.1X-compatible device takes more than 5 seconds to enter his credentials (and he will, at least until they are in his keychain), 802.1X authentication will fail. We tried to find a happy medium between a short timespan to enter 802.1X credentials and a long wait time for MAB, but we were never fully satisfied.
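The trade-off can be made concrete with a tiny model of the 802.1X-first flow. It is a deliberate simplification: the function name is hypothetical, the 20-second typing time is an arbitrary example, and an 802.1X device is assumed to have no registered MAC address to fall back on.

```python
from typing import Optional


def connect_time_dot1x_first(supports_dot1x: bool, credential_delay: float,
                             mab_timeout: float) -> Optional[float]:
    """Seconds until the port opens, or None if authentication fails.

    Model: the switch waits `mab_timeout` seconds for 802.1X, then switches
    to MAB for good and no longer accepts 802.1X credentials.
    """
    if supports_dot1x:
        # The user must finish typing before the switch gives up on 802.1X.
        return credential_delay if credential_delay <= mab_timeout else None
    # Incompatible devices always sit through the full timeout before MAB.
    return mab_timeout


for timeout in (5, 60):
    dot1x_user = connect_time_dot1x_first(True, credential_delay=20, mab_timeout=timeout)
    legacy_user = connect_time_dot1x_first(False, credential_delay=0, mab_timeout=timeout)
    print(timeout, dot1x_user, legacy_user)
```

With a short timeout the 802.1X user who needs 20 seconds to type fails outright, and with a long one the legacy device waits the full minute: no single value serves both.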
Our third attempt
Eventually, we realized that we could approach this problem the other way round: why not start with a MAC authentication attempt, and if it fails, allow for 802.1X authentication?
Honestly, I am not sure why we didn’t come up with this solution from the outset, except that we misled ourselves by casually calling it MAC fallback.
We tried this and it works amazingly well. From our tests, the MAC authentication attempt only adds about 2 seconds to the 802.1X authentication process, which is acceptable. So our 802.1X users can connect (almost) immediately, and so can our users with 802.1X-incompatible devices. Hurrah!
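A small model of the MAC-first ordering shows why it removes the dilemma. It is a sketch under assumptions: the function name is hypothetical, the 2-second constant is the rough overhead we measured, and a failed MAC attempt is assumed to leave 802.1X available with no deadline.

```python
from typing import Optional

MAB_ATTEMPT_SECONDS = 2.0  # approximate overhead of the initial MAC attempt


def connect_time_mab_first(mac_registered: bool, supports_dot1x: bool,
                           credential_delay: float) -> Optional[float]:
    """Seconds until the port opens when MAC authentication is tried first."""
    if mac_registered:
        # Registered legacy devices are accepted on the very first attempt.
        return MAB_ATTEMPT_SECONDS
    if supports_dot1x:
        # The MAC attempt fails quickly, then 802.1X proceeds at the user's pace.
        return MAB_ATTEMPT_SECONDS + credential_delay
    # Neither a registered MAC nor 802.1X support: no access.
    return None


print(connect_time_mab_first(True, False, credential_delay=0))    # legacy device
print(connect_time_mab_first(False, True, credential_delay=20))   # slow typist
print(connect_time_mab_first(False, False, credential_delay=0))   # unknown device
```

Neither path depends on a timeout any more: the legacy device connects after the short MAC attempt, and the 802.1X user can take as long as needed to type.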
What about wireless authentication?
The same RADIUS server is used for Wi-Fi authentication, and we face the same EAP compatibility issues. macOS, Linux, iOS and Android all work great out of the box, while Windows devices need to be provisioned, or obscure settings need to be tinkered with.
For incompatible devices, MAC-Radius authentication is also the way to go. Depending on how your Wi-Fi provider implemented it, you may have to create a separate SSID for incompatible devices.
In our case, we use Unifi APs, and the manufacturer eventually enabled MAC-Radius authentication with dynamic VLAN assignment in January 2018. We thus have a WPA2 SSID with the usual login/password prompt for most devices, and another, less secure SSID (without WPA2) that uses MAC-Radius authentication.
If you cannot implement wireless MAC-Radius authentication on your network, you have to resort to the more conventional guest portal, and users authenticated via the guest portal will not be on a dynamically assigned VLAN.
All things considered, how well does our infrastructure perform?
The connection flow of wired 802.1X is excellent on macOS, yet unnecessarily complex on Windows 8/10. We very much hope that Microsoft enables wired 802.1X with EAP-TTLS/PAP by default, so that we can provide a true plug-and-play experience. In the meantime, we are developing an executable that users will download to automatically configure network settings. (For wireless connections, again, the connection process is smooth for most operating systems except Windows 7/8/10.)
Provided you have a means of provisioning Windows devices, the connection flow should be excellent on all modern operating systems. And MAC-authentication can be used as a last resort for devices that remain problematic.
We tried to make the best compromise between security and compatibility. By not using MSCHAPv2, we chose to limit the number of compatible devices. In return, we can hash our users’ passwords with modern and secure algorithms. The compatibility costs will be partly compensated by provisioning Windows 8/10 devices.
Devices that don’t support our authentication protocols, such as Windows 7 laptops, can use our MAC-authentication system. This poses two security risks that could be exploited by an on-campus attacker:
- a reasonably skilled attacker can impersonate someone else by changing his own MAC address (however, he won’t be able to impersonate just anyone, as he will need to find a user who actually owns an 802.1X-incompatible device and has registered a MAC address online)
- traffic from users using wireless MAC-authentication can be intercepted, as it is unencrypted between the device and the Wi-Fi AP
Windows 7 users thus benefit from fewer security features than if we were supporting MSCHAPv2. But we are convinced that this is in the general interest, as a small (< 5%) and decreasing part of our user base should not compromise the security of most users. Besides, our users are generally well-intentioned, and our priority is to mitigate threats coming from the outside world (such as password theft), even if it requires compromises about the possibility of on-campus attacks.
To further reduce the risks, we inform Windows 7 users of the security implications of MAC-based authentication and suggest they take action. When they register a MAC address online, they thus receive a warning encouraging them to upgrade their system if possible, or use a VPN.
In hindsight, we believe that combining 802.1X authentication and dynamic VLAN assignment was the right call. The added benefits of per-subscriber VLANs, such as home sharing, are appreciated and praised by our users. Yet in 2018, the support of the different EAP variants used by 802.1X is still heterogeneous. Thus, while the connection process is straightforward on most devices, it is not yet optimal for Windows. We do hope, however, that Microsoft will improve 802.1X compatibility in Windows. It could become as simple as it is on other platforms, allowing everyone to experience the magical “it just works” feeling.
Interested in what we do? Follow us on Twitter!
Special thanks to my team of reviewers for their kind words and suggestions.