My first SecAdvent post told the story of my journey from SysAdmin to InfoSec, and I’m wrapping up this year’s SecAdvent with a post about one of the more interesting projects I worked on with the Elastic InfoSec team – replacing our VPN with OAuth2.
One of the first access control tools we deployed for Elastic’s infosec team was a VPN. We wanted a way to protect our services from unauthorized access, including any snooping or poking around. We chose OpenVPN out of familiarity, and also because it met our needs. To harden things a bit more, we wanted to add two-factor authentication, and the shortest path we found was to upgrade to OpenVPN Access Server. All our team’s services were deployed on Google Cloud and were protected by firewalls which only allowed network access from the VPN’s IP address. This solution met our safety needs, but we ran into a few maintenance problems.
First, deployment.
Kubernetes runs all of our services, and this worked pretty well for everything except the VPN. VPN software typically requires root/administrator access in order to manage network interfaces, and I never found a way to run the OpenVPN server successfully on Kubernetes. Even if I could have deployed it on Kubernetes, the OpenVPN Access Server license wouldn’t allow it, as it binds the license key to the machine where it is activated.
Second, user management.
Managing users was manual, and helping folks reset passwords or adding new accounts was a bit tedious. We had dreams of making our infosec services like log collection available to other teams, and with thousands of employees, manual user management was a no-go for me. Further, because we had to run the VPN server outside of Kubernetes, we were stuck managing this one lonely system differently than everything else.
Third, user burden.
A VPN client must be installed on every user’s device in order to connect to the VPN. This adds cognitive load for every user, challenges us to provide useful documentation for three operating systems, and adds support load as folks have problems with it. A frequent support request was “Is the log cluster down? I can’t reach it” only to have us determine that this person had simply forgotten to activate the VPN client.
This burden goes against my belief that safety should be the default and easy path, and with VPN clients being so different across platforms and vendors, it’s hard to become familiar with using one.
Lastly, it was an anomaly.
Elastic is a geographically distributed company with no central corporate network, and most other teams didn’t actually use a VPN for access control to their own services. Instead, they relied on our identity provider, Okta, to perform that access control. Wiki? HR systems? Salesforce? Gmail? All of that used Okta as the identity provider and access control system. Access to Jenkins routed you through GitHub. Many more systems were configured the same way.
Do we even need a VPN?
I wondered: did we even need a VPN? VPN-protected services were completely inaccessible unless you were connected to the VPN, which helped ensure that only authorized users could access those services. At the time, all of our services were web-based. Could we provide these same capabilities and also reduce the burdens of our VPN? You bet!
One of our internal teams already had success with a solution: oauth2-proxy, using GitHub as the OAuth2 provider, to ensure that only employees, or even only specific teams, could access certain internal services. oauth2-proxy is a reverse proxy for protecting HTTP(S) services by using OAuth2 to authenticate and authorize requests. You put oauth2-proxy in front of your service, configure an OAuth2 provider, and you’re done: only authorized requests are sent to the protected service.
There are other solutions like oauth2-proxy, such as Google Cloud’s Identity-Aware Proxy and Okta’s Access Gateway. At the time, I don’t think Okta Access Gateway was available. Google’s IAP looked great in the documentation, but after some discussion with their support team, we learned it didn’t support TLS to backend services in their Kubernetes integration. Some of that may have changed since early 2019 when we started this project.
We started using oauth2-proxy and also added nginx to the mix. Why nginx? First, familiarity as a proxy and HTTP router. Second, we would have a number of backend services protected by this, so it made sense to use nginx to route different virtual hosts while still requiring all requests be authenticated with oauth2-proxy. Third, nginx has some great logging capabilities, including writing logs in machine-consumable JSON format, which would help us monitor and audit activity on our protected services.
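To give a feel for the logging piece, here is a minimal sketch of JSON access logs in nginx. It is not our exact configuration: the format name `json_audit`, the log path, and the choice of fields are placeholders I picked for illustration. It relies on the `escape=json` parameter of `log_format`, which needs a reasonably recent nginx.

```nginx
# Goes in the http {} context.
# "json_audit" and the log path are illustrative names, not from the original setup.
log_format json_audit escape=json
  '{'
    '"time":"$time_iso8601",'
    '"remote_addr":"$remote_addr",'
    '"host":"$host",'
    '"method":"$request_method",'
    '"uri":"$request_uri",'
    '"status":"$status",'
    '"bytes_sent":"$body_bytes_sent",'
    '"request_time":"$request_time",'
    '"user_agent":"$http_user_agent"'
  '}';

# Write one JSON object per request, ready for a log shipper to pick up.
access_log /var/log/nginx/access.json json_audit;
```

Every value is emitted as a quoted string to keep each line valid JSON even when a variable is empty; convert numeric fields downstream if you need them typed.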
This proxy gave us a way to authorize and authenticate requests, but I still needed a way to protect the backend services, especially if those services had public addresses. In my research, I found a few different ways to achieve this.
When using a VPN to protect services, I would configure those services with a firewall to only allow access from the VPN IP. You could do the same with this proxy solution, but there are other ways that might work better, especially in scenarios where you might not have permanent IP addresses: HTTP Basic Auth (oauth2-proxy’s --basic-auth-password) and SSL certificates. For example, a protected service could be configured to trust only a single SSL certificate, which nginx or oauth2-proxy presents when it makes requests to that service.
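As a rough sketch of the certificate approach, assuming nginx is the hop that talks to the backend, the proxy can present a client certificate on every upstream request. The hostname, port, and file paths below are hypothetical, and the backend itself still has to be configured to require and trust only this certificate, which depends on the service you are protecting.

```nginx
# Inside the server {} block for one protected service.
# Hostname, port, and certificate paths are placeholders.
location / {
    proxy_pass https://logs-backend.example.internal:9243;

    # Client certificate and key that nginx presents to the backend.
    proxy_ssl_certificate     /etc/nginx/tls/proxy-client.crt;
    proxy_ssl_certificate_key /etc/nginx/tls/proxy-client.key;

    # Also verify the backend's own certificate against a known CA.
    proxy_ssl_verify              on;
    proxy_ssl_trusted_certificate /etc/nginx/tls/backend-ca.crt;
}
```

With this in place, a request that reaches the backend directly, without going through the proxy, has no certificate to present and can be rejected.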
Our implementation with nginx and oauth2-proxy is:
- Nginx configured with multiple virtual hosts, one for each protected service.
- Nginx configured to use oauth2-proxy using auth_request.
- oauth2-proxy using OpenID Connect (OIDC) with Okta.
- oauth2-proxy using --skip-provider-button to skip the landing page and make logins faster.
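The nginx side of the list above can be sketched roughly as follows. This follows the auth_request pattern described in the oauth2-proxy documentation rather than reproducing our production config; the server name, certificate paths, and backend address are made up, and it assumes oauth2-proxy is listening on 127.0.0.1:4180 and that nginx was built with the auth_request module.

```nginx
# One server block per protected service; all names and paths are placeholders.
server {
    listen 443 ssl;
    server_name logs.example.internal;

    ssl_certificate     /etc/nginx/tls/logs.crt;
    ssl_certificate_key /etc/nginx/tls/logs.key;

    # Sign-in, callback, and sign-out endpoints are served by oauth2-proxy.
    location /oauth2/ {
        proxy_pass       http://127.0.0.1:4180;
        proxy_set_header Host                    $host;
        proxy_set_header X-Real-IP               $remote_addr;
        proxy_set_header X-Auth-Request-Redirect $request_uri;
    }

    # Subrequest target: oauth2-proxy answers 202 for a valid session, 401 otherwise.
    location = /oauth2/auth {
        proxy_pass       http://127.0.0.1:4180;
        proxy_set_header Host           $host;
        proxy_set_header X-Real-IP      $remote_addr;
        proxy_set_header Content-Length "";
        proxy_pass_request_body off;
    }

    # Everything else must pass the auth subrequest before reaching the backend.
    location / {
        auth_request /oauth2/auth;
        error_page 401 = /oauth2/sign_in;
        proxy_pass https://logs-backend.example.internal:9243;
    }
}
```

Adding another protected service is then a matter of copying the server block with a different `server_name` and `proxy_pass` target, while the same oauth2-proxy instance handles authentication for all of them.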
By using oauth2-proxy as our access control system, we are replacing a network boundary provided by the VPN. This means there is no more “behind the VPN” concept, and it changes how you might deploy services. There is great prior work on Zero Trust networking that is worth reading. Google’s effort in this area is called BeyondCorp. Your cloud or identity providers might already have solutions for this scenario, such as Google Cloud’s Identity-Aware Proxy and Okta’s Access Gateway.
Transitioning from VPN to oauth2-proxy allowed us to protect against the same kinds of threats while also making some great improvements. We could keep network-level access control and two-factor authentication. For users, there is no extra software to install or support, and they can keep using the familiar web-based authentication flow. For me, as an operator, maintenance is much easier, as this solution deploys neatly on Kubernetes (like the rest of our services) and allowed us to shut down the lonely VPN server. Accounts and group membership now come from Okta and are managed by the IT team.
Does this interest you? Exploring oauth2-proxy is something you can probably do as an experiment. One nice property of these proxy solutions (oauth2-proxy, Google IAP, etc.) is that your protected service doesn’t need any special integration. The proxy takes care of authentication, and your service doesn’t even need to know it’s being protected. If you have a GitHub account, you can get started on your workstation (localhost!) using GitHub as your OAuth2 provider. The oauth2-proxy documentation gives you the steps needed to set this up.