
What's OAuth2 Anyway?
Have you ever logged into a website using your Google or Facebook account? Or connected an app to access your GitHub data? If so, you’ve already used OAuth2, whether you knew it or not.
OAuth2 is the world’s most popular, extensible authorization framework. It allows you to integrate a couple of systems together by delegating access to your data from one service to another. But here is the thing - most people don’t really understand how OAuth2 works.
Personally, I’ve implemented several applications that use OAuth2. The process was so straightforward that I had no need to stop and think about the protocol itself along the way. That’s by design. OAuth2 is built to make client applications simple to implement, so that their developers don’t have to wrestle with complex authorization requirements.
But if we pause and dig deeper, there’s a lot to learn from the software engineering point of view.
In this article, we will uncover the “whys” behind the OAuth2 protocol design and break down the most common authorization grants.
Background
It’s helpful to start with the historical context of the problem that OAuth2 was created to solve and consider alternatives we’d have without it.
Imagine we want to build a user-friendly deployment platform like Fly.io or Vercel. Right away, we hit the key problem: how can our customers import their code into our platform?
These days, almost everyone uses Git. We could try building a Git hosting functionality directly into the platform, but that’s a huge piece of work, while our primary business goal is resource management, autoscaling, load balancing, etc. On top of that, most of our customers are probably already using one of the existing popular Git hosting services like GitHub, GitLab, or Bitbucket. Unfortunately, we don’t have any way to convince these platforms to integrate with us.
So, what are our options? How could we possibly get access to our customers’ Git repositories hosted elsewhere?
User Credentials Sharing
Our customers log into their Git hosting services using their credentials. Why can’t they just share their credentials with us?
We could store their credentials securely and then, when needed, log in to the Git service on their behalf, use their session cookies, and fetch the required Git repositories.
At first glance, this sounds like a straightforward idea to let our platform work with customer data, even when they are not around.
But then we realize, it’s riddled with problems:
- No access control. The platform gets full access to everything that our customers can do without a way to limit or control that, even if we only need to access their repositories.
- No session control. It’s hard to distinguish between sessions created by the users and those initialized by the platform. If the login process is the same for both, it’s hard to implement more advanced login security measures like MFA.
- Hard to revoke. Once shared, the credentials can be cached and leaked in unexpected ways even if you removed them from the platform UI. The only way to fully revoke access is to change your password.
- Brittle. If you change your password, this would effectively break the platform’s access to your data.
- Security risks. The platform must store the credentials securely, which is a significant responsibility. If the shared credentials are managed in a sloppy way, they may be breached and expose the whole customer account.
The underlying problem is that one and the same set of credentials, with broad, top-level permissions, is used for two vastly different purposes. What could we do about that?
Personal Access Tokens
Apparently, if we want to do any better, we need to keep the main credentials private. Instead, we could introduce an alternative type of credentials just for use in integrations.
Let’s call these Personal Access Tokens (PATs). Think of them as a static secret string with a relatively long lifespan. Technically, each PAT could have a custom set of permissions assigned to it, limiting what the platform can do with the associated data.
Whenever a customer wants to integrate Git repositories with a new service, like our deployment platform, they would generate a new personal access token with the necessary permissions and share it with the service.
This approach is a great improvement over sharing the plain credentials, since it addresses its major problems. However, there are still a couple of things to keep in mind:
- Keeping track of expiration dates and replacing stale tokens gets tedious very quickly if you need to manage more than a handful of tokens.
- To minimize the management burden, token lifetime could be extended (we are talking about months or even years). Unfortunately, in that case, if a token gets compromised, malicious actors will have plenty of time to exploit it.
But do customers really need to manage their tokens for every service and integration they use? Could we simplify this further, so customers will have to do as little as possible to enable new integrations and the whole token management process is automated by the party that needs it?
That’s exactly why we need something like the OAuth framework.
What’s OAuth2?
OAuth2 is a framework that defines how access or permissions are requested and delegated from an authoritative entity (like the user) to third-party applications.
The core idea behind OAuth2 is to give users the power to decide what applications (beyond those that are natively supported by the resource server) can access their data. It ensures that the access is controlled and convenient, allowing these applications to use your data whenever they need to, even when you’re not around, to extend the base functionality of the resource server. Let’s break it down a bit.
Without mechanisms like OAuth2, the resource server essentially controls which applications can access your data. I imagine this happens through partnerships, where two companies collaborate to integrate their services into each other’s offerings (often in a very custom, non-standardized way).
This approach is centralized because:
- Only the applications of approved partners are at play
- Everything else is effectively blocked.
OAuth2 introduces a middle ground, allowing the third-party applications to use the resource server, as long as users are willing to grant permissions to their data or functionality. In this model, the resource server decides nothing for end users (unless it’s blocking malicious applications to protect users from abuse).
This creates a powerful form of decentralization that:
- Lets resource owners extend the resource server’s functionality in a few clicks
- Helps build an ecosystem of tools and applications around the resource server
Roles
OAuth2 defines four main roles to organize the delegation process:
- Resource Server. This is a service that the client application needs to access, either on the user’s behalf or on its own. For example, in our case, it’s the Git hosting provider like GitHub.
- Resource Owner (the user if it’s a person). The entity that holds permissions to the resource server and can grant access to the client application.
- Authorization Server. This service issues resource access tokens for the client application in exchange for various forms of authorization or grants.
- Client Application (a.k.a. the client, OAuth application). An application or service that accesses the protected resource server, typically on behalf of the resource owner.
General Workflow
OAuth2 introduces the Authorization Server, which acts as a middleman between the Resource Owner (who has the authority) and the Client Application (that needs some of that authority). The Authorization Server is trusted by the target Resource Server (which provides some functionality based on the authority).
The Client Application asks the Resource Owner for a certain set of permissions to access the Resource Server.
The Resource Owner reviews the permission request and gives consent to grant access to the Client Application via the Authorization Server.
Depending on the authorization flow, the Client Application receives an authorization grant in some form and uses it to trade for an access token (or a pair of tokens) from the Authorization Server.
Finally, the Client Application uses the access token to access the Resource Server on behalf of the Resource Owner.
The Resource Server knows how to validate the access tokens issued by the Authorization Server, typically through an internal request to the Authorization Server.
Clients
The journey into the OAuth2 world begins with Client Applications. There are two types of Client Applications, categorized by their abilities to keep secrets:
- Public applications like in-browser JS applications, desktop or native mobile apps. Any secrets embedded in this type of application can be reverse-engineered and extracted, even if you try to obfuscate their distributions or encrypt them.
- Private (or confidential) applications, which are typically any web applications with frontend and, most importantly, backend parts. The backend is capable of securely storing secrets and establishing protected communication with the Authorization Server.
OAuth2 assumes there are many more Client Applications than Authorization and Resource Servers, so it aims to simplify the Client Application side as much as possible. This not only reduces the work needed to implement a Client Application, but also limits the opportunity for insecure Client implementations.
The heavy lifting of keeping the OAuth2 workflow secure is handled by the Authorization Servers.
Client Registration
To plug our Client Applications into the OAuth2 workflow, they first need to be registered with the Authorization Server.
OAuth2 doesn’t make any assumptions about how the registration process should work, but it’s typically a part of the OAuth2 provider website’s settings, e.g. functionality to create and manage OAuth apps.
The registration form usually includes:
- Redirect URL(s) - A list of allowed URLs for redirects in interactive authorization flows, such as the authorization code or implicit flows.
- Scopes - A list of permissions to the Resource Server’s functionality that the application may ask users to delegate, e.g. read Git repositories, create issues, etc.
- Miscellaneous information like application name, icon, privacy and terms of service URLs, etc.
There are other, less popular client registration approaches. For example, I’ve seen:
- Registration via internal admin requests to the Authorization Server, as in ORY Hydra.
- Declarative registration by creating Kubernetes Custom Resources in the cluster using ORY Hydra Maester.
Client Credentials
At the end of registration, you typically receive the client credentials:
- Client ID (a.k.a App ID) - a public, non-secret identifier of your Client Application.
- Client Secret - a secret password that the Client Application keeps privately.
The client credentials are used to:
- authenticate the Client Application requests to the Authorization Server
- bind a specific authorization flow to the Client Application that has started it. This ensures it’s not possible to finish that flow with a completely different Client Application
- add another layer of protection to authorization flows and the obtained refresh tokens, because you cannot leverage them unless you have the right client credentials
The Client Application sends the credentials as the Basic HTTP Authorization header.
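As a minimal sketch of how such a header value could be assembled (the function name and the sample credentials are made up for illustration): the client ID and secret are form-urlencoded, joined with a colon, and base64-encoded.

```python
import base64
from urllib.parse import quote_plus


def basic_auth_header(client_id: str, client_secret: str) -> str:
    """Build the value of the Authorization header for HTTP Basic auth."""
    # The OAuth2 spec asks clients to form-urlencode both parts first,
    # so special characters such as ':' in a secret cannot break the pair.
    pair = f"{quote_plus(client_id)}:{quote_plus(client_secret)}"
    encoded = base64.b64encode(pair.encode("utf-8")).decode("ascii")
    return f"Basic {encoded}"


# Hypothetical credentials, for illustration only.
header = basic_auth_header("my-client-id", "my-client-secret")
```

Base64 is an encoding, not encryption, so this header is only safe to send over TLS.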
The client ID is tied to authorization grants and refresh tokens, so it’s essential to keep it unchanged. Changing it would invalidate all authorizations (e.g. refresh tokens) that you’ve already obtained.
On the other hand, the client secret can, and should, be rotated periodically. Changing the secret has the effect of “rotating” all refresh tokens received by the Client Application, even though the tokens themselves are not changed. This is because if the client credentials were leaked along with some refresh tokens, malicious actors would not be able to obtain new access tokens using the old client secret after the client secret rotation.
This significantly simplifies the process of secret rotation as you need to rotate only one secret instead of rotating thousands of refresh tokens for each end user that has ever authorized your Client Application.
AuthZ Servers
The Authorization Server handles the delegation process via multiple authorization flows (including potentially non-standard flows) uniformly and as a result, it issues a special access token (or a pair of tokens). In order to do that, the Authorization Server needs to provide at least the following endpoints required by the OAuth2 protocol:
- Authorization endpoint that starts the interactive authorization code or implicit flows
- Token endpoint that generates access and refresh tokens and is used in pretty much all other authorization flows
- Device endpoint if you want to support the device flow
In practice, you may want to also have a bunch of others that are not defined in OAuth2 directly:
- Access token introspection endpoint (it has its own RFC) that returns metadata information associated with the given access token. It can be used by resource servers to validate incoming access tokens.
- Authorization grant revocation endpoint that allows revoking the whole authorization grant.
- Token revocation endpoint that allows revoking the issued access and refresh tokens.
- and a bunch of other endpoints introduced in follow-up RFCs and drafts, if you need them.
Endpoint Discovery
Historically, the Authorization Server’s OAuth2 endpoints were neither fixed nor discoverable. The endpoints were taken from the provider documentation and hardcoded in the OAuth2 libraries or Client Applications (here is an example from the goth library).
Security
The Authorization Server is represented as a separate component conceptually, but the protocol has no requirements on how it should be implemented under the hood. It could be either a separate microservice, or it can be a part of the Resource Server.
One important assumption that the OAuth2 protocol makes implicitly is that one Authorization Server can potentially handle authorizations for multiple Resource Servers. This means that among all OAuth2 components, the Authorization Servers are the rarest to implement. That’s why the protocol makes them responsible for handling most of the security nuances. Your OAuth2 setup is essentially as secure as your Authorization Server.
Access Tokens
The Authorization Server generates access tokens as a result of the successful authorization flow.
The access tokens are a special credential that serves as an alternative method of authentication for the Resource Server. They can also be seen as an abstraction around the exact authorization flow. There could be multiple authorization flows supported by the Authorization Server, but they all will result in access tokens that have the same format. This makes them easier to validate for the Resource Server, which doesn’t need to know much about how a specific token was obtained.
Access Token Scopes
The concept of an access token is also important because we can generate multiple access tokens, each with a different, reduced subset of the originally requested scopes. If there were no access tokens as a separate credential and we were using, let’s say, the authorization code for that purpose, it would carry all permission scopes requested by the client application at the time of the authorization flow.
Token Types
OAuth2 doesn’t define what access tokens should look like. They are opaque strings to the Client Applications and likely to the Resource Servers too.
Apart from that, when an access token is generated, the Authorization Server indicates what type of token was issued.
Bearer Tokens
The most common access token type is called the Bearer token.
In the wild, Authorization Servers may issue bearer tokens as:
- a unique random string. The string should be non-guessable and not possible to generate outside the Authorization Server.
- or as a self-contained JWT token that includes the signed meta information.
Other types of tokens are theoretically possible, but I have never seen them in the wild.
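To make the “bearer” idea concrete, here is a small sketch of how a client could attach such a token to a Resource Server request. The API URL and the token value are placeholders, and the request is only prepared, not sent.

```python
import urllib.request


def authorized_request(url: str, access_token: str) -> urllib.request.Request:
    """Prepare a GET request that carries the access token as a bearer token."""
    request = urllib.request.Request(url, method="GET")
    # Whoever holds ("bears") the token can use it: no extra proof is attached.
    request.add_header("Authorization", f"Bearer {access_token}")
    request.add_header("Accept", "application/json")
    return request


# Placeholder URL and token; urllib.request.urlopen(req) would send the request.
req = authorized_request("https://api.example.com/user", "gho_example_token")
```

The same header works whether the token is a random string or a JWT, since the client treats it as opaque either way.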
Token Lifetime
OAuth2 only requires Authorization Servers to generate access tokens. If that is all the server issues, the access token has to be long-lived, and that’s not great for two reasons:
- The access tokens are linked to the client application, but they are usually passed to the resource server without any additional proof of token possession. Hence, if they are leaked, the malicious actors would have enough time to exploit them.
- The access token is linked to the original access scopes and there is no way to generate a new access token with a subset of scopes without going through the whole authorization flow again.
To address these concerns, it’s best practice to keep access tokens short-lived. Along with that, the Authorization Server can issue a separate, long-lived token that is used to obtain fresh access tokens as needed. This type of token is called a refresh token.
Scopes
Authorization scopes are a set of functionalities that the Resource Owner delegates to the Client Application, allowing the Client to access resources as though it were the original owner.
The scopes are simply a space-separated list of strings, where each string specifies a particular access type.
The scope format is not defined in the OAuth2 protocol, but scopes are normally structured like this: `{resource}_{access level}`.
For example:
- `user_read` may allow the Client to read the current user’s (i.e. the Resource Owner’s) profile information
- `repo_write` may allow the Client Application to commit to the repositories to which the Resource Owner has access.
As you can see, the scopes are fairly coarse-grained: they don’t grant access to specific resources, but rather work on resource types and access levels (e.g. read/write/admin).
Scopes are additive, meaning when multiple scopes are requested, they are combined to broaden the Client Application’s or access token’s permissions.
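Because scopes are just a space-separated list that combines additively, a permission check reduces to a subset test. The sketch below illustrates that idea; the function names are made up and no particular Resource Server is implied to work this way.

```python
from __future__ import annotations


def parse_scopes(scope: str) -> frozenset[str]:
    """Turn a space-separated scope string into a set of scope names."""
    return frozenset(scope.split())


def is_allowed(granted: str, required: str) -> bool:
    """Return True when every required scope is among the granted ones."""
    return parse_scopes(required) <= parse_scopes(granted)


granted = "read_user write_repo read_repo"
can_read_repo = is_allowed(granted, "read_repo")
can_admin_org = is_allowed(granted, "admin_org")
```

Note that this only answers “may this token touch repositories at all”; which specific repositories are reachable is still decided by the Resource Owner’s own permissions.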
OAuth2 Flows
Authorization flows, also known as authorization grants, are how permissions are delegated to the Client Applications. Regardless of the flow you use, the end result is a set of access tokens that enable the Client Application to directly access the Resource Server.
The main differences between flows are:
- whether it’s interactive or not
- the number of participants involved (2- or 3-leg flows)
- whether it’s secure to use with public or confidential clients
Authorization Code
We’ll start by reviewing the most canonical and secure OAuth2 flow called the authorization code flow.
This flow is interactive and works for Client Applications that can keep secrets and perform browser redirects, typical for web services with a backend.
The flow consists of two stages:
- The authorization request, a redirect to the Authorization Server
- The authorization code exchange. Occurs at the client’s callback URL
The whole authorization code flow can be divided into two main parts:
- The interactions that happen indirectly between the Authorization Server and the Client Application using the browser as a mediator. These actions are performed via the frontend channel and can be potentially intercepted or manipulated along the way (e.g. a malicious browser extension may try to sniff the code parameters).
- The interactions that occur directly between the Authorization Server and the client via the trusted backend channel.
The authorization code flow is designed so that it’s not possible to get access delegation by using only information transmitted via the frontend channel.
Authorization Request
In order to start the OAuth2 flow, the client application needs to request the authorization with the needed scopes from the Resource Owner.
This happens by redirecting the resource owner to the authorization server’s `authorize` endpoint.
The authorization URL usually contains the following URL parameters:
HTTP/1.1 302 Found
Location: https://auth.example.com/authorize?response_type=code
&client_id=Iv23lilfdg920cAzhcxA
&redirect_uri=https://www.clientapp.com/callback/
&scope=read_user%20write_repo%20read_repo
&state=wSDOBWf0PAC9C7AENIRoCfHnDDSbr-XAbzFLG937m5_u12DkEjQvfj4UsOkVg0uVHZMVcXWcWFr6iQ7XLFopkw==
- The `response_type` defines what kind of interactive flow we are going to perform. It’s always `code` for the authorization code flow (or `token` for the implicit flow).
- The `client_id` is required as the authorization code is strictly assigned to the client application that has initialized the flow (to prohibit finishing the flow from another Client Application).
- The `redirect_uri` is the URL of the client application callback page where the authorization code will be passed after the authorization consent. This URL must be specified in the client registration settings.
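The authorization URL above could be assembled along these lines. This is a sketch rather than a reference implementation: the helper name is hypothetical, and the endpoint, client ID and redirect URL are the sample values from this article.

```python
import secrets
from urllib.parse import urlencode


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scopes):
    """Return the URL to redirect the user to, plus the state to remember."""
    # A fresh unguessable value per flow; keep it in the user's session
    # so it can be compared with the value echoed back to the callback.
    state = secrets.token_urlsafe(32)
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state,
    }
    return f"{authorize_endpoint}?{urlencode(params)}", state


url, state = build_authorization_url(
    "https://auth.example.com/authorize",
    "Iv23lilfdg920cAzhcxA",
    "https://www.clientapp.com/callback/",
    ["read_user", "write_repo", "read_repo"],
)
```

`urlencode` encodes the space between scopes as `+` rather than `%20`; both forms are decoded to a space in a query string.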
The Resource Owner’s browser should already have a user session (e.g. a session cookie) with the Authorization Server (otherwise the user is asked to log in first), so the redirect can leverage that to seamlessly show the authorization consent screen.
Let’s note that the client application communicates with the Authorization Server indirectly via HTTP redirects and the Resource Owner browser. This way the Client Application doesn’t have to know about the Resource Owner credentials or session which is itself the key problem the OAuth2 protocol was born to solve.
Because of that, the authorization consent page should not have any client-specific CORS configuration. This remains true for all OAuth2 flows.
Code Exchange
Once the Resource Owner approves the delegation of access to the Client Application, the Authorization Server redirects the Resource Owner back to the Client Application callback URL specified during the authorization request.
The client callback redirect looks like this:
HTTP/1.1 302 Found
Location: https://www.clientapp.com/callback/?code=AUTHORIZATION_CODE
&state=wSDOBWf0PAC9C7AENIRoCfHnDDSbr-XAbzFLG937m5_u12DkEjQvfj4UsOkVg0uVHZMVcXWcWFr6iQ7XLFopkw==
- The `code` parameter is called the authorization code (it gives this flow its name).
- The `state` is returned back if it was specified originally to let the Client Application verify the integrity of the flow.
The authorization code is a one-time-use token that represents the specific Resource Owner’s consent given to the specific Client Application. It is tied to the client ID that has obtained it, so it’s not possible to exchange it from another Client Application. So even if the code was leaked somehow, an attacker would need valid client credentials to turn it into access tokens.
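On the client side, the callback handler could look roughly like the sketch below. It assumes the `state` generated for the authorization request was kept in the user’s session; the function name is hypothetical.

```python
import hmac
from urllib.parse import parse_qs, urlsplit


def extract_code(callback_url: str, expected_state: str) -> str:
    """Validate the state echoed by the Authorization Server, return the code."""
    query = parse_qs(urlsplit(callback_url).query)
    returned_state = query.get("state", [""])[0]
    # Constant-time comparison; a mismatch means this callback does not
    # belong to a flow that this user session has started.
    if not hmac.compare_digest(returned_state.encode(), expected_state.encode()):
        raise ValueError("state mismatch")
    if "code" not in query:
        raise ValueError("no authorization code in callback")
    return query["code"][0]


code = extract_code(
    "https://www.clientapp.com/callback/?code=AUTHORIZATION_CODE&state=abc123",
    expected_state="abc123",
)
```

The check runs before the code is touched, so a forged callback is rejected without ever reaching the token endpoint.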
Finally, to finish the flow, we need to exchange the authorization code for the access tokens. This is done via the OAuth2 token endpoint:
POST /token HTTP/1.1
Host: auth.example.com
Authorization: Basic {{ base64(client_id:client_secret) }}
Accept: application/json
Content-Type: application/x-www-form-urlencoded
grant_type=authorization_code&code=SplxlOBeZQQYbYS6WxSbIA
&redirect_uri=https://www.clientapp.com/callback/
- The `grant_type` defines what kind of flow or grant we want to use to trade for access tokens. It’s a universal endpoint used in the other flows too, but in the case of this flow, it’s always going to be `authorization_code`.
- The `code` is mandatory to provide in the authorization code exchange.
- The `redirect_uri` is an additional security measure to use when multiple redirect URLs are allowlisted to prevent authorization code injections.
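The exchange request itself could be prepared as follows. This is a sketch using only the standard library: the helper name is hypothetical, the credentials are placeholders, and the request is built but not sent.

```python
import base64
import urllib.request
from urllib.parse import urlencode


def build_code_exchange_request(token_endpoint, client_id, client_secret,
                                code, redirect_uri):
    """Prepare the backend-channel POST that trades the code for tokens."""
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        # Must match the redirect_uri used in the authorization request.
        "redirect_uri": redirect_uri,
    }).encode("ascii")
    credentials = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        token_endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {credentials}",
            "Accept": "application/json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


exchange = build_code_exchange_request(
    "https://auth.example.com/token", "my-client-id", "my-client-secret",
    "SplxlOBeZQQYbYS6WxSbIA", "https://www.clientapp.com/callback/",
)
```

This call happens server to server, so the client secret never passes through the user’s browser.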
In response, if everything went fine, you would get a response like this:
{
"token_type": "bearer",
"access_token": "gho_16C7e42F292c6912E7710c838347Ae178B4a",
"scope": "read_user write_repo read_repo",
"expires_in": 3600,
"refresh_token": "ghr_16C7e42F292c6912E7710c838347Ae178"
}
That’s all. Now you need to persist the access and refresh tokens and use them to access the Resource Server.
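One way to persist the response is to convert the relative `expires_in` into an absolute timestamp at the moment the tokens arrive, so the client can later tell whether the access token is still usable. The `TokenSet` type and the 30-second leeway below are illustrative choices, not something OAuth2 prescribes.

```python
import json
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenSet:
    access_token: str
    refresh_token: Optional[str]
    scopes: frozenset
    expires_at: float  # absolute UNIX timestamp

    def is_expired(self, now: Optional[float] = None, leeway: float = 30.0) -> bool:
        """Treat the token as expired slightly early to absorb clock skew."""
        now = time.time() if now is None else now
        return now >= self.expires_at - leeway


def parse_token_response(raw: str, now: Optional[float] = None) -> TokenSet:
    data = json.loads(raw)
    received_at = time.time() if now is None else now
    return TokenSet(
        access_token=data["access_token"],
        refresh_token=data.get("refresh_token"),
        scopes=frozenset(data.get("scope", "").split()),
        # Assumption: a missing expires_in is treated as already expired.
        expires_at=received_at + float(data.get("expires_in", 0)),
    )


tokens = parse_token_response(
    '{"token_type": "bearer", "access_token": "gho_example", '
    '"scope": "read_user read_repo", "expires_in": 3600, '
    '"refresh_token": "ghr_example"}',
    now=1000.0,
)
```

Both tokens are credentials, so whatever storage holds a `TokenSet` should be protected like a password store.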
PKCE
The analysis of real-world attacks on the authorization flows has shown that it can be further secured. Specifically, malicious actors can intercept the authorization code or try to inject it into the callback URL to do token exchange via unauthorized workflows. These attack vectors are the most probable in public applications like native applications.
To mitigate this, the OAuth2 protocol has extended the authorization code flow with the Proof Key for Code Exchange (PKCE) extension (pronounced as “pixy”).
PKCE is a simple way to prove that the authorization code was obtained via the legitimate authorization request. The beauty of PKCE is that it just slightly extends the authorization & token requests without major changes to the flow.
- The Client Application generates a random string called the `code_verifier` and then hashes it with a cryptographically secure algorithm such as SHA256. The hashed value is called the `code_challenge`.
- The Client Application keeps the original `code_verifier` private and shares the `code_challenge` and the hash method (i.e. `code_challenge_method`) as query params in the authorization request.
- The Authorization Server remembers the `code_challenge` and the `code_challenge_method`. No other changes are needed to the existing Authorization Server responses.
- Then, the client sends the `code_verifier` during token exchange. The Authorization Server computes the hash of that value and compares it with the `code_challenge` passed during the authorization request.
PKCE supports two hashing methods:
- `S256` - the SHA256 hashing algorithm, where the digest is base64url-encoded without padding:
code_challenge = base64url(SHA256(ASCII(code_verifier)))
- `plain` - the plain text method. It’s basically just `code_challenge = code_verifier`. The `plain` method should be avoided as it doesn’t really introduce any challenges. It only protects you from attacks where nefarious actors can intercept the Authorization Server responses but not the authorization request itself.
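Both sides of the `S256` method fit in a few lines. The sketch below uses hypothetical function names; the last function mirrors the check an Authorization Server would run during token exchange.

```python
import base64
import hashlib
import hmac
import secrets


def make_code_verifier() -> str:
    """Client side: 32 random bytes become a 43-character URL-safe string."""
    return secrets.token_urlsafe(32)


def s256_challenge(code_verifier: str) -> str:
    """Client side: derive the code_challenge sent in the authorization request."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    # base64url without '=' padding, as the S256 method requires.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verifier_matches(code_verifier: str, stored_challenge: str) -> bool:
    """Server side: recompute the challenge and compare in constant time."""
    return hmac.compare_digest(s256_challenge(code_verifier), stored_challenge)


verifier = make_code_verifier()
challenge = s256_challenge(verifier)
```

An attacker who sees only the `code_challenge` and the authorization code cannot complete the exchange, because recovering the verifier would mean inverting SHA256.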
The PKCE extension allows the public clients to finally leverage the authorization code flow securely. However, the Authorization Server must be ready to support PKCE for public clients which boils down to not requiring these clients to provide any client secrets.
Refresh Tokens
The refresh token is an optional but highly recommended additional token that the OAuth2 token endpoint can return to you. Unlike the access token, the refresh token is meant to be a long-lived token (either no expiration time or an extended period of time like half a year) that is sent to the authorization server only.
Essentially, the refresh token is an “internal” authorization grant because it implies the authorization that the resource owner has given to the Client Application.
The refresh token is important for two reasons:
- it allows keeping access tokens short-lived, which limits the damage if they are leaked
- it allows generating access tokens with a reduced scope, more limited than the scopes granted to the Client Application during authorization. This enables the clients to implement the least privilege principle on their side.
In order to refresh your access token, you send a request to the OAuth2 token endpoint with the `grant_type` set to `refresh_token`:
POST /token HTTP/1.1
Host: auth.example.com
Authorization: Basic {{ base64(client_id:client_secret) }}
grant_type=refresh_token&refresh_token=ghr_16C7e42F292c6912E7710c838347Ae178
The refresh token is linked to the specific client credentials, so it’s not possible to leverage it with an unauthorized client.
The refresh token request generally returns the same response as we have seen in the authorization code exchange.
It contains the new active access token, its expiration time and the actual access scopes.
In some cases (GitHub and GitLab do this, for instance), the refresh token request may actually also refresh your previous refresh token, so if the token response contains the `refresh_token` field and it’s different from your current refresh token, it means that this is your new refresh token to persist and use going forward.
Depending on the Authorization Server, the refresh token request may also invalidate all previously issued access tokens (and refresh tokens), so don’t keep using the old ones.
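The rotation rule described above can be captured in a small helper. This is a sketch in which stored tokens and the parsed token response are plain dictionaries, an assumption made for brevity.

```python
def apply_refresh_response(stored: dict, response: dict) -> dict:
    """Merge a refresh response into the stored tokens, honoring rotation."""
    updated = dict(stored)
    updated["access_token"] = response["access_token"]
    if "expires_in" in response:
        updated["expires_in"] = response["expires_in"]
    # Keep the current refresh token unless the server handed out a new one.
    new_refresh = response.get("refresh_token")
    if new_refresh and new_refresh != stored.get("refresh_token"):
        updated["refresh_token"] = new_refresh
    return updated


stored = {"access_token": "old-access", "refresh_token": "refresh-1"}
kept = apply_refresh_response(stored, {"access_token": "new-access", "expires_in": 3600})
rotated = apply_refresh_response(
    stored, {"access_token": "new-access", "refresh_token": "refresh-2"}
)
```

Persisting the rotated token right away matters: if the new refresh token is lost while the old one is already invalid, the user has to go through the authorization flow again.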
Implicit
We have said that the authorization code flow is designed to make it impossible to get Resource Owner delegation by using only information passed via the frontend channel (e.g. the authorization code and client ID). In order to achieve this, that flow requires the Client Application to have a secure backend channel. But what if the application is public and has no place to keep a secret so that it stays secret?
The original OAuth2 specification introduced a simplified version of the authorization code flow that makes a significant security trade-off in order to support public applications, first of all, in-browser JS applications like browser extensions or single page applications (SPAs) without backends. It’s called the implicit flow.
The implicit flow is also an interactive, redirect-based flow, but there is no explicit code exchange via the backend channel. Instead, it happens implicitly and the Client Application just receives the access token in the callback URL.
The authorization request looks close to what we have seen in the authorization code flow, but this time we have to specify `response_type` as `token`:
HTTP/1.1 302 Found
Location: https://auth.example.com/authorize?response_type=token
&client_id=Iv23lilfdg920cAzhcxA
&redirect_uri=https://www.clientapp.com/callback/
&scope=read_user%20write_repo%20read_repo
&state=wSDOBWf0PAC9C7AENIRoCfHnDDSbr
In JS applications, there are a few ways you can do this request:
- Do a full-page redirect to the Authorization Server
- Open a separate popup window and do the redirect there and then close it when the callback URL is hit.
If you specify the authorization `state` parameter, the best place to temporarily persist it will be `window.sessionStorage`.
Once the Resource Owner approves the delegation, the Authorization Server redirects them back to the client callback URL which would look like this:
HTTP/1.1 302 Found
Location: https://www.clientapp.com/callback/#access_token=gho_16C7e42F292c6912E7710c838347Ae178B4a&state=wSDOBWf0PAC9C7AENIRoCfHnDDSbr
The `access_token` is returned right away in the callback URL along with other parameters. This is a simple GET request, so the sensitive access token is a part of the URL and can be potentially intercepted by other browser extensions, malicious scripts injected via XSS attacks, etc.
Additionally, the whole callback URL is cached in the browser history along with the access token.
That’s the main reason why the implicit flow is considered insecure.
All parameters are returned as URL fragments which means they are intended to be used by browser client applications only (e.g. not shared with any backend servers).
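To make the callback handling concrete, here is a minimal sketch that pulls the parameters out of the URL fragment and checks the state before trusting the token. The helper name extract_implicit_token is an assumption, and a real browser client would do the same with location.hash in JavaScript.

```python
import hmac
from urllib.parse import parse_qs, urlsplit


def extract_implicit_token(callback_url, expected_state):
    """Return the access token from the URL fragment, or raise ValueError."""
    fragment = urlsplit(callback_url).fragment
    params = {key: values[0] for key, values in parse_qs(fragment).items()}
    returned_state = params.get("state", "")
    # A mismatch means the callback does not belong to the request we started.
    if not hmac.compare_digest(returned_state, expected_state):
        raise ValueError("state mismatch")
    if "access_token" not in params:
        raise ValueError(params.get("error", "no access token in callback"))
    return params["access_token"]


callback = (
    "https://www.clientapp.com/callback/"
    "#access_token=gho_16C7e42F292c6912E7710c838347Ae178B4a"
    "&state=wSDOBWf0PAC9C7AENIRoCfHnDDSbr"
)
print(extract_implicit_token(callback, "wSDOBWf0PAC9C7AENIRoCfHnDDSbr"))
```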
Since in-browser applications cannot keep secrets, the returned access token is short-lived (typically 1-2 hours). For the same reason, OAuth2 does not allow refresh tokens to be issued in the implicit flow.
Finally, the Client Application can use the retrieved access token to access the Resource Server in the same way we have seen in the authorization code flow. There is one caveat, though: the Resource Server should be ready to accept these in-browser application requests by having CORS policies configured.
Looking back, there are basically two pieces of information that help to establish the validity of the client in the implicit flow:
- Client ID
- Redirect URL
There is no client secret or any other sensitive information to put into the public client application.
Client Credentials
Let’s continue our what-if thought process. What if there is no resource owner and the Client Application wants to act on its own behalf?
This is a common situation when you have a dozen internal services that have to communicate with each other and you want to secure that communication somehow to create a zero-trust environment.
In this case, there is no Resource Owner involved, so there is no need for the whole frontend channel to be involved. All we need is to make the Authorization Server accept the client credentials as a valid reason to issue the access tokens. Therefore, this flow is called the client credentials flow.
The client credentials flow is a non-interactive flow that enables confidential trusted Client Applications to access the Resource Server (or other internal Client Application). So the only request we need here is to the token endpoint:
POST /token HTTP/1.1
Host: auth.example.com
Authorization: Basic {{ base64(client_id:client_secret) }}
Accept: application/json
Content-Type: application/x-www-form-urlencoded
grant_type=client_credentials&scope=read_user%20write_repo%20read_repo
- The grant_type is set to client_credentials to indicate that the validity of the client credentials is the reason to give us an access token.
- The scope is optional but recommended to achieve least-privilege access.
That’s it. The response is the same as in other flows. There is no big reason to issue refresh tokens here, because the client credentials themselves act as one, so refresh tokens are generally omitted.
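A minimal sketch of how a service might assemble this token request is shown below. The function name build_client_credentials_request is hypothetical, and the sketch assumes the client ID and secret contain no characters that would need extra escaping before being Base64-encoded.

```python
import base64
from urllib.parse import quote, urlencode


def build_client_credentials_request(client_id, client_secret, scopes=()):
    """Return (headers, body) for a client credentials token request."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    headers = {
        # HTTP Basic authentication carries the client credentials.
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Accept": "application/json",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    form = {"grant_type": "client_credentials"}
    if scopes:
        form["scope"] = " ".join(scopes)  # optional, narrows the requested access
    return headers, urlencode(form, quote_via=quote)


headers, body = build_client_credentials_request(
    "my-service", "my-secret", ["read_user", "write_repo", "read_repo"]
)
print(body)
```

The returned headers and body can be handed to whatever HTTP client the service already uses to POST to the token endpoint.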
Another thing is the access scope. Since the Client Application acts on its own behalf, it may not be limited to the resources available to a specific Resource Owner. There has to be a way for the Resource Server to differentiate this level of access versus regular resource owner delegation. I have seen two ways of doing this:
- use a separate set of scopes to mark such an internal, wide access
- add a custom claim to the JWT access token and account for it during access token validation
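Either option boils down to a small check on the Resource Server side. The sketch below assumes the access token has already been validated and decoded into a dictionary of claims; the scope name internal:full_access and the claim name client_only are made up for illustration.

```python
def is_service_level_access(claims, service_scope="internal:full_access", claim_name="client_only"):
    """Return True when the token claims mark app-level (not user-delegated) access."""
    # Option 1: a dedicated scope reserved for internal, wide access.
    scopes = set(claims.get("scope", "").split())
    # Option 2: a custom boolean claim added by the Authorization Server.
    return service_scope in scopes or claims.get(claim_name) is True


print(is_service_level_access({"scope": "read_repo internal:full_access"}))
print(is_service_level_access({"scope": "read_repo", "sub": "user-42"}))
```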
Resource Owner Credentials
The most paradoxical flow out of all OAuth2 standard flows is the resource owner credentials (ROC) flow. It’s paradoxical because it has been discouraged since day one of the OAuth2 protocol and everyone says it’s a very bad idea to use it, yet it still made it into the specification. Why did that happen?
Theoretically, there might be situations where you absolutely have to use your username and password all around to access some resources. Without the ROC flow, you would be even less secure than if you used it. This is because the flow limits the credential exposure over the network which reduces the chance of credential leakage. Also, it allows you to limit the access scope (rather than giving the client absolutely all access you have).
In this flow, the Resource Owner passes their credentials (e.g. username and password) directly to the Client Application. Then the application uses the credentials as an authorization grant to obtain a pair of access and refresh tokens. The Resource Owner’s credentials are then discarded and the client uses only the tokens to access the protected resources going forward.
This is a backend channel only flow, so the Client Application exclusively communicates with the token endpoint of the Authorization Server:
POST /token HTTP/1.1
Host: auth.example.com
Authorization: Basic {{ base64(client_id:client_secret) }}
Accept: application/json
Content-Type: application/x-www-form-urlencoded
grant_type=password&username=dwight.shrut&password=bearbeetsbattlestargalactica&scope=org_admin
- The grant_type has to be password to indicate the ROC flow.
- The credentials, e.g. username and password, are passed as a part of the token request.
- It’s possible to pass the scope param to reduce the delegated access level. Otherwise, it would be the full access that the Resource Owner has (whatever that means for the given Resource Server).
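One practical detail when building this request is that user-supplied passwords often contain characters that are reserved in form bodies, so they have to be percent-encoded. The sketch below illustrates that; build_password_grant_body is a hypothetical helper, and the password is an invented example.

```python
from urllib.parse import quote, urlencode


def build_password_grant_body(username, password, scopes=()):
    """Return the form-encoded body for a resource owner credentials token request."""
    form = {"grant_type": "password", "username": username, "password": password}
    if scopes:
        form["scope"] = " ".join(scopes)  # optional, reduces the delegated access
    # urlencode escapes reserved characters such as "&", "@" and spaces.
    return urlencode(form, quote_via=quote)


print(build_password_grant_body("dwight.shrut", "p@ss word&1", ["org_admin"]))
```

The client should not keep the username and password around after this single exchange.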
In which cases could this flow make sense?
- You should have a high degree of trust to share your main credentials with the Client Application. Ideally, it should be something you control (e.g. the first-party client).
- Your Client Application is highly privileged. For example, it performs actions on behalf of your tenant or organization admins. This is how Microsoft Entra supports it. At the same time, personal accounts cannot use this flow (partially because logging in there involves other protections like MFA).
Device Code
Apparently, the primary target of the original OAuth2 specification was the browser application use case, but after OAuth2 gained popularity, it found its way into other contexts. For example, not every environment has the ability to open a browser and do redirect-based flows like authorization code. A few examples:
- When you have got a new TV and you want to watch Netflix on it, you need to authorize that device to access your account and subscription.
- When you want to analyze your Snowflake data in a cloud-hosted, containerized Jupyter notebook, there might be no easy way to open a browser (it’s a headless linux under the hood).
- When you try to connect to your game portal from a console that may have a browser, but only limited input capabilities (e.g. no full-fledged keyboard)
Thankfully, there is an extension to the original OAuth2 specification that codifies the so-called device authorization flow.
The device authorization (or device code) flow is a special kind of interactive flow that doesn’t assume any direct interactions between the Client Application residing on the device and the Resource Owner’s browser.
Instead, the Client Application tells the Resource Owner how to authorize it indirectly via a browser: by showing a verification URL to visit, a QR code to scan, or a prompt to open the provider’s mobile application.
Device Authorization Request
In order to implement the device authorization flow, the extension introduces a new endpoint for kicking off the flow called the device authorization endpoint (because it has completely different semantics than the standard, browser-based authorize endpoint):
POST /device_authorization HTTP/1.1
Host: auth.example.com
Accept: application/json
Content-Type: application/x-www-form-urlencoded
client_id=Iv23lilfdg920cAzhcxA&scope=read_user%20write_repo%20read_repo
- The client_id is required to identify the Client Application.
- There is no client secret because the device client is close to public clients in terms of its ability to keep secrets, i.e. any built-in secret can be extracted.
The device authorization endpoint returns something like this:
{
"device_code": "GmRhmhcxhwAzkoEqiMEgDnyEysNkuNhszIySk9eS",
"user_code": "WDJB-MJHT",
"verification_uri": "https://auth.example.com/login/device",
"expires_in": 1800,
"interval": 5
}
- The verification_uri is where the end user should go to type in the user_code. The URL should be short enough to type in manually. Alternatively, the Authorization Server may return another URL to turn into a QR code, for example. That URL generally contains the user code as a query param.
- The device_code is what the device client application keeps secret in memory and then uses as a grant while polling the token endpoint.
The device code serves as proof of having started the authorization flow. If there were no device code and the device client had only the client ID as its identifier, attackers could figure out that ID and then send token requests to get the access & refresh tokens before the real device that requested them.
The resource owner has to trigger (or retrigger if the previous request has timed out) the authorization flow, but at the same time, we pass no information about that user when initializing the authorization request. The authorization server can only match the resource owner with the corresponding client ID once the resource owner has typed in the user code on the verification page. To be fair, we pass no Resource Owner identifier directly in other interactive flows either, but the authorization redirect leverages browser cookies there, so the Authorization Server can identify the end user right off the bat.
Then, the user code is shown somehow to the Resource Owner. Generally, it’s just printed on the device screen, so the user can type it from there.
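Here is a small sketch of what the device might do with the device authorization response: keep the device code in memory, build the on-screen prompt, and work out when to stop polling. The helper name parse_device_authorization and the prompt wording are assumptions.

```python
import json
import time


def parse_device_authorization(response_text, now=None):
    """Turn the device authorization response into what the device needs next."""
    data = json.loads(response_text)
    now = time.monotonic() if now is None else now
    return {
        "device_code": data["device_code"],  # kept in memory, never shown to the user
        "prompt": f"Visit {data['verification_uri']} and enter code {data['user_code']}",
        "deadline": now + data["expires_in"],  # stop polling after this moment
        "interval": data.get("interval", 5),  # fall back to 5 seconds when omitted
    }


sample = """{
  "device_code": "GmRhmhcxhwAzkoEqiMEgDnyEysNkuNhszIySk9eS",
  "user_code": "WDJB-MJHT",
  "verification_uri": "https://auth.example.com/login/device",
  "expires_in": 1800,
  "interval": 5
}"""
print(parse_device_authorization(sample)["prompt"])
```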
Access Token Polling
The device authorization is a time-bound process (the lifetime is specified in the expires_in field of the response).
The authorization lifetime is typically around 15 minutes.
There is no way for the Authorization Server to tell the device client when the authorization is granted (that role is played by the callback URL in the other interactive flows), and assuming that the device can accept inbound requests would be too much. So the protocol only assumes that the device is connected to the internet and can make outbound requests.
With these assumptions, the device can poll the token endpoint every so often until the authorization is granted, denied, or the authorization request expires. The default polling interval is 5 seconds.
The polling happens against the token endpoint:
POST /token HTTP/1.1
Host: auth.example.com
Accept: application/json
Content-Type: application/x-www-form-urlencoded
grant_type=urn:ietf:params:oauth:grant-type:device_code&device_code=GmRhmhcxhwAzkoEqiMEgDnyEysNkuNhszIySk9eS&client_id=Iv23lilfdg920cAzhcxA
- The grant_type indicates which flow we are trying to complete. Here it is a URN rather than a short name because the device grant is an extension defined outside the core OAuth2 specification (RFC 8628), not because it is non-standard.
- The device_code is also sent to verify the device that is trying to obtain tokens.
The specification doesn’t require client authentication when accessing the token endpoint, but it’s possible and some providers use that (e.g. Google’s Device Authorization flow implementation). In that case, it’s still true that you cannot persist the client secret on the end device and should probably have a backend service somewhere to poll the token endpoint for the device.
It’s very likely that the device client would need to poll the token endpoint a couple of times before the end user actually authorizes it. In this case, the token endpoint should return a special error indicating that the authorization is not yet granted:
{
"error": "authorization_pending",
"error_description": "The authorization request is still pending as the end user hasn't yet authorized the device."
}
If the client polls too eagerly, another special error is returned that tells it to increase the polling interval by 5 seconds:
{
"error": "slow_down",
"error_description": "The client should wait before polling the token endpoint again."
}
When the authorization is finally granted, the token endpoint should return the regular token response we have seen in the authorization code flow.
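Putting the polling rules together, a device client’s loop could look roughly like the sketch below. The post_token callable is a hypothetical stand-in for an HTTP POST to the token endpoint that returns the parsed JSON body; it is injected (together with sleep and clock) so the loop can be exercised without a real Authorization Server.

```python
import time


def poll_for_token(post_token, device_code, client_id, interval=5, expires_in=1800,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until tokens are issued; raise on denial, expiry, or any other error."""
    deadline = clock() + expires_in
    form = {
        "grant_type": "urn:ietf:params:oauth:grant-type:device_code",
        "device_code": device_code,
        "client_id": client_id,
    }
    while clock() < deadline:
        sleep(interval)
        response = post_token(form)
        error = response.get("error")
        if error is None:
            return response  # the regular token response
        if error == "authorization_pending":
            continue  # the user has not approved yet, keep the same pace
        if error == "slow_down":
            interval += 5  # back off by 5 seconds as the error demands
            continue
        raise RuntimeError(f"device authorization failed: {error}")
    raise TimeoutError("device code expired before authorization was granted")


# Simulated run with canned responses instead of a real Authorization Server.
canned = iter([
    {"error": "authorization_pending"},
    {"error": "slow_down"},
    {"access_token": "example-token"},
])
waits = []
tokens = poll_for_token(lambda form: next(canned), "device-code", "client-id", sleep=waits.append)
print(tokens, waits)
```

Any error other than the two polling-specific ones (for example a denied or expired request) ends the loop, and the flow has to be restarted from the device authorization request.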
More Grants
What else can we trade for access & refresh tokens? The OAuth2 specification defines a way to extend the standard grant types with a custom one. This is called an extension or third-party assertion grant.
The assertion grant is a backend channel only flow where the Client Application sends the Authorization Server a special third-party assertion that proves the client’s rights to access the protected Resource Server.
As with any other backend channel only flow, this one only uses the token endpoint:
POST /token HTTP/1.1
Host: auth.example.com
Accept: application/json
Content-Type: application/x-www-form-urlencoded
grant_type=urn:roma-glushko:oauth:jwt&assertion=eyJhbGci[..omitted for brevity..]I6IkpXVCJ9.eyJzdWIiOi[..omitted for brevity..]MDIyfQ.SflK[..omitted for brevity..]adQssw5c
The grant type in this case is a unique string in the form of a URN that includes the organization name and other grant type information. For example:
urn:organization:oauth:grant-type:custom-name
It’s only important that the target Authorization Server recognizes it and knows how to validate it.
The assertion is usually a self-contained secure token that is cryptographically signed by the assertion provider. Practically, these are the types of assertions you can see in the wild:
- JWT assertions (e.g. defined as urn:ietf:params:oauth:grant-type:jwt-bearer)
- SAML assertions (e.g. defined as urn:ietf:params:oauth:grant-type:saml2-bearer)
- Custom assertions like urn:bitbucket:oauth2:jwt that are likely to be JWT-based too.
The client authentication may be optional in this case (if so, the refresh token may not be issued as that grant requires client authentication).
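To illustrate what a JWT assertion request could look like, the sketch below signs a small set of claims and builds the token request body. Everything here is an assumption made for the sake of a self-contained example: the HS256 shared-secret signature (providers commonly expect asymmetric signatures instead), the claim values, and the helper names.

```python
import base64
import hashlib
import hmac
import json
from urllib.parse import urlencode


def b64url(data):
    """Base64url-encode bytes without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_hs256_assertion(claims, secret):
    """Return a compact JWT signed with HMAC-SHA256 (illustrative choice only)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


def build_assertion_request(assertion, grant_type="urn:ietf:params:oauth:grant-type:jwt-bearer"):
    """Return the form-encoded body for an assertion grant token request."""
    return urlencode({"grant_type": grant_type, "assertion": assertion})


assertion = make_hs256_assertion(
    {"iss": "my-client", "sub": "my-client", "aud": "https://auth.example.com/token", "exp": 1700000000},
    b"shared-secret",
)
print(build_assertion_request(assertion))
```

The Authorization Server then validates the signature and the claims (issuer, audience, expiry) before issuing tokens.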
Which Flow to Choose?
After we have reviewed all main OAuth2 flows, which one should you choose for your specific application?
I have tried to come up with the following decision tree that asks the main questions to help you.
- Always try to use the authorization code flow with PKCE if possible, no matter if it’s a public or confidential client application. This may not be possible because your provider may not support it yet.
- If PKCE is not supported, then the authorization code flow is only good for confidential clients unless dynamic client registration is supported. For public clients, you should go with the implicit flow and dive into the recommendations and considerations for implementing it as securely as possible.
- If your client application cannot open a browser with the resource owner session or is limited in terms of input capabilities, and your users don’t really trust it, then go with the device code flow.
- Before falling back to the resource owner credentials flow, try to see if API keys can help you achieve the same goal.
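The questions above can also be read as a small function. This is only one possible interpretation of the decision list: the parameter names are made up, the ordering of the checks is an assumption, and the client credentials branch comes from the earlier section on that flow rather than from the list itself.

```python
def choose_flow(has_resource_owner, can_use_browser, provider_supports_pkce, is_confidential_client):
    """Suggest an OAuth2 flow based on a few yes/no questions."""
    if not has_resource_owner:
        return "client_credentials"  # the application acts on its own behalf
    if not can_use_browser:
        return "device_code"  # no browser or limited input capabilities
    if provider_supports_pkce:
        return "authorization_code_with_pkce"  # preferred for any client type
    if is_confidential_client:
        return "authorization_code"
    return "implicit"  # public client without PKCE support


print(choose_flow(True, True, True, False))
print(choose_flow(True, False, True, False))
```

The resource owner credentials flow is deliberately left out of the function, since it is a last resort to consider only after alternatives like API keys have been ruled out.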
Conclusions
Thinking about why the OAuth2 protocol has been designed the way it is turned out to be a great exercise in threat modeling, with immediate, straightforward and practical approaches to mitigate these threats. They can be reused to solve similar security concerns in other contexts outside of the OAuth protocol, so you can benefit from a deep understanding of the protocol even if you are not a security expert who has to know the ins and outs of OAuth2.
Apart from that, OAuth2 is such a vast area that we have been able to only answer the fundamental why questions and review the most popular delegation grants in this article.
A lot of interesting OAuth2 extensions are only briefly referenced, but you would not see them that often in the wild yet, so it was acceptable to leave them out for now.
If you would like to see follow-up articles on OAuth2 protocol and its extensions, please let me know.
References
- [RFC-6749] The OAuth 2.0 Authorization Framework
- [RFC-6750] The OAuth 2.0 Authorization Framework: Bearer Token Usage
- [RFC-7523] JSON Web Token (JWT) Profile for OAuth 2.0 Client Authentication and Authorization Grants
- [RFC-7662] OAuth 2.0 Token Introspection
- [RFC-7636] Proof Key for Code Exchange by OAuth Public Clients
- [RFC-8252] OAuth 2.0 for Native Apps
- [RFC-8628] OAuth 2.0 Device Authorization Grant
- [GitHub] OAuth Authorization
- [GitLab] OAuth 2.0 identity provider API
- [Bitbucket] OAuth 2.0 Enterprise Provider API
- [OAuth.net] OAuth 2.1 Draft
- [Manning] OAuth 2.0 in Action
- [Microsoft] Identity platform and OAuth 2.0 implicit grant flow