Hold on tight: the HTTP protocol is terribly flawed(*) and when it comes to user authentication this problem screams loudly.
For a long time we, as developers, fought with it: sometimes with good results, sometimes not, but we thought we were happy.
Unfortunately, the web moves fast and many of these solutions were getting old too quickly.
He who hesitates is lost…
Later on, a group of people realised that it was time to stop fighting with the “problem” and try to embrace it. The result of that epiphany is called JSON Web Token (JWT for short) and here we will try to tell its story…
Suppose you have an API (e.g. GET /orders) and you want to restrict access to authorized users only.
The HTTP protocol is stateless: a new request (e.g. GET /order/42) won’t know anything about the previous one, so we need to reauthenticate on every new request (fig.1).
Fig. 1 – Due to the stateless nature of the HTTP protocol, every new API request needs a complete authentication.
The traditional way of dealing with this is the use of Server Side Sessions (SSS). In this scenario, we first check the username and password; if they are valid, the server saves a session id in memory and returns it to the client. From now on, the client just needs to send its session id to be recognized (fig.2).
Fig. 2–Using SSS, we reduce the number of authentications towards the Credentials database.
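To make the idea concrete, here is a minimal sketch of a server that keeps its sessions in memory. The SessionServer class, the CREDENTIALS dictionary and the user alice are made up for illustration; a real system would at least store hashed passwords and expire its sessions.

```python
import secrets

# Hypothetical credentials store; a real one would keep salted password hashes.
CREDENTIALS = {"alice": "s3cret"}


class SessionServer:
    """A server that remembers logged-in clients in its own memory."""

    def __init__(self):
        self.sessions = {}  # session id -> username

    def login(self, username, password):
        # The credentials store is consulted only here, once per session.
        if CREDENTIALS.get(username) != password:
            return None
        session_id = secrets.token_urlsafe(16)
        self.sessions[session_id] = username
        return session_id

    def handle(self, session_id):
        # Later requests are recognized by their session id alone.
        user = self.sessions.get(session_id)
        return f"200 OK for {user}" if user else "401 Unauthorized"


server = SessionServer()
sid = server.login("alice", "s3cret")
print(server.handle(sid))        # 200 OK for alice
print(server.handle("unknown"))  # 401 Unauthorized
```

Note that the sessions dictionary lives inside one server process: this detail is what matters in the rest of the story.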
This solution fixes one problem but creates another, probably bigger: scalability. There are two ways to scale an infrastructure:
• vertical scaling – scaling up your infrastructure merely means adding more resources to a server. This is an expensive solution with a low upper limit (i.e. the server’s maximum resource allocation);
• horizontal scaling – scaling out your infrastructure is as simple and cost-effective as adding a new server behind a load balancer;
Now it seems pretty clear that the second approach is by far the more beneficial, but let’s take a look at what may happen.
In the initial scenario, there’s just one server behind the load balancer. When a client performs a request using session id xyz, its record will surely be found in the server’s memory (fig.3).
So far, so good.
Fig. 3 – A single server behind the load balancer. The session id of the request will be found in memory.
Now imagine that the above infrastructure needs to scale. A new server (i.e. Server 2:2) is added behind the load balancer, and this brand new server handles the next request issued by the xyz client…
Fig.4–A new server is behind the LB, it knows nothing about previous session so the user won’t be recognized.
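The failure is easy to reproduce with a tiny sketch: two servers, each with its own in-memory session table. The dictionaries and the user alice are hypothetical; only the xyz session id comes from the example above.

```python
# Each server keeps its sessions in its own process memory.
server_1_sessions = {"xyz": "alice"}  # the session was created on Server 1
server_2_sessions = {}                # freshly added server: nothing in memory


def handle(sessions, session_id):
    user = sessions.get(session_id)
    return f"200 OK for {user}" if user else "401 Unauthorized"


# The load balancer may route the same client to either server.
print(handle(server_1_sessions, "xyz"))  # 200 OK for alice
print(handle(server_2_sessions, "xyz"))  # 401 Unauthorized
```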
Unauthenticated! The brand new server has no xyz session in its memory, so the authentication process will fail. To fix this, there are mainly three workarounds:
• Synchronize sessions between servers: tricky and error-prone;
• Use an external in-memory database: a good solution, but it adds another component to the infrastructure;
Third: embrace the stateless nature of HTTP and search for a better solution!
JSON Web Token (JWT) is an open standard (RFC 7519) that defines a way of transmitting information, such as authentication and authorization facts, between two parties: an issuer and an audience. The information can be trusted because each issued token is digitally signed, so the consumer can verify whether the token is authentic or has been forged.
Each token is self-contained, meaning it contains all the information needed to allow or deny any given request to an API. To understand how we can verify a token and how authorization happens, we need to take a step back and look inside a JWT, which is made of three parts separated by dots:
header.payload.signature
Header
The header section contains information about the token itself, such as the id of the key used to sign it (kid) and the signing algorithm (alg).
{
"kid": "ywdoAL4WL...rV4InvRo=",
"alg": "RS256"
}
The header is then encoded as Base64URL:
eyJraWQiOiJ -TRUNCATED- JTMjU2In0
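As a rough sketch of that encoding step, the snippet below serializes the header as compact JSON and encodes it as Base64URL, which is plain Base64 with a URL-safe alphabet and no trailing padding. The helper names are my own, and the kid is the shortened value shown above, so the output will not match a real Cognito header beyond its first characters.

```python
import base64
import json


def base64url_encode(data: bytes) -> str:
    # URL-safe alphabet, with the trailing "=" padding removed.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def base64url_decode(text: str) -> bytes:
    # Put back the padding that was stripped, then decode.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


# The kid below is the shortened value from the article, kept as-is.
header = {"kid": "ywdoAL4WL...rV4InvRo=", "alg": "RS256"}
encoded = base64url_encode(json.dumps(header, separators=(",", ":")).encode("utf-8"))

print(encoded)
print(json.loads(base64url_decode(encoded)))  # back to the original header
```

Decoding needs no key at all, which is worth remembering when we get to the payload.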
Payload or body
The payload is the most important part of a JWT token. It contains information (claims in JWT jargon) about the client:
{
[...]
"iss": "//cognito-idp.eu-west-1.amazonaws.com/XXX",
"name": "Mariano Calandra",
"admin": false
}
The iss property is a registered claim; it represents the identity provider that issued the token (in this case Amazon Cognito). Finally, we can add further claims based on our needs (e.g. the admin claim).
The payload is then encoded as Base64URL:
eyJzdWIiOiJkZGU5N2Y0ZC0wNmQyLTQwZjEtYWJkNi0xZWRhODM1YzExM2UiLCJhdWQiOiI3c2Jzamh -TRUNCATED- hbnRfaWQiOiJ4cGVwcGVycy5jb20iLCJleHAiOjE1N jY4MzQwMDgsImlhdCI6MTU2NjgzMDQwOH0
Signature
The third part of the token is a digital signature, computed with these steps:
• join the encoded header and the encoded payload with a dot;
• sign the result using the algorithm specified in the alg property of the header (in this case RS256) and a private key;
• encode the result as Base64URL;
Here we can look at it as pseudo-code:
data = base64UrlEncode(header) + "." + base64UrlEncode(payload);
hash = RS256(data, private_key);
signature = base64UrlEncode(hash);
POstGetfAytaZS82wHcjoTyoqhMyxXiWdR7Nn7A29DNSl0EiXLdwJ6xC6AfgZWF1bOsS_TuYI3OG85 -TRUNCATED- FfEbLxtF2pZS6YC1aSfLQxeNe8djT9YjpvRZA
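For the curious, the pseudo-code can be turned into a runnable sketch using only the standard library. RS256 means RSA (PKCS #1 v1.5 padding) over a SHA-256 hash. Everything about the key is an assumption made for illustration: it is a toy key built from two well-known Mersenne primes, so anyone can factor it, and the kid toy-key is made up. Real tokens are signed by the provider with a properly generated key and a vetted crypto library.

```python
import base64
import hashlib
import json


def base64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


# ASN.1 DigestInfo prefix for SHA-256, from RFC 8017 (PKCS #1), section 9.2.
SHA256_DIGEST_INFO = bytes.fromhex("3031300d060960864801650304020105000420")


def pkcs1_v15_encode(message: bytes, size: int) -> int:
    """Hash the message with SHA-256 and pad it to the size of the key."""
    digest_info = SHA256_DIGEST_INFO + hashlib.sha256(message).digest()
    padding = b"\xff" * (size - len(digest_info) - 3)
    return int.from_bytes(b"\x00\x01" + padding + b"\x00" + digest_info, "big")


# TOY KEY, illustration only: two Mersenne primes, trivially factorable.
P, Q = 2**521 - 1, 2**607 - 1
N, E = P * Q, 65537                # public part: modulus and exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)
KEY_BYTES = (N.bit_length() + 7) // 8


def rs256_sign(data: bytes) -> bytes:
    # RSA signing: padded_hash ^ d mod n.
    return pow(pkcs1_v15_encode(data, KEY_BYTES), D, N).to_bytes(KEY_BYTES, "big")


header = {"kid": "toy-key", "alg": "RS256"}  # hypothetical kid
payload = {"name": "Mariano Calandra", "admin": False}

data = ".".join(
    base64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
    for part in (header, payload)
)
signature = base64url_encode(rs256_sign(data.encode("ascii")))
token = data + "." + signature
print(token)
```

The last three statements mirror the three lines of pseudo-code: build the data, sign it with the private key, encode the signature.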
Put everything together
Once we have the encoded header, the encoded payload and the encoded signature, we can put everything together by joining the pieces with dots:
eyJraWQiOiJ -TRUNCATED- JTMjU2In0.eyJzdWIiOiJkZGU5N2Y0ZC0wNmQyLTQwZjEtYWJkNi0xZWRhODM1YzExM2UiLCJhdWQiOiI3c2Jzamh -TRUNCATED- hbnRfaWQiOiJ4cGVwcGVycy5jb20iLCJleHAiOjE1N jY4MzQwMDgsImlhdCI6MTU2NjgzMDQwOH0.POstGetfAytaZS82wHcjoTyoqhMyxXiWdR7Nn7A29DNSl0EiXLdwJ6xC6AfgZWF1bOsS_TuYI3OG85 -TRUNCATED- FfEbLxtF2pZS6YC1aSfLQxeNe8djT9YjpvRZA
Note: Even if the above token looks encrypted, it isn’t! Base64URL is an encoding, not an encryption algorithm, and RS256 only signs the token without hiding its content, so mind your payload!
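To see how little protection the encoding gives, here is a small sketch that reads the claims of a token without any key. The token is hypothetical: its signature part is a placeholder, and the function name is my own.

```python
import base64
import json


def base64url_decode(text: str) -> bytes:
    # Put back the padding that JWT strips, then decode.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def read_claims_unverified(token: str) -> dict:
    """Read the payload WITHOUT checking the signature.

    Fine for inspection, never for authorization decisions.
    """
    _header, payload, _signature = token.split(".")
    return json.loads(base64url_decode(payload))


# Hypothetical token: the third part is just a placeholder, not a signature.
claims_json = b'{"name":"Mariano Calandra","admin":false}'
payload_b64 = base64.urlsafe_b64encode(claims_json).rstrip(b"=").decode("ascii")
token = "eyJhbGciOiJSUzI1NiJ9." + payload_b64 + ".not-a-real-signature"

print(read_claims_unverified(token))
```

Anyone who intercepts a token can do the same, so secrets should stay out of the payload.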
JWT validation
Since the token is self-contained, we have all the information needed for its validation. For example, we know that the token has been signed using RS256 (the alg property of the header) and a private key. Now we need to know how to get the right public key to perform the validation. Yes, the public key!
Note: In asymmetric encryption, the public key is used to encrypt a message, whereas the private key is used to decrypt it. In a signing algorithm this process is reversed: the message (the data in the pseudo-code above) is signed using the private key, and the public key is used to verify that the signature is valid.
The iss property of the body represents the endpoint of the issuer (Amazon Cognito in our case, but there should be no great differences with other providers). Copy that URI and append the string /.well-known/jwks.json to it. It should look something like:
//cognito-idp.eu-west-1.amazonaws.com/XXX/.well-known/jwks.json
That URI returns the set of public keys published by the issuer:
{
"keys": [
{
"alg": "RS256",
"e": "AQAB",
"kid": "ywdoAL4WL...rV4InvRo=",
"kty": "RSA",
"n": "m7uImGR -TRUNCATED AhaabmiCq5WMQ",
"use": "sig"
},
{...}
]
}
In the keys array, search for the element that has the same kid as the token’s header. The properties e and n are the public exponent and the modulus that make up the public key.
Once we have it, we can verify the signature. If it’s valid, we can be sure that the information contained in the token can be trusted.
Note: The process of computing the public key and verifying the signature is not easy and is out of scope for this post.
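Just to give an idea of what a JWT library does under the hood, here is a hedged sketch of these steps in plain Python: pick the key by kid, rebuild n and e from their Base64URL form, and check the RS256 signature. The function names are my own, the key set is built in the demo from a toy key (two Mersenne primes, trivially factorable) instead of being fetched from the issuer, and the kid toy-key is hypothetical. In production, prefer a maintained library.

```python
import base64
import hashlib
import json


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def jwks_url(issuer: str) -> str:
    # The key set lives at a well-known path under the issuer (the iss claim).
    return issuer.rstrip("/") + "/.well-known/jwks.json"


# ASN.1 DigestInfo prefix for SHA-256 (RFC 8017, section 9.2).
SHA256_DIGEST_INFO = bytes.fromhex("3031300d060960864801650304020105000420")


def pkcs1_v15_encode(message: bytes, size: int) -> bytes:
    digest_info = SHA256_DIGEST_INFO + hashlib.sha256(message).digest()
    return b"\x00\x01" + b"\xff" * (size - len(digest_info) - 3) + b"\x00" + digest_info


def verify_rs256(token: str, jwks: dict) -> dict:
    """Return the claims if the signature is valid, raise ValueError otherwise."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    if header.get("alg") != "RS256":
        raise ValueError("unexpected algorithm")
    # Search the keys array for the element with the same kid as the header.
    kid = header.get("kid")
    jwk = next((k for k in jwks["keys"] if kid and k.get("kid") == kid), None)
    if jwk is None:
        raise ValueError("no key with a matching kid")
    n = int.from_bytes(b64url_decode(jwk["n"]), "big")  # modulus
    e = int.from_bytes(b64url_decode(jwk["e"]), "big")  # public exponent
    size = (n.bit_length() + 7) // 8
    # RSA verification: signature ^ e mod n must equal the padded SHA-256 hash.
    signature = int.from_bytes(b64url_decode(signature_b64), "big")
    recovered = pow(signature, e, n).to_bytes(size, "big")
    signed_data = (header_b64 + "." + payload_b64).encode("ascii")
    if recovered != pkcs1_v15_encode(signed_data, size):
        raise ValueError("invalid signature")
    return json.loads(b64url_decode(payload_b64))


# --- Demo with a TOY key: two Mersenne primes, trivially factorable. ---
P, Q, E = 2**521 - 1, 2**607 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)
SIZE = (N.bit_length() + 7) // 8


def int_to_b64url(value: int) -> str:
    return b64url_encode(value.to_bytes((value.bit_length() + 7) // 8, "big"))


def toy_sign(header: dict, payload: dict) -> str:
    data = ".".join(
        b64url_encode(json.dumps(part).encode("utf-8")) for part in (header, payload)
    )
    padded = int.from_bytes(pkcs1_v15_encode(data.encode("ascii"), SIZE), "big")
    return data + "." + b64url_encode(pow(padded, D, N).to_bytes(SIZE, "big"))


jwks = {"keys": [{"kid": "toy-key", "kty": "RSA", "alg": "RS256", "use": "sig",
                  "n": int_to_b64url(N), "e": int_to_b64url(E)}]}
token = toy_sign({"kid": "toy-key", "alg": "RS256"},
                 {"name": "Mariano Calandra", "admin": False})

print(jwks_url("//cognito-idp.eu-west-1.amazonaws.com/XXX"))
print(verify_rs256(token, jwks))
```

Changing even a single character of the payload makes verify_rs256 raise, because the recovered hash no longer matches. A real validator would also check claims such as the issuer and the expiry time, which this sketch leaves out.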
At the first access, a client needs to contact the authorization server (Amazon Cognito here, but Microsoft, Salesforce or any other provider should be pretty similar), sending its username and password. If the credentials are valid, a JWT is returned to the client, which will use it to call an API (in this example, an Amazon API Gateway endpoint).
Fig.5 – The complete flow of a real case scenario.
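On the client side, using the token can be sketched as below. This assumes the common convention of sending it in an Authorization: Bearer header, which this post does not spell out; the URL and the token value are placeholders.

```python
import urllib.request

# Hypothetical values: the token would come from the authorization server,
# and the URL stands in for the protected API endpoint.
token = "header.payload.signature"

request = urllib.request.Request(
    "https://api.example.com/orders",
    headers={"Authorization": f"Bearer {token}"},
)

print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
# urllib.request.urlopen(request) would actually send the request.
```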
In the above scenario (fig.5), the API itself is solely responsible for token validation, and it can reject the request if the signature seems forged.
Going further
Suppose a client wants to invoke a protected API to delete an order (e.g. DELETE /order/42) and this action should only be performed by administrators.
With a JWT in place, this is as simple as adding a custom claim to the payload (i.e. the admin claim of the payload above). When invoked, the API first verifies the signature and then checks that the admin claim is true.
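The authorization step could look like this minimal sketch. It assumes the claims have already been extracted from a token whose signature was verified; the authorize function and the status codes it returns are illustrative choices, not part of any provider’s API.

```python
def authorize(method: str, path: str, claims: dict) -> int:
    """Return an HTTP status code for an already-verified set of claims."""
    # Deleting an order is reserved to administrators.
    if method == "DELETE" and path.startswith("/order/"):
        return 204 if claims.get("admin") is True else 403
    return 200


print(authorize("DELETE", "/order/42", {"name": "Mariano Calandra", "admin": False}))  # 403
print(authorize("DELETE", "/order/42", {"name": "Mariano Calandra", "admin": True}))   # 204
```

No database lookup is needed: the decision is taken from the token alone.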
To sum up:
• JWT is self-contained, meaning it contains all the information needed to allow or deny any given request to an API;
• JWT is stateless by design, so we don’t have to fight with the stateless design of HTTP;
• JWT is encoded, not encrypted: keep it in mind;
(*) The stateless nature of HTTP is clearly not a flaw. Just a provocation :)