Parsing a JWT Payload with base64
Introduction to JWT
Reference:
https://medium.com/geekculture/how-to-encode-and-decode-jwt-token-using-python-f9c33de576c5
How to encode and decode jwt token using python
Introduction
JSON Web Token (JWT) is a compact, URL-safe means of representing claims to be transferred between two parties. The claims in a JWT are encoded as a JSON object that is used as the payload of a JSON Web Signature (JWS) structure or as the plaintext of a JSON Web Encryption (JWE) structure, enabling the claims to be digitally signed or integrity protected with a Message Authentication Code (MAC) and/or encrypted.
https://www.rfc-editor.org/rfc/rfc7519
In plain terms, JWT, or JSON Web Token, is an open standard used to share information securely between two parties, typically a client and a server. In most cases, it’s an encoded JSON object containing a set of claims and a signature.
Python provides multiple libraries to encode and decode JSON Web Tokens, for example PyJWT and python-jose (which appears later in this article).
For our discussion, we will be using PyJWT as the library.
Encode
import jwt


def encode_user():
    """
    Encode a user payload as a JWT.

    :return: the encoded token
    """
    encoded_data = jwt.encode(payload={"name": "Dinesh"},
                              key='secret',
                              algorithm="HS256")
    return encoded_data


if __name__ == "__main__":
    print(encode_user())
Output:
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJuYW1lIjoiRGluZXNoIn0.7Fwj-RvoEP2-LfB5q05pdTvMl7pFpoQgwXYq3EOLens
The code above is fairly self-explanatory. We have a payload as a Python dict, i.e. the data to be encoded, and a secret key used to sign it. We have also used the HS256 algorithm for signing the token. The algorithms supported by PyJWT are listed in its documentation.
Breaking down a JWT token
Let’s try to understand the structure of a JWT. A JWT typically contains a user’s claims. These represent data about the user, which the API can use to grant permissions or trace the user providing the token. The different components of a JWT are separated by a period (.). A JWT consists of three parts, each of them base64url-encoded: the header and the payload are JSON objects, while the signature is the raw output of the signing algorithm.
- Header
- Payload
- Signature
jwt.io can also be used to decode a JWT and break it into the components listed above.
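To make the three-part structure concrete, the sketch below splits the token produced in the Encode section and decodes each part using only the standard library. The helper name b64url_decode is an assumption introduced for illustration; it is not part of PyJWT.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode one base64url JWT segment (hypothetical helper, not a PyJWT API)."""
    # JWT segments omit the trailing '=' padding, so restore it first.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


token = ("eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"
         ".eyJuYW1lIjoiRGluZXNoIn0"
         ".7Fwj-RvoEP2-LfB5q05pdTvMl7pFpoQgwXYq3EOLens")

header_b64, payload_b64, signature_b64 = token.split(".")

header = json.loads(b64url_decode(header_b64))    # JSON object
payload = json.loads(b64url_decode(payload_b64))  # JSON object
signature = b64url_decode(signature_b64)          # raw bytes, not JSON

print(header)          # {'alg': 'HS256', 'typ': 'JWT'}
print(payload)         # {'name': 'Dinesh'}
print(len(signature))  # 32 (an HMAC-SHA256 digest is 32 bytes)
```

Note that this only reads the token; it does not check the signature, so the decoded values should not be trusted on their own.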
Decode
import jwt


def decode_user(token: str):
    """
    Decode a JWT and verify its HS256 signature.

    :param token: jwt token
    :return: None; the decoded payload is printed
    """
    decoded_data = jwt.decode(jwt=token,
                              key='secret',
                              algorithms=["HS256"])
    print(decoded_data)
Output:
{'name': 'Dinesh'}
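Conceptually, verifying an HS256 token means recomputing an HMAC-SHA256 over the "header.payload" text with the shared secret and comparing the result with the signature part. The sketch below illustrates that idea with the standard library only. The function names are hypothetical, this is not how PyJWT is implemented, and it skips the claim checks (such as expiry) that a real library performs.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    # base64url without '=' padding, as used for JWT segments
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(payload: dict, secret: str) -> str:
    """Build a compact HS256 token (illustrative only)."""
    header = {"alg": "HS256", "typ": "JWT"}
    segments = [
        b64url_encode(json.dumps(header, separators=(",", ":")).encode()),
        b64url_encode(json.dumps(payload, separators=(",", ":")).encode()),
    ]
    signing_input = ".".join(segments).encode("ascii")
    mac = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    return ".".join(segments + [b64url_encode(mac)])


def verify_hs256(token: str, secret: str) -> bool:
    """Recompute the HMAC and compare it with the token's signature part."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(secret.encode(), signing_input.encode("ascii"),
                        hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(b64url_encode(expected), signature)


token = sign_hs256({"name": "Dinesh"}, "secret")
print(verify_hs256(token, "secret"))        # True
print(verify_hs256(token, "wrong-secret"))  # False
```

In real code, prefer jwt.decode() as shown above; the sketch is only meant to show why the secret is needed to verify a token but not to read it.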
Decoding JWT using public key
A public key can be used to verify and decode a JWT. These public keys are usually made available to tenants at a URI of the form shown below; OpenID Connect providers typically expose such an endpoint. In our case, the public key is represented as a JSON Web Key.
The JSON Web Key (JWK) is a JSON object that contains a well-known public key which can be used to validate the signature of a signed JWT.
If the issuer of your JWT used an asymmetric key to sign the JWT, it will likely host a file called a JSON Web Key Set (JWKS). The JWKS is a JSON object that contains the property keys, which in turn holds an array of JWK objects.
https://--YOUR DOMAIN----/.well-known/jwks.json
Sample Response:
{
  keys: [
    {
      alg: 'RS256',
      kty: 'RSA',
      use: 'sig',
      n: 'tTMpnrc4dYlD8MtmPnW3xZNbLxkaGCUwTqeKB4dfLg11dEpMyQEc4JRxUvRzp9tz00r6lkZ1ixcvIiuB_eMVckU8VyFSFWBSAxp5duBk6lRpYk-QjK3kEdPxYLxyW84gNzwMi-XW8zxJbsOa-cRM9sCb62Qz2yfWoQfimoFXsCnVHq496kizO7gZ972JefvTce1_n9dd_1p0K6c14qcCXtF6hbA_gQ0N7h3IyloBqiusKyTsV-ZrMZDldZkI-4v7s49TdcRZgEOvSapMz5YyoDvAWzuWGEiljkjkCOo0Mr5Sioi2x0dBm6nJ2WVYfZrwEF5J',
      e: 'AQAB',
      kid: 'NTY2MjBCNzQ1RTLPQzk3NzczRRTMQ0E4NzE2MjcwOUFCRkUwRTUxNA',
      x5t: 'NTY2MjBCNzQ1RTJPLzk3NzczRUNPO0E4NzE2MjcwOUFCRkUwRTUxNA',
      x5c: [Array]
    }
  ]
}
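When a JWKS holds several keys, a verifier usually picks the one whose kid matches the kid in the token header. The sketch below shows that lookup on a dict shaped like the sample response, and also decodes the RSA public exponent e, which is a base64url-encoded big-endian integer. The helper names and the shortened sample values are assumptions made for illustration.

```python
import base64


def find_jwk(jwks: dict, kid: str):
    """Return the JWK whose 'kid' matches, or None (hypothetical helper)."""
    for jwk in jwks.get("keys", []):
        if jwk.get("kid") == kid:
            return jwk
    return None


def b64url_to_int(value: str) -> int:
    """Decode a base64url-encoded big-endian unsigned integer (JWK 'n' or 'e')."""
    raw = base64.urlsafe_b64decode(value + "=" * (-len(value) % 4))
    return int.from_bytes(raw, "big")


# Shortened, made-up JWKS in the same shape as the sample response.
jwks = {
    "keys": [
        {"kty": "RSA", "alg": "RS256", "use": "sig", "kid": "key-1", "e": "AQAB"},
        {"kty": "RSA", "alg": "RS256", "use": "sig", "kid": "key-2", "e": "AQAB"},
    ]
}

jwk = find_jwk(jwks, "key-2")
print(jwk["kid"])               # key-2
print(b64url_to_int(jwk["e"]))  # 65537
```

Libraries that accept the whole key set, as in the python-jose example that follows, perform a similar lookup internally, so this step is often not needed in application code.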
Now let’s write Python code to decode a JWT using python-jose.
import httpx
from jose import jwt  # python-jose


def decode_access_token(authorisation_token):
    # get the public keys from the JWKS URI
    response = httpx.get(url="your open id wellknown url to fetch public key")
    # the JWKS (set of keys) is passed as-is to jwt.decode() for signature verification
    key = response.json()
    # read the algorithm from the token's unverified header;
    # in production, prefer pinning the expected algorithm(s) instead of trusting the token
    algorithm = jwt.get_unverified_header(authorisation_token).get('alg')
    user_info = jwt.decode(token=authorisation_token,
                           key=key,
                           algorithms=[algorithm])
    return user_info
I have used python-jose here just to show that there is no significant difference between these libraries. The code really doesn’t look different from PyJWT. Does it?
Summary
Please note that anyone can decode the information contained in a JWT without knowing the private keys. For this reason, you should never put secret information like passwords or cryptographic keys in a JWT.
python-jose is a separate library rather than a wrapper around PyJWT, but it was originally based on PyJWT and its JWT API is very similar.
Note: The reason I have mentioned both libraries is that sometimes a build pipeline such as GitLab or Jenkins complains (for no obvious reason) about different or incompatible versions of cryptography alongside PyJWT. In such scenarios, switching to python-jose can be a quick solution that needs almost no code changes. I personally faced this issue and wanted to share it here.
Parsing the JWT payload with base64
Decoding the JWT payload directly with base64 raises an error: Incorrect padding
import base64
s = 'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJuYW1lIjoiRGluZXNoIn0.0Cbf0JnPzCkPepFOGijYoCv807h4zgKYMk4GjAlXKBI'
sArr = s.split('.')
print(sArr)
payload = sArr[1]
base64.b64decode(payload)
base64.urlsafe_b64decode(payload)
Note: base64.b64decode and base64.urlsafe_b64decode raise the same error.
Error:
Traceback (most recent call last):
File "/data/Project/GitLab/LPython3/T/zoro_packages_dep/tmp.py", line 9, in <module>
base64.b64decode(payload)
File "/usr/lib/python3.8/base64.py", line 87, in b64decode
return binascii.a2b_base64(s)
binascii.Error: Incorrect padding
Correct approach:
import base64

s = 'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJuYW1lIjoiRGluZXNoIn0.0Cbf0JnPzCkPepFOGijYoCv807h4zgKYMk4GjAlXKBI'
sArr = s.split('.')
payload = sArr[1]
# base64.b64decode(payload)          # raises: Incorrect padding
# base64.urlsafe_b64decode(payload)  # raises: Incorrect padding
# restore the stripped '=' padding so the length is a multiple of 4
r = base64.urlsafe_b64decode(payload + '=' * (-len(payload) % 4))
print(r)
The cause of the error is explained below.
Incorrect padding error when decoding base64 in Python
References:
https://www.cnblogs.com/wswang/p/7717997.html
https://stackoverflow.com/questions/3302946/how-to-decode-base64-url-in-python
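The base64 encoding process
Base64 selects 64 characters (lowercase letters a-z, uppercase letters A-Z, digits 0-9, and the symbols "+" and "/", plus "=" as the padding character, which makes 65 characters in practice) as a basic character set. All other data is then converted into characters from this set.
Specifically, the conversion can be divided into four steps.
- Step 1: take every three bytes as a group, 24 bits in total.
- Step 2: split these 24 bits into four groups of 6 bits each.
- Step 3: prepend two 0 bits to each group, expanding it to 32 bits, i.e. four bytes.
- Step 4: using the table below, look up the symbol for each expanded byte; the result is the Base64-encoded value.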
0 A 17 R 34 i 51 z
1 B 18 S 35 j 52 0
2 C 19 T 36 k 53 1
3 D 20 U 37 l 54 2
4 E 21 V 38 m 55 3
5 F 22 W 39 n 56 4
6 G 23 X 40 o 57 5
7 H 24 Y 41 p 58 6
8 I 25 Z 42 q 59 7
9 J 26 a 43 r 60 8
10 K 27 b 44 s 61 9
11 L 28 c 45 t 62 +
12 M 29 d 46 u 63 /
13 N 30 e 47 v
14 O 31 f 48 w
15 P 32 g 49 x
16 Q 33 h 50 y
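The four steps can be traced in code. The sketch below encodes exactly three bytes by hand and compares the result with base64.b64encode. The name encode_three_bytes is hypothetical, and the function deliberately handles only the case where no padding is needed.

```python
import base64
import string

# The 64-character alphabet from the table: A-Z, a-z, 0-9, '+', '/'
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_three_bytes(chunk: bytes) -> str:
    """Base64-encode exactly 3 bytes by following the four steps (illustrative)."""
    if len(chunk) != 3:
        raise ValueError("this sketch only handles a full 3-byte group")
    # Step 1: treat the three bytes as one 24-bit number.
    bits = int.from_bytes(chunk, "big")
    # Step 2: split the 24 bits into four 6-bit groups.
    groups = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    # Step 3: each 6-bit value is a byte whose two leading bits are 0.
    # Step 4: look up each value in the table.
    return "".join(ALPHABET[g] for g in groups)


print(encode_three_bytes(b"Man"))         # TWFu
print(base64.b64encode(b"Man").decode())  # TWFu
```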
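Because Base64 converts three bytes into four, the encoded text is about one third larger than the original.
What if the length of the binary data is not a multiple of 3, so that 1 or 2 bytes are left over at the end? Base64 pads the end with \x00 bytes, then appends 1 or 2 = signs to the encoded output to indicate how many bytes were added; they are removed automatically when decoding.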
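The padding rule is easy to observe with inputs of one, two and three bytes. This is a small illustration using the standard library; the sample inputs are arbitrary.

```python
import base64

samples = (b"a", b"ab", b"abc")
encoded = {data: base64.b64encode(data) for data in samples}

for data, enc in encoded.items():
    # 1 leftover byte -> '==', 2 leftover bytes -> '=', none -> no padding
    print(data, "->", enc, "->", base64.b64decode(enc))

# b'a' -> b'YQ==' -> b'a'
# b'ab' -> b'YWI=' -> b'ab'
# b'abc' -> b'YWJj' -> b'abc'
```

With padding in place, the encoded length is always a multiple of 4, which is the property the decoder checks when it reports Incorrect padding.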
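The Incorrect padding error
Cause 1: the trailing equals signs may have been stripped
The equals signs may have been removed after encoding. They can be added back manually, as follows: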
import base64


def decode_base64(data):
    # data is expected to be bytes; re-add the '=' padding that was stripped
    missing_padding = len(data) % 4
    if missing_padding != 0:
        data += b'=' * (4 - missing_padding)
    return base64.b64decode(data)
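On its own this is not a reliable way to parse a JWT payload: the function expects bytes (passing a str raises a TypeError), and it decodes with the standard alphabet, whereas JWT segments use the URL-safe alphabet described under cause 2.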
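Cause 2: "URL-safe" base64 encoding
Standard Base64 output may contain the characters + and /, which cannot be used directly as URL parameters. For that reason there is also a "URL-safe" base64 encoding, which simply replaces + and / with - and _ respectively.
Decode it with base64.urlsafe_b64decode: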
base64.urlsafe_b64decode(base64_url)
Applied directly to the JWT payload, this raises the same Incorrect padding error, because the padding is still missing.
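The difference between the two alphabets shows up only for inputs whose encoding uses index 62 or 63. The sketch below uses two bytes chosen so that both characters appear; the sample value is an assumption picked for illustration.

```python
import base64

data = b"\xfb\xff"  # chosen so the standard encoding contains both '+' and '/'

standard = base64.b64encode(data)
url_safe = base64.urlsafe_b64encode(data)

print(standard)  # b'+/8='
print(url_safe)  # b'-_8='

# Translating the two characters turns one form into the other.
print(url_safe.replace(b"-", b"+").replace(b"_", b"/") == standard)  # True
print(base64.urlsafe_b64decode(url_safe) == data)                    # True
```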
Why decoding the JWT payload fails
- It uses "URL-safe" base64 encoding
- The trailing equals signs have been stripped
Test code:
import base64

s = 'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJuYW1lIjoiRGluZXNoIn0.0Cbf0JnPzCkPepFOGijYoCv807h4zgKYMk4GjAlXKBI'
sArr = s.split('.')
payload = sArr[1]
print(payload)
# restore the stripped '=' padding (adds nothing if the length is already a multiple of 4)
payload = payload + '=' * (-len(payload) % 4)
print(payload)
r = base64.urlsafe_b64decode(payload)
print(r)
Output:
eyJuYW1lIjoiRGluZXNoIn0
eyJuYW1lIjoiRGluZXNoIn0=
b'{"name":"Dinesh"}'
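Putting both causes together, a general helper can restore the padding and use the URL-safe alphabet for a segment of any length. The sketch below pairs such a decoder with the matching encoder and round-trips inputs of several lengths. Both function names are hypothetical. The padding is computed as -len(segment) % 4, which adds nothing when the length is already a multiple of 4.

```python
import base64


def b64url_encode_nopad(data: bytes) -> str:
    """Encode as base64url and strip the '=' padding, as JWT segments do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode_nopad(segment: str) -> bytes:
    """Restore the stripped padding, then decode with the URL-safe alphabet."""
    padding = "=" * (-len(segment) % 4)  # 0, 1 or 2 characters for valid input
    return base64.urlsafe_b64decode(segment + padding)


sample = b"\xfb\xff\xfe\xfd\xfc\xfa"
for n in range(len(sample) + 1):
    original = sample[:n]
    segment = b64url_encode_nopad(original)
    # every length from 0 to 6 bytes survives the round trip
    assert b64url_decode_nopad(segment) == original

print(b64url_decode_nopad("eyJuYW1lIjoiRGluZXNoIn0"))  # b'{"name":"Dinesh"}'
```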