Weak RSA | Pwicz Blog

Background

This is one of the beginner’s HTB challenges. You can find it here.
It is a good challenge to make sure you understand the math behind RSA ;)

0. RSA and why it can be weak

RSA in a nutshell

RSA is a public-key cryptosystem, an asymmetric encryption, which means it uses two separate (but connected!) keys for encryption and decryption. The encryption key also known as the public key is as its name suggests - public. People use it to encrypt messages and send them to the owner of the key. The owner has the other part of the story: the decryption key which is private. It is used to decrypt messages and so it cannot be exposed to the public. In this challenge, we are given a public key from which we have to derive its other, private part. With a strong key it is computationally infeasible - you cannot break it. However, when the numbers behind the public key are weak - we can break it, and this is what we are going to do.

What numbers?

RSA strength or weakness that we are going to exploit in this challenge is based on the prime factorization problem. To put it simple, given a large number n, it is very very difficult to find prime numbers that multiplied together give n: $p*q = n$. However, when n is not large, it is doable and from these primes you can obtain the public key.

When creating a key pair, we choose two large prime numbers p and q. p and q give us the modulus n: $n = p * q$. We choose the public (encryption) component e and its inverse d modulo $\phi(n)$. In this case, since $n = p * q$ and p and q are prime numbers, $\phi(n) = (p - 1) * (q - 1)$. Check out the Euler Phi Function if you want to learn the math behind it. It is worth knowing why!

Having chosen the numbers, we end up with: p, q, n, e, and d. n and e will be given in the public key. Let’s say you have a message M. The encrypted message C (ciphertext) will be $C = M^e \bmod n$. The owner of the key will decrypt it using the private component d: $M = C^d \bmod n$. To get d from the e and n we have to find the p and q which multiplied together give n. Then, p and q can be used to obtain $\phi(n)$ and from there it is easy to calculate the inverse of e which is d. This is how we are going to break this encryption.

1. Extracting data from the public key

We know that they public key is composed of numbers n and e which we use to encrypt the message. However, after opening key.pub we realize that these numbers are encoded. Specifically, it is PEM-encoded (Privacy Enhanced Mail format).
To extract n and e, you can use many of the useful websites such as dcode.fr and their certificate reader:

or you can use a tool called openssl-asn1pars.

I wanted to try to extract the values with the asn1parse (ASN.1 being the language used to define data structures) and this is what I received when running it on the key.pub:

The important part here is the BIT STRING element which represents the actual public key data. To fruther extract it from the element, we use this command (numbers are stripped on the right):

The -strparse is used to here to indicate that we want to parse contents of the ASN.1 object starting at offset 19. The result are the hexadecimal representations of n and e.

1.1 Perhaps a cleaner way

When going through the new Security Engineer path on TryHackMe, I found this command that allows you to extract data about the key in an even easier and more readable way:

I find it funny that I came across it only now. You can read the private key in the same way, just omit the -pubin flag that specifies that a public key is read.

2. Prime factorization

So, what do we need to get the private key? Two prime numbers that make up n. Therefore, we need to factorize n. This operation is computationally infeasible for large numbers, but in our case n is relatively small. Let us use dcode.fr again and their Prime Numbers Decomposition tool.

It is pretty quick as in about 2 seconds we can see our two primes in the section of the website to the left

As you can imagine, this should take the attacker thousands of years, not 2 seconds…

3. Private key creation

Given p, q, and e, we will now find the inverse of e modulo $\phi(n)$ (where $\phi(n) = (p-1)*(q-1)$) which is d. This is enough to create the private key and decrypt the message. I created a simple Python script for that:


from Crypto.PublicKey import RSA
from Crypto.Cipher import PKCS1_OAEP

MODULUS = 0x03303B...
# PUB_COMP is `e`
PUB_COMP = 0x6117C...

P = 204234...
Q = 280647...

def decode():
    # get the inverse of the public component modulo phi of n
    PRIV_COMP = pow(PUB_COMP, -1, (P - 1) * (Q - 1))
    key = RSA.construct((MODULUS, PUB_COMP, PRIV_COMP))
    cipher = PKCS1_OAEP.new(key)

    # read the challenge file
    file = open('flag.enc', 'rb')
    content = file.read()
    file.close()

    answer = cipher.decrypt(content)
    print(answer)

We use the PyCryptodome to manually create the key from our components and use it to decrypt the cipher. It should work but… it does not! cipher.decrypt() throws a ValueError: Incorrect decryption exception. The possible reasons are two: the key components are wrong or the ciphertext is not structured properly.

4. RSA key padding

After ensuring that my numbers are correct (with assertions as demonstrated in the updated snippet), I looked up what PKCS1_OAEP actually is. It is an asymmetric cipher based on RSA with the OAEP padding. Why do we need padding? To encrypt the message, it gets divided into blocks and very often the last block will not be of ideal size - to make it of ideal size we append some padding to it. To keep it short, there are two paddings: OAEP which I assumed is used in this ciphertext and PKCS#v1.5. You can read more about it in this article.

So, I tried using another (legacy) module that decrypts the cipher text with the other padding. This is the updated code:


from Crypto.PublicKey import RSA
# note that this import has changed!
from Crypto.Cipher import PKCS1_v1_5

MODULUS = 0x03303B...
# PUB_COMP is `e`
PUB_COMP = 0x6117C...

P = 204234...
Q = 280647...

def decode():
    # check correctness of the numbers
    pub = RSA.importKey(open("key.pub").read())
    assert(pub.n == MODULUS)
    assert(pub.e == PUB_COMP)
    assert(P*Q == pub.n)

    # get the inverse of the public component modulo phi of n
    PRIV_COMP = pow(PUB_COMP, -1, (P - 1) * (Q - 1))
    key = RSA.construct((MODULUS, PUB_COMP, PRIV_COMP))
    cipher = PKCS1_v1_5.new(key)

    # read the challenge file
    file = open('flag.enc', 'rb')
    content = file.read()
    file.close()

    answer = cipher.decrypt(content, 'argh')
    print(answer)

5. It worked!

It was enough to decrypt the message. I enjoyed this challenge as this was my first time manually breaking encryption as opposed to just using some tools. Even though it is simple, it is a good way to put theory into practice.