The Mossad 2019 Challenge - Part 3
The third challenge begins.
We are given two files: an encrypted file of some sort, and the Windows executable used to encrypt it. The task is simple: reverse engineer the executable and figure out how to decrypt the file.
First off, I’m going to encrypt my own file, which simply contains “hey”, to see how the output compares to the input.
The program output a 3004-byte file. The format isn’t recognizable, so let’s go ahead and start reversing.
The main function is easily identifiable due to the usage text we saw earlier. Here’s the pseudocode from the Ghidra decompiler, with some variable names I assigned:
The main function doesn’t do too much. Most of the interesting code happens in the encrypt_file function, which we’ll go through.
Analysis of the encryption function
The function receives two parameters: the first is the path to the input file, the second is a reference to a variable that stores the buffer size.
The function begins with calling another function, which we’ll call
This function gets the length of the input file path and allocates a buffer of that length plus 6 bytes. So if our file path was aaa.txt (7 bytes), the allocated buffer would be 13 bytes.
Then there’s a call to another function, which calls GetAdaptersInfo and returns the MAC address of the first adapter.
In the end, you have a buffer of file path length + 6 bytes that holds the input path, with the additional 6 bytes being the MAC address.
Let’s look at the pseudo code after this:
As seen in the code, the program creates an MD5 hash of the previously created buffer of filename + MAC address. MD5 hashes are 16 bytes long, so why does the program allocate a 32-byte buffer? The program actually splits each byte of the MD5 hash into two nibbles (4 bits each) and stores each nibble in its own byte of the buffer. This can be seen where the program shifts right by 4 bits, and where it does a bitwise AND with 0xf.
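The nibble split is easy to mirror in Python. The sketch below is a minimal illustration rather than the binary's code: the function names are mine, the MAC value is made up, and storing the high nibble before the low one is an assumption about the output order.

```python
import hashlib


def build_hash_input(path: str, mac: bytes) -> bytes:
    """Input path followed by the 6-byte MAC address (path length + 6 bytes)."""
    return path.encode("ascii") + mac


def md5_to_nibbles(data: bytes) -> bytes:
    """Hash the data with MD5 and store every 4-bit nibble in its own byte."""
    digest = hashlib.md5(data).digest()
    out = bytearray()
    for byte in digest:
        out.append(byte >> 4)    # high nibble first (assumed order)
        out.append(byte & 0x0F)  # then the low nibble
    return bytes(out)


def nibbles_to_md5(nibbles: bytes) -> bytes:
    """Inverse operation: glue each pair of nibbles back into one byte."""
    return bytes(
        (nibbles[i] << 4) | nibbles[i + 1] for i in range(0, len(nibbles), 2)
    )


# Placeholder MAC address, for illustration only.
example_mac = bytes.fromhex("001122334455")
```

The inverse helper is what matters for decryption: it turns the 32 stored bytes back into a regular 16-byte MD5 digest that can be compared against candidate inputs.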
This is everything for the MD5 hash creation. Let’s move forward.
After the MD5 calculation, there’s a call to a function that does some sort of AES encryption.
The function starts by creating a crypto handle using CryptAcquireContextA. The provider type used is PROV_RSA_AES, which hints at the program using AES encryption.
It then creates an MD5 hash handle, which will be fed data and then passed to CryptDeriveKey. CryptDeriveKey is used to derive a new encryption key from a hash object.
So which data is being given to the MD5 hash? This is important, since if we can feed the hash the same data, we’ll derive the same AES key.
The function allocates a 14-byte buffer, which consists of 3 things:
- 6 bytes MAC address of first adapter (similarly to createMD5Hash)
- 4 bytes from the BIOS serial number.
- 4 bytes from the disk serial number.
The function gets the BIOS and disk serial numbers by running wmic bios get serialnumber and wmic diskdrive get serialnumber in cmd.exe, piping the output to a file (command_result.txt) and reading 4 bytes from the second line of the file.
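As a rough illustration of that read, the sketch below pulls the first 4 bytes of the second line out of wmic-style output. The sample text is invented and the helper name is hypothetical; the real file's encoding and line endings may differ from what is assumed here.

```python
def first4_of_second_line(raw: bytes) -> bytes:
    """Return the first 4 bytes of the second line (the line after the header)."""
    lines = raw.splitlines()
    return lines[1][:4]


# Invented sample output: a header line, then the serial number line.
sample = b"SerialNumber  \r\nVMware-56 4d 00 00\r\n"
```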
So, the 14-byte buffer is fed to the hash handle, a key is derived from it, and the key is then used to encrypt the input file in blocks of 16 bytes. The AES encryption function also receives a second parameter in which the total encrypted buffer size is stored. This number will be divisible by 16, as 16 is the AES block size.
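To reproduce the key outside of Windows, the derivation that Microsoft documents for CryptDeriveKey (AES key, non-SHA-2 hash) can be sketched with hashlib. This is a sketch under assumptions: the order of the three fields inside the 14-byte buffer follows the order they are listed in, the 16-byte (AES-128) key length is a guess because the key size isn't stated here, and the helper names are mine.

```python
import hashlib


def build_key_material(mac: bytes, bios4: bytes, disk4: bytes) -> bytes:
    """14-byte buffer: 6-byte MAC + 4 bytes BIOS S/N + 4 bytes disk S/N (assumed order)."""
    assert len(mac) == 6 and len(bios4) == 4 and len(disk4) == 4
    return mac + bios4 + disk4


def derive_aes_key(material: bytes, key_len: int = 16) -> bytes:
    """CryptDeriveKey-style stretch of an MD5 hash into an AES key.

    key_len=16 is an assumption; pass 24 or 32 if the binary uses a larger key.
    """
    base = hashlib.md5(material).digest()
    padded = base + b"\x00" * (64 - len(base))
    # 64 bytes of 0x36 / 0x5C, with the first 16 bytes XORed with the hash.
    inner = bytes(b ^ 0x36 for b in padded)
    outer = bytes(b ^ 0x5C for b in padded)
    stretched = hashlib.md5(inner).digest() + hashlib.md5(outer).digest()
    return stretched[:key_len]
```

With the key bytes in hand, any AES implementation can be used for the actual decryption, as long as the mode and padding match what the binary uses.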
The Mersenne Twister and file format
After the AES encryption, the program again gets 4 bytes from the BIOS serial number and fills an integer array of size 625. The first 4 bytes in that array are the BIOS S/N, each following 4 bytes are the previous 4 bytes multiplied by 6069, and the last 4 bytes serve as a counter. This array plays a role when writing to the file: it’s the internal state of a Mersenne Twister (MT19937), a pseudo-random number generator. I only realized this after solving the challenge, when discussing the solution with someone. In this case, the PRNG seed is the 4 bytes from the BIOS S/N.
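The state initialization described here can be sketched as follows. The multiplier and array size come from the description above; the initial value of the counter slot (624) and the little-endian reading of the 4 seed bytes are assumptions, and the seed bytes are placeholders.

```python
import struct

STATE_WORDS = 624   # 625 integers in total: 624 state words + 1 counter
MULTIPLIER = 6069   # multiplier as described above


def seed_state(seed: int) -> list:
    """Fill the 625-entry array: seed, then repeated multiplication, then counter."""
    state = [seed & 0xFFFFFFFF]
    for _ in range(STATE_WORDS - 1):
        state.append((state[-1] * MULTIPLIER) & 0xFFFFFFFF)
    state.append(STATE_WORDS)  # counter slot; initial value is an assumption
    return state


# Placeholder 4 seed bytes, read as a little-endian integer (assumed byte order).
example_seed = struct.unpack("<I", b"ABCD")[0]
example_state = seed_state(example_seed)
```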
So the beginning of the file is being built this way:
- A 4 byte magic is added (0x531B008A) in the beginning.
- The 32 byte MD5 result from createMD5Hash is added.
Then, the AES encrypted data is written in a very specific way. First the program calculates how many bytes of the AES encrypted buffer to write in each iteration, and after how many iterations it should drop that number by 1.
bytes_to_write = encrypted_buffer_size / 739 + 1
iterations_until_dropping_write_size = encrypted_buffer_size - (739 * (encrypted_buffer_size / 739))
The loop operates this way:
- If the current number of iterations is equal to iterations_until_dropping_write_size, subtract 1 from bytes_to_write.
- Write 4 pseudo-random bytes from the Mersenne Twister.
- Write bytes_to_write bytes from the AES encrypted buffer.
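The loop above can be modelled in a few lines, which also gives us the inverse needed later to pull the AES data back out. This is a sketch, not the binary's code: the loop count of 739 is inferred from the divisor in the formulas, and the function names and the `next_random_bytes` callable are hypothetical stand-ins.

```python
ITERATIONS = 739  # inferred from the divisor in the formulas (assumption)


def chunk_sizes(encrypted_size: int) -> list:
    """Number of AES bytes written in each loop iteration."""
    bytes_to_write, drop_at = divmod(encrypted_size, ITERATIONS)
    bytes_to_write += 1
    sizes = []
    for i in range(ITERATIONS):
        if i == drop_at:
            bytes_to_write -= 1
        sizes.append(bytes_to_write)
    return sizes


def interleave(encrypted: bytes, next_random_bytes) -> bytes:
    """Emit 4 PRNG bytes, then a chunk of AES data, for every iteration."""
    out = bytearray()
    pos = 0
    for size in chunk_sizes(len(encrypted)):
        out += next_random_bytes()  # 4 bytes from the PRNG stand-in
        out += encrypted[pos:pos + size]
        pos += size
    return bytes(out)


def deinterleave(body: bytes, encrypted_size: int) -> bytes:
    """Inverse: skip the 4 PRNG bytes of each iteration, keep the AES chunks."""
    out = bytearray()
    pos = 0
    for size in chunk_sizes(encrypted_size):
        pos += 4
        out += body[pos:pos + size]
        pos += size
    return bytes(out)
```

Note that the chunk sizes always add up to the encrypted buffer size, so the whole AES output ends up in the file, spread between the pseudo-random bytes.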
After this, the whole buffer goes through an XOR operation before being written to the file. The buffer is XORed against the 4 bytes from the disk serial number, retrieved the same way as before.
To decrypt the file, we will complete the following steps with the encrypted file:
- XOR the first 4 bytes with the file magic (0x531B008A); the result is the disk serial number.
- XOR the rest of the file with the disk serial number.
- Rebuild the MD5 and try to find the MAC address. (The filename part is known: intel.txt)
- Having the disk serial number, MAC address, and BIOS S/N, we can derive the key, reconstruct the encrypted data and decrypt it.
Solving the XOR
In the following Python script, we XOR the first 4 bytes with the magic to get the disk serial number back.
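A minimal sketch of this step is shown here. It assumes the magic is stored as a little-endian 32-bit integer and that the 4-byte key simply repeats across the whole file; both are assumptions, and the function names are mine.

```python
import struct

MAGIC = 0x531B008A


def recover_xor_key(data: bytes, byteorder: str = "<") -> bytes:
    """XOR the first 4 file bytes with the magic to get the 4 key bytes."""
    magic_bytes = struct.pack(byteorder + "I", MAGIC)
    return bytes(a ^ b for a, b in zip(data[:4], magic_bytes))


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR the data with the repeating key (the operation is its own inverse)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))
```

Usage would be along the lines of `key = recover_xor_key(blob)` followed by `plain = xor_with_key(blob, key)`, where `blob` is the content of the encrypted file.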
So now we know the disk serial number, and we have the original file before XOR at
Solving the MAC address
Now it’s important to use a hint that we were given at the beginning of the challenge. The challenge text mentioned the manufacturer was “Or… Po… Ltd.”, so we’ll look for this specific MAC vendor and its MAC address range.
I downloaded the MAC vendor list from GitHub and ran a regex search in Sublime Text: Or.+.+.+ P. I found this MAC vendor with the 00:13:37 address prefix:
001337 Orient Power Home Network Ltd.
So now we know the first 3 bytes of our MAC, and it’s a matter of a few seconds to brute-force the remaining 3 bytes (0x1000000 MD5 combinations). I wrote a Python script to reconstruct the MD5 from the nibbles and brute-force the MD5 hash to find the rest of the MAC:
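A minimal sketch of such a brute-force follows. It assumes the hashed buffer is exactly the file name followed by the MAC, that the nibbles are stored high nibble first, and the function names are hypothetical; the `limit` parameter only exists so a smaller range can be tried first.

```python
import hashlib


def nibbles_to_digest(nibbles: bytes) -> bytes:
    """Rebuild the 16-byte MD5 digest from the 32 stored nibbles."""
    return bytes(
        (nibbles[i] << 4) | nibbles[i + 1] for i in range(0, len(nibbles), 2)
    )


def find_mac_suffix(target_digest: bytes, filename: bytes, oui: bytes,
                    limit: int = 1 << 24):
    """Try every 3-byte suffix until MD5(filename + oui + suffix) matches."""
    base = filename + oui
    for n in range(limit):
        suffix = n.to_bytes(3, "big")
        if hashlib.md5(base + suffix).digest() == target_digest:
            return suffix
    return None


OUI = bytes.fromhex("001337")  # vendor prefix found above
```

Calling `find_mac_suffix(nibbles_to_digest(stored_nibbles), b"intel.txt", OUI)` with the 32 nibble bytes taken from the de-XORed file would return the missing 3 bytes, or `None` if one of the assumptions is wrong.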
And we got our 3 bytes after a few seconds, the MAC is
As for the BIOS serial number, I was working in a VMware Windows machine, and my own encrypted files matched the bytes where I guessed the BIOS serial number was supposed to be retrieved from. So I didn’t need to write any scripts; I knew the 4 bytes were “VMwa”. In theory, to extract the BIOS serial number, you would need to brute-force the Mersenne Twister PRNG with different seeds until you find the one you need. That seed is the 4 bytes from the BIOS S/N. Please note that the values used for the Mersenne Twister “tempering” operations are different from the usual implementations you would find online, so you should match the bitmasks to the ones used in the binary. In this case, b = 0x13a58ad and c = 0x1df8c. Please refer to the Algorithmic Detail section to see how these are used.
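For reference, here is how the two masks slot into the standard MT19937 tempering step. Only b and c are taken from the binary; the shift amounts (11, 7, 15, 18) are the standard MT19937 values and are assumed to be unchanged.

```python
def temper(y: int, b: int = 0x13A58AD, c: int = 0x1DF8C) -> int:
    """MT19937-style tempering with the bitmasks found in the binary."""
    y ^= y >> 11        # standard shift, assumed unchanged
    y ^= (y << 7) & b   # b replaces the usual 0x9D2C5680
    y ^= (y << 15) & c  # c replaces the usual 0xEFC60000
    y ^= y >> 18
    return y & 0xFFFFFFFF
```

A seed brute-force would run the generator for each candidate seed and compare the tempered outputs against the pseudo-random bytes found in the file.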
To solve this, I wrote a Python script that utilizes wincrypto, which provides WinCrypto bindings for Python. I followed the same calculations as in the binary. The script can be executed against the file after reversing the XOR operation with the previous script.
Let’s execute the script and look at the contents.