Payatu Bandits played the HackTheBox Business CTF 2023 and secured 1st rank in India, but overall, we secured 31st by the end of the tournament. It was an exciting and excellent learning experience as all the team members collaborated and brainstormed on the challenges. Here, I am presenting the write-up of two of the challenges of easy level from the cryptography category. The first challenge was about breaking the Advanced Encryption System (AES-CTR mode) symmetric cryptography, where the encryption key was reused, and the second challenge was related to code analysis and Merkle Tree (related to Blockchain). The first challenge was solved by my colleague Modassir, and I solved the second challenge.
Challenge name: Initialization
Description: During a cyber security audit of your government’s infrastructure, you discover log entries showing traffic directed towards an IP address within the enemy territory of “Oumara”. This alarming revelation triggers suspicion of a mole within Lusons’ government. Determined to unveil the truth, you analyze the encryption scheme with the goal of breaking it and decrypting suspicious communication. Your objective is to extract vital information and gather intelligence, ultimately protecting your nation from potential threats.
This challenge has been solved by my bandit Modassir.
Initially, we are given source.py file containing the encryption logic, message.txt containing the original message and the output.txt containing the encrypted message as follows.
As evident from the source.py, this Python code will encrypt the message inside the messages.txt file using AES-CTR and then save the encrypted message in output.txt.
In the program encrypt.py, the key was generated during the creation of object ‘AE’ according to the number of strings present in the message.txt file. Identical keys were generated and used for the encryption of all the messages present in the message.txt file.
The identical keys were generated due to how the code was written in the generate_encryption_keys function.
keys = [[b'\x00']*16] * len(MSG) was creating a list – keys with inner lists (per message length of message) of 16 bytes each. Due to the ‘*’ operator, the inner lists created were only the references of a single list. So whenever one of the bytes of an inner list was changed in the ‘for loop’, values for all the lists at that byte got modified. This resulted in the final list byte values being replicated to all the lists; hence, all the keys list generated were the same.
Also, it was observed that the counter values generated in the loop were not unique and ‘nonce’ was also not used. Therefore, the final key used for the encryption of all the strings would be identical. As the code uses AES-CTR, we know that XOR of plaintext and key was used to generate ciphertext from Block cipher mode of operation – Wikipedia. We can rearrange the operands in the XOR function to generate the key and plaintext. The flag we needed to obtain was one of the plaintext strings, so this notation could be used:
flag = (ciphertext1 ⊕ ciphertext2) ⊕ known_plaintext
We can convert the following logic to a quick Python script that gives us the flag.
Running this solve.py script gives us the flag.
Challenge name: I’m gRoot
Description: After decrypting the communication, you uncover the identity of the mole as the senior blockchain developer. Shockingly, the developer had embedded a backdoor in the government’s decentralized blockchain network, originally designed to prevent corruption. You report this critical finding to the government council and are assigned the task of detecting and fixing the backdoor, ensuring the integrity and security of the network.
We were given a ZIP file that contained two Python files. We were also given a link to start a docker. It was expected of us to connect to the given IP and port and solve the challenge. The two files: server.py and utils.py.
To run the program locally, we can comment out the import FLAG statement and print “FLAG” instead of the value of FLAG. So, the goal here would be to get ‘FLAG’ in the output.
As we can see from the code, a block consists of transactions. A transaction consists of signatures. When initializing a BlockChain object, 3 blocks are created, and each block is added with 8 transactions.
We can also see that during the initialization of a BlockChain object, after getting all 8 transactions, mine function is called, and a Merkle tree is created. All the transactions of the block are appended to the Merkle tree as its nodes. The Merkle tree get_state function gives the hash of the top/root node. The code for the BlockChain class contains a list variable _mined_blocks. The ‘mine’ function appends three parameters to the BlockChain object – number, transactions list and the root node hash.
Merkle tree – https://en.wikipedia.org/wiki/Merkle_tree
Image reference: https://en.wikipedia.org/wiki/Merkle_tree#/media/File:Hash_Tree.svg
In the Merkle tree, the hash is calculated for each of the leaf nodes (base level). Then at the next upper level, the hash values of leaf nodes are added in pairs of 2. This reduces the number of nodes from n to n/2 in the next level. After the addition of the hashes, the result is stored in the node as a hash value. Similarly, the process goes on for higher-level nodes till the root node is reached. This way, the root/top node contains the hash value that is made from the hash values of all data added to the Merkle tree. Observe that changing the order of leaf nodes in the Merkle tree will change the hash of the root/top node.
On giving 1 as input,
We can see three mined blocks in the output, and each block has a list of 8 transactions along with the hash of the top node.
In the code, the print_mined_blocks function is called, which is present in the ‘utils.py’ file, with an argument as mined_blocks. ‘mined_blocks’ variable calls the mined_blocks function of BlockChain class and returns the list of blocks stored in _mined_blocks variable during initialization.
The ‘print_mined_blocks’ function prints the block number, hash and hex value of signatures (extracted using the ‘extract_signatures’ function) for each block that was added in the ‘mine’ function.
On giving 2 as input,
We are asked for signatures to forge the last block that was printed in the first option.
By looking at the code, we can see that signatures are extracted from the last created transaction and evaluated against the signature list input from the user.
The code for evaluation is present in the ‘utils.py’ file. For evaluating the signatures, the code is splitting the user input signatures with comma characters, so we need to provide the input by separating our forged signatures with commas. The code uses the unhexify function on our input which changes the hex value to byte value, so our input should be already in hex form. If our input signatures list is the same as the signatures of the last created block, the program will exit after giving the message ‘Signatures are the same’. And if the signatures list is not identical, the function simply returns the user input signature list.
Note: The code does not check for the number of signatures in the user input list.
In the ‘server.py’ file, the _signatures variable gets the list of signatures in user input from the evaluate function. The code then calls the ‘valid’ function of the BlockChain class with the user input signature list as an argument.
In the ‘valid’ function, a Merkle tree is created with a node containing values as user input signatures. Finally, the code checks if the root hash (from the ‘get_state’ function) value for the newly created Merkle tree is the same as the root hash for the previously created Merkle tree during initialization.
Hence, we aim to provide an input as a list of signatures (separated by a comma) whose root hash created in the Merkle tree will be the same as the root hash of originally created Merkle tree.
We can read about the Second Preimage attack in this Wikipedia article – https://en.wikipedia.org/wiki/Merkle_tree. It says that the root/top hash of the Merkle tree is not concerned with the depth of the tree. The root/top hash will be created from the second-level node hash values. We also know that during the creation of the Merkle Tree, hash values of the data inserted are calculated and used for creating the higher-level nodes.
So, we can calculate the non-hash values of second-level nodes of the original Merkle tree and input that in list form as forged signatures. The program does not check the length of the input list. On calculating the hash of both second-level node values, adding them, and again calculating the hash, we will get the root hash, resulting in it being solved.
We can write a code such as this to calculate the hash value at each node of the original Merkle tree.
I have used the same signature list as the last block created in the 1st option.
As we can see from the Merkle tree diagram, the number of nodes reduces to half as we move up. Here, leaf nodes were 8 (containing signatures). The second-last level will have 4 nodes. We calculate the hash using the SHA256 algorithm using a reference from https://www.geeksforgeeks.org/introduction-to-merkle-tree.
The hash is calculated from bytes value, so we convert the hash to bytes value using
bytes.fromhex(). To get the SHA256 value, we use the function
Now ‘a’, ‘b’, ‘c’ and ‘d’ are already hash values, so we do not need to calculate the individual hashes before adding them. Also, ‘e’ and ‘f’ are hash values which can be further computed to get the root hash that would match the root hash present in the 1st option output. For the user input, we do not need hash values as the Merkle tree implementation itself performs a hashing function on its node values.
The two values in the output are not hashed, which we would use as the input. Also, confirm that the value of ‘g’, which is our root hash value matches the root hash value of the third block in the first option output.
We can observe that we get the output as ‘FLAG’ which is what we changed the code to. In the original challenge, I used ‘netcat’ to connect to the given IP and port number, and in place of the string ‘FLAG’, I received the original flag.
We really enjoyed the CTF. It was a good mix of challenges from all the different categories. The crypto category was nicely made. It required some research on the cryptographic algorithms in use, understanding the source code in detail and the ability to write the script.