CITS3007 lab 10 (week 11) – Cryptography

1. Cryptography libraries

We will investigate how to perform basic encryption tasks using a cryptography library called Sodium, which is written in C. It is well-documented (you can find the documentation here), well-tested, highly portable, and used by many other projects. It allows us to perform tasks like encryption, decryption, signature checking, and password hashing.

Although Sodium is a C library, we will use it from the Python language, as that requires much less boilerplate code. In the CITS3007 SDE, we need to install the Python library PyNaCl, which “wraps” the C Sodium library, and provides a “Pythonic” interface to it (the documentation for PyNaCl is available here).[1] Run the following commands in your development VM:

$ sudo apt-get update
$ sudo apt-get install python3-pip
$ pip install pynacl

This ensures we have the pip command available for managing Python libraries, then uses it to install PyNaCl. We’ll show how to use the PyNaCl library to create a public–private key pair (like those used by GitHub to allow repositories to be cloned or pushed without using a password). The lecture slides contain more information about public key cryptosystems like this, as does the PyNaCl documentation, here.

Exercise

Suppose Alice and Bob are both using a public-key cryptosystem, and both make their public keys available on the Web for anyone to access. Explain how they could use their keys so that Alice can securely send an encrypted message or file which can only be read by Bob.

1.1. Generating a key pair

In this section and the following ones, we will generate public–private key pairs, and use them to transfer encrypted content in exactly the way Alice and Bob could, in the previous exercise.

Save the following as keygen.py:

import nacl.utils
from nacl.public import PrivateKey
from nacl.encoding import HexEncoder

def write(name, hex_bytes, suffix):
    filename = 'key_' + name + suffix
    with open(filename, 'wb') as ofp:
        ofp.write(hex_bytes)

def make_keys(name):
    secretKey = PrivateKey.generate()
    write(name, secretKey.encode(encoder=HexEncoder), '.sk')
    publicKey = secretKey.public_key
    write(name, publicKey.encode(encoder=HexEncoder), '.pk')

key_name = input("Enter a name for the key pair to generate: ")

make_keys(key_name)

Run it by executing python3 keygen.py, and entering a name (this could be a particular purpose you’re generating the key pair for – for instance, secret-hushmoney-communications-with-my-accountant – or just your own name).

This will generate two files, key_[NAME].sk and key_[NAME].pk, which hold our private and public keys, respectively. If you inspect those files (e.g. by using less) you will see that they simply contain a long sequence of hexadecimal digits.
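To see why the files look the way they do, here is a minimal sketch of hex encoding using only the Python standard library. It assumes a 32-byte key (the size of the keys PyNaCl generates here) and uses random bytes as a stand-in, so the variable names are illustrative rather than part of the lab scripts.

```python
import os

# Stand-in for a 32-byte key; in keygen.py the bytes come from PyNaCl instead.
raw_key = os.urandom(32)

# Hex encoding turns every byte into two characters drawn from 0-9 and a-f.
hex_key = raw_key.hex()

print(len(raw_key))   # 32
print(len(hex_key))   # 64

# Decoding the hex text recovers exactly the original bytes.
print(bytes.fromhex(hex_key) == raw_key)   # True
```

So each key file should contain 64 hexadecimal characters, which is simply a text-friendly way of storing 32 bytes of binary data.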

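In detail, here’s how the code works: PrivateKey.generate() creates a new, randomly generated private key, and its public_key attribute gives the matching public key. Each key is converted to hexadecimal text using HexEncoder, and the write function saves it to a file named key_[NAME].sk (private key) or key_[NAME].pk (public key).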

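The secret key (in the “.sk” file) must be kept private. You use it, together with the other party’s public key, to encrypt messages you send and to decrypt messages sent to you. The public key (in the “.pk” file) can be published to others, who combine it with their own secret key to encrypt messages written to you, or to decrypt messages written by you. (Signing messages uses a different kind of key pair: in PyNaCl, signing keys are created with nacl.signing.SigningKey rather than nacl.public.PrivateKey.)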

1.2. Using the key pair to encrypt

If possible, get another person in the lab to generate a key pair, and exchange public keys. Alternatively, create a second key pair with a different name (e.g. “other”), and treat it as belonging to the “other person”.

Encrypt a message using the recipient’s public key and your private key. Save the following script as encrypt.py:

import nacl.utils
from nacl.public import PrivateKey, PublicKey, Box
from nacl.encoding import HexEncoder

class EncryptFile:
    def __init__(self, sender, receiver):
        self.sender = sender
        self.receiver = receiver
        self.sk = PrivateKey(self.get_key(sender, '.sk'), encoder=HexEncoder)
        self.pk = PublicKey(self.get_key(receiver, '.pk'), encoder=HexEncoder)

    def get_key(self, name, suffix):
        filename = 'key_' + name + suffix
        # 'with' closes the file even if reading fails
        with open(filename, 'rb') as file:
            return file.read()

    def encrypt(self, textfile, encfile):
        box = Box(self.sk, self.pk)
        with open(textfile, 'rb') as tfile:
            text = tfile.read()
        # encrypt() picks a random nonce and returns it with the ciphertext
        etext = box.encrypt(text)
        with open(encfile, 'wb') as efile:
            efile.write(etext)

sender = input("Enter the name for your key pair: ")
recip = input("Enter the name for the recipient's key pair: ")
encrypter = EncryptFile(sender, recip)
target_file = input("Enter a file to encrypt: ")
encrypter.encrypt(target_file, f'{target_file}.enc')
print('Done!')

Run it with the command python3 encrypt.py. You will need to provide the name of your key pair (from the previous exercise), the recipient’s key pair, and a file to encrypt (you can just choose the encrypt.py script if you have no other text file handy).

The script should create a binary file ORIG_FILE.enc (where ORIG_FILE is whatever the name of the original file was) – this is the encrypted file.

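In more detail, here is what the script does: it reads your hex-encoded private key and the recipient’s hex-encoded public key from their key files, and combines them into a Box. It then reads the target file as bytes and passes them to box.encrypt, which chooses a random nonce and returns the nonce followed by the authenticated ciphertext. Finally, it writes that result to the .enc file.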

1.3. Using the key pair to decrypt

Save the following as decrypt.py:

import nacl.utils
from nacl.public import PrivateKey, PublicKey, Box
from nacl.encoding import HexEncoder
import sys

class DecryptFile:
    def __init__(self, sender, receiver):
        self.sender = sender
        self.receiver = receiver
        self.sk = PrivateKey(self.get_key(receiver, '.sk'), encoder=HexEncoder)
        self.pk = PublicKey(self.get_key(sender, '.pk'), encoder=HexEncoder)

    def get_key(self, name, suffix):
        filename = 'key_' + name + suffix
        try:
            with open(filename, 'rb') as file:
                data = file.read()
            return data
        except FileNotFoundError:
            print(f"Key file '{filename}' not found.")
            sys.exit(1)

    def decrypt(self, encfile, textfile):
        box = Box(self.sk, self.pk)
        try:
            with open(encfile, 'rb') as efile:
                etext = efile.read()
            dtext = box.decrypt(etext)
            with open(textfile, 'wb') as tfile:
                tfile.write(dtext)
            print(f"Decrypted file saved as '{textfile}'")
        except FileNotFoundError:
            print(f"Encrypted file '{encfile}' not found.")
            sys.exit(1)

sender = input("Enter the name for the sender's key pair: ")
recip = input("Enter your name for your key pair: ")
decrypter = DecryptFile(sender, recip)
enc_file = input("Enter the name of the encrypted file to decrypt: ")
target_file = input("Enter the name for the decrypted output file: ")
decrypter.decrypt(enc_file, target_file)

To use the script, you need three files in the same directory: an encrypted file (with a .enc extension) generated by the “encrypt.py” script, your private key (with an .sk extension), and the other person’s public key (with a .pk extension). Ideally, swap with another person and attempt to decrypt their .enc file.

Run python3 decrypt.py and follow the prompts: enter the sender’s key pair name, your key pair name, the name of the encrypted file to decrypt, and the name for the decrypted output file.

The script will decrypt the file using the private key associated with your name and the sender’s public key, and save the decrypted content to the specified output file.

1.4. Challenge task

As a challenge task, you might like to research how to use libsodium to sign a message with your key pair so that other users can verify a (plaintext) message came from you.

2. Cryptography questions and exercises

See if you can answer the following questions, after reviewing the material on cryptography in the lectures.

Question 2(a)

Suppose in the CITS3007 SDE you create the MD5 hash of some password, using a command like:

$ printf mypassword | md5sum

In what format is the hash displayed? How large is the hash, in bytes? How would you write it in C syntax?

Question 2(b)

What is the purpose of salting passwords, when creating a password hash?

Question 2(c)

Look up Wikipedia to refresh your memory of what a hash collision is. Explain why hash collisions necessarily occur. That is, why must there always be two different plaintexts that have the same hash value?

3. CITS3007 project

You can use your lab time to work on the CITS3007 project. You may wish to discuss your project tests and code design with other students or the lab facilitators (although the actual code you submit must be your own, individual work).


  1. There are actually multiple Python libraries which provide access to the C Sodium library, which can be confusing, but they have quite different purposes. PyNaCl, which we use, provides a fairly high-level interface to Sodium, and allows Python programmers to use Python types (such as classes and lists) which they are familiar with.
        Two other Python libraries are pysodium and libnacl. These are not high-level: they wrap the C functions exposed by the C Sodium library fairly directly, and allow them to be called from Python.