Writing scripts in Python
Information
The estimated time to complete this training module is 2h.
The prerequisites to take this module are:
- the installation module.
If you have any questions regarding the module content please ask them in the relevant module channel on the school Discord server. If you do not have access to the server and would like to join, please send us an email at school [dot] brainhack [at] gmail [dot] com.
Contact your local TA if you have questions on this module, or if you want to check that you successfully completed all the exercises.
Resources
This module is based on Greg Kiar’s QLSC 612 course in 2020, with slides from Joel Grus’ talk at JupyterCon 2018.
The video is available below:
Exercise
- Watch the video and follow along with the hands-on part to do the exercise. If you prefer to do the exercise on your own, the instructions are also written below.
In this exercise we will program a key-based encryption and decryption system. We will implement a version of the Vigenère cipher, but instead of using only the 26 letters of the alphabet, we will use all the Unicode characters.
The Vigenère cipher consists of shifting each letter of the message to encrypt by the index of the corresponding letter in the key. For example, encrypting the letter B with the key D gives new_index = index(B) + index(D) = 2 + 4 = 6, so the result is the 6th letter, which is F.
⚠️ Note that here by index I mean the index of the letter in the alphabet and not the index of the letter in the string.
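The B and D example can be checked with a few lines of Python. This is only a sketch of the classic 26-letter version: the names ALPHABET, alphabet_index and shift_letter are made up for illustration, indices are counted from 1 as in the text, and wrapping around after Z is an assumption that the text does not spell out.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def alphabet_index(letter):
    """Return the 1-based position of an uppercase letter in the alphabet."""
    return ALPHABET.index(letter) + 1


def shift_letter(letter, key_letter):
    """Shift `letter` by the alphabet index of `key_letter`."""
    new_index = alphabet_index(letter) + alphabet_index(key_letter)
    # Assumption: wrap around past Z, then go back to a 0-based position.
    return ALPHABET[(new_index - 1) % len(ALPHABET)]


print(alphabet_index("B"), alphabet_index("D"))  # 2 4
print(shift_letter("B", "D"))  # F
```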
You pair up the letters of the message with those of the key one by one, repeating the key if it is shorter than the message. For example, if the message is "message" and the key is "key", the pairs will be:
(m,k), (e,e), (s,y), (s,k), (a,e), (g,y), (e,k)
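One way to build these pairs in Python, shown here only as a possible approach, is to combine zip with itertools.cycle from the standard library. The variable names are illustrative.

```python
from itertools import cycle

message = "message"
key = "key"

# cycle(key) repeats the key endlessly; zip stops at the end of the message.
pairs = list(zip(message, cycle(key)))
print(pairs)
```

An equivalent approach without itertools would be to index the key with key[i % len(key)] for each position i in the message.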
For the indices of the letters, we will not use the position of the letter in the alphabet, but the Unicode index of the letter, which is easily obtained with the built-in Python function ord. The reverse operation, getting a letter from its Unicode index, is done with the built-in Python function chr. There are 1114112 Unicode characters handled by Python, so we have to make sure our indices stay in the range 0 to 1114111. To ensure that, we can use values modulo 1114112, i.e. encrypted_index = (ord(message_letter) + ord(key_letter)) % 1114112.
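To make the formula concrete, here is a small sketch applying it to a single pair of letters. The constant name NB_UNICODE is an assumption made for readability, and the decryption step (subtracting the key index modulo the same number) is the natural inverse of the formula rather than something stated in the text.

```python
NB_UNICODE = 1114112  # indices go from 0 to 1114111

message_letter = "l"
key_letter = "h"

print(ord(message_letter), ord(key_letter))  # 108 104

encrypted_index = (ord(message_letter) + ord(key_letter)) % NB_UNICODE
encrypted_letter = chr(encrypted_index)
print(encrypted_index, encrypted_letter)  # 212 Ô

# Subtracting the key index modulo the same number undoes the shift.
decrypted_index = (encrypted_index - ord(key_letter)) % NB_UNICODE
print(chr(decrypted_index))  # l
```

In Python, the % operator with a positive modulus never returns a negative number, so the subtraction stays in the valid range as well.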
Step 1: Create relevant functions in useful_functions.py
In that file, implement the following functions:
- encrypt_letter(letter, key): return the encrypted letter with the key, e.g. encrypt_letter("l", "h") should return 'Ô'.
- decrypt_letter(letter, key): return the decrypted letter with the key, e.g. decrypt_letter("Ô", "h") should return 'l'.
- process_message(message, key, encrypt): return the encrypted message using the letters in key if encrypt is True, and the decrypted message if encrypt is False. For example:
process_message('coucou', 'clef', True)
'ÆÛÚÉÒá'
process_message('ÆÛÚÉÒá', 'clef', False)
'coucou'
After creating these functions, call them in your Python terminal or in a Jupyter notebook to try things out. Are the functions performing as you expected?
To reliably make sure that the process_message function works correctly, let’s add a test at the end of the useful_functions.py file.
- Define a message variable with a word (e.g. message = "word"), then a key variable with another word (e.g. key = "key").
- Use process_message to generate the encryption of the message variable with the key, and store it in an encrypted_msg variable.
- Use the process_message function again to decrypt the encrypted_msg variable (still using the same key), and store the result in a decrypted_msg variable.
- Verify that message == decrypted_msg by printing “Test passed” if it is true and “Test failed” if it is false.
Now we have a proper test of our process_message function, and we can run it by executing the useful_functions.py script. However, we don’t want to run the test when we just import the functions from the file, so we will need to use the if __name__ == "__main__": statement.
- Put the test in an if __name__ == "__main__": block.
Now that we have our functions and a test to validate them, we can conclude the first part of the exercise.
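If you want something to compare your own work against, here is one possible shape for useful_functions.py. It is a sketch and not the reference solution: the constant NB_UNICODE, the use of itertools.cycle and the helper variable process_letter are choices made for this example, and decryption is assumed to be the subtraction counterpart of the encryption formula.

```python
from itertools import cycle

NB_UNICODE = 1114112  # indices go from 0 to 1114111


def encrypt_letter(letter, key):
    """Shift `letter` forward by the Unicode index of `key`."""
    return chr((ord(letter) + ord(key)) % NB_UNICODE)


def decrypt_letter(letter, key):
    """Shift `letter` backward by the Unicode index of `key`."""
    return chr((ord(letter) - ord(key)) % NB_UNICODE)


def process_message(message, key, encrypt):
    """Encrypt (encrypt=True) or decrypt (encrypt=False) `message` with `key`."""
    process_letter = encrypt_letter if encrypt else decrypt_letter
    # cycle(key) repeats the key when it is shorter than the message.
    return "".join(process_letter(m, k) for m, k in zip(message, cycle(key)))


if __name__ == "__main__":
    # Runs when the file is executed as a script, not when it is imported.
    message = "word"
    key = "key"
    encrypted_msg = process_message(message, key, True)
    decrypted_msg = process_message(encrypted_msg, key, False)
    if message == decrypted_msg:
        print("Test passed")
    else:
        print("Test failed")
```

Note that with an empty key this sketch returns an empty string, because zip has nothing to pair the message with. Your own version may choose to handle that case differently.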
Step 2: Create a file cypher_script.py:
- Use the argparse library introduced in the video so that a user can call the script with four arguments: -i, -o, -k and -m.
  - -i will contain the path to the input text file containing the message.
  - -o will contain the path for the output file where the processed message will be written.
  - -k will be a string directly containing the key.
  - -m will be the mode: a string that can take the value "encryption" or "decryption" to tell the script whether you want to encrypt or decrypt the input message.
- The script should import the functions from useful_functions.py and use them in its main function to encrypt or decrypt the text in the input file, using the string passed with -k as the key, and save the result in the output file. So calling python cypher_script.py -i msg_file.txt -o msg_encrypted.txt -k my_key -m encryption should create a msg_encrypted.txt file.
- Don’t forget to write the code under if __name__ == "__main__". Even though in this file it won’t make a difference, it is never too early to get used to good practices.
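The argument-parsing part of the script could look something like the sketch below. Only the short flags -i, -o, -k and -m come from the instructions. The long names (--input, --output, --key, --mode), the help texts and the build_parser function are assumptions added for this example. Reading the input file, calling process_message and writing the output file are left for you to implement.

```python
import argparse


def build_parser():
    """Create the command-line parser with the four required arguments."""
    parser = argparse.ArgumentParser(
        description="Encrypt or decrypt a text file with a key."
    )
    parser.add_argument("-i", "--input", required=True,
                        help="path to the input text file")
    parser.add_argument("-o", "--output", required=True,
                        help="path to the output file")
    parser.add_argument("-k", "--key", required=True,
                        help="the key, given directly as a string")
    parser.add_argument("-m", "--mode", required=True,
                        choices=["encryption", "decryption"],
                        help="whether to encrypt or decrypt the input")
    return parser


# Parse an explicit list here instead of the real command line,
# only to show what the parsed arguments look like.
args = build_parser().parse_args(
    ["-i", "msg_file.txt", "-o", "msg_encrypted.txt",
     "-k", "my_key", "-m", "encryption"]
)
print(args.input, args.output, args.key, args.mode)

# Boolean that could be passed as the `encrypt` argument of process_message.
encrypt = args.mode == "encryption"
print(encrypt)  # True
```

In the actual script you would call parse_args() with no argument, so that argparse reads the values typed on the command line. Using choices makes argparse reject any mode other than the two expected strings.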
Last step: verify your implementation
Finally, decrypt the file obtained with:
wget https://raw.githubusercontent.com/school-brainhack/school-brainhack.github.io/main/content/en/modules/python_scripts/message_encrypted.txt
with the following key:
my_super_secret_key
Can you see something cool in the decrypted file?
- Follow up with your local TA(s) to validate that you completed the exercises correctly.
- 🎉 🎉 🎉 you completed this training module! 🎉 🎉 🎉
More resources
If you are curious to learn about more advanced capabilities of the argparse library, you can check this Argparse tutorial.
To learn more about Python in general, you can check the tutorials of the official Python documentation and choose the topic you want to learn. I also recommend the Programiz tutorials, which have nice videos. Finally, for even nicer and fancier videos, there is the excellent Python programming playlist from the YouTube channel Socratica.