AES Algorithm for beginners
AES is a symmetric-key encryption algorithm that is considered extremely secure, very easy to implement, and used in real world. This guide teaches you how to implement its 128-bit CTR mode variant, in C language. The procedure:
- You randomly generate a 128-bit cipher key
- You secretly share it with your friend (see note below).
- Then, when you want to send a secret message, you encrypt your message using your key.
- Send the encrypted message to your friend.
- Your friend decrypts it using the key you gave.
Note: Secretly sharing the key is done using a protocol like Diffie-Hellman. It’s not covered in this guide.
Overview
First half of the guide focuses on implementing the following function:
void AES128( uint8_t* block, uint8_t* cipherKey) { }
This function encrypts the given 16-byte block using a 16-byte key. This function can be thought of a pipeline of multiple functions applied on the block one after the other.
The 11 “Apply Key” steps all use a different round key, derived from the main cipher key using the function getNthRoundKey.
The second half of the guide works towards defining the following functions by repeatedly using AES128.
uint8_t* encrypt(uint8_t* plainText, uint64_t lengthInBytes, uint8_t* cipherKey) { }
uint8_t* decrypt(uint8_t* encryptedText, uint64_t lengthInBytes, uint8_t* cipherKey) { }
The Block
Blocks are drawn as 4×4 matrices of numbers 0-255. In C, we represent them as a flat array uint8_t[16] when declared on stack, and as uint8_t* when malloc’ed. The following block is {0,1,2,...,15} in C.
Note that the block is in column-major order, meaning that second element is below the first one, not to the right!
Step 1) Implement the following, that prints the block as 4×4 matrix.
void printBlock(uint8_t* block) { }
Step 2) Implement the following. This function takes a block, a row and a column number, and the element (value) that you need to place in the block.
void setElement(uint8_t* block, int nthRow, int nthCol, uint8_t element) { }
Rows are represented as a flat array uint8_t[4] when declared on stack, and as uint8_t* when malloc’ed. First element is the leftmost one. Columns are represented in the same way. First element is the topmost one.
Step 3) Implement the following.
uint8_t* getRow(uint8_t* block, int nthRow) {
// malloc storage for a new row
// copy values from block to the new storage
// return the newly allocated storage
}
uint8_t* getCol(uint8_t* block, int nthCol) {
// do the same thing as above
}
Step 4) Implement these keeping in mind that row and col are flat arrays of 4 elements. Use printf.
void printRow(uint8_t* row) { }
void printCol(uint8_t* col) { }
One crucial point to remember: Some functions like getRow and getCol allocate dynamically using malloc and return the allocated storage as a pointer. The pointers to the storage must not get lost. You should call free when you’re done using the returned object.
Step 5) Implement the rotate operations.

Rotate operations take a row or a column, shift all elements in a direction by 1 place, and the element that got popped out is reinserted into the opposite side. Keep in mind that rows go left-to-right, and columns go top-to-bottom.
void rotateLeftRow( uint8_t* row ) { }
void rotateUpCol( uint8_t* col ) { }
// Note: don't allocate anything new. Work on the given object directly.
Step 6) Implement the following.
void setRow(uint8_t* block, uint8_t* row, int nthRow) {
// Copy content of a given row into the block, so that block's nth row contains the data in row.
// Don't make any dynamic allocations.
}
void setCol(uint8_t* block, uint8_t* col, int nthCol) {
// Same as above
}
These functions will form the basis of the steps of AES128 algorithm. Make sure these functions work properly before continuing.
subBytes function
The following are the sbox constants:
{99, 124, 119, 123, 242, 107, 111, 197, 48, 1, 103, 43, 254, 215, 171, 118, 202, 130, 201, 125, 250, 89, 71, 240, 173, 212, 162, 175, 156, 164, 114, 192, 183, 253, 147, 38, 54, 63, 247, 204, 52, 165, 229, 241, 113, 216, 49, 21, 4, 199, 35, 195, 24, 150, 5, 154, 7, 18, 128, 226, 235, 39, 178, 117, 9, 131, 44, 26, 27, 110, 90, 160, 82, 59, 214, 179, 41, 227, 47, 132, 83, 209, 0, 237, 32, 252, 177, 91, 106, 203, 190, 57, 74, 76, 88, 207, 208, 239, 170, 251, 67, 77, 51, 133, 69, 249, 2, 127, 80, 60, 159, 168, 81, 163, 64, 143, 146, 157, 56, 245, 188, 182, 218, 33, 16, 255, 243, 210, 205, 12, 19, 236, 95, 151, 68, 23, 196, 167, 126, 61, 100, 93, 25, 115, 96, 129, 79, 220, 34, 42, 144, 136, 70, 238, 184, 20, 222, 94, 11, 219, 224, 50, 58, 10, 73, 6, 36, 92, 194, 211, 172, 98, 145, 149, 228, 121, 231, 200, 55, 109, 141, 213, 78, 169, 108, 86, 244, 234, 101, 122, 174, 8, 186, 120, 37, 46, 28, 166, 180, 198, 232, 221, 116, 31, 75, 189, 139, 138, 112, 62, 181, 102, 72, 3, 246, 14, 97, 53, 87, 185, 134, 193, 29, 158, 225, 248, 152, 17, 105, 217, 142, 148, 155, 30, 135, 233, 206, 85, 40, 223, 140, 161, 137, 13, 191, 230, 66, 104, 65, 153, 45, 15, 176, 84, 187, 22}
Step 7) Implement the following. The function takes an index, and returns the nth sbox constant.
uint8_t sbox(uint8_t i) { }
Step 8) Implement the following.
void subBytes( uint8_t* state ) {
// Apply sbox function to all elements of the given 4x4 block.
// Operate on it directly, don't allocate anything new.
}
shiftRows function
*Step 9) Implement the following by using getRow, rotateLeftRow and setRow.
void shiftRows( uint8_t* state ) {
// Get a row.
// Rotate it.
// Put it back to state.
// Repeat for all shifts, as shown in the image.
// Operate on state directly. Don't forget to free all the allocations.
}
mixColumns function
For this one we’ll first define gfmul function that takes a number 0 to 255, and another number 1-3. We define this function for only these ranges. Gfmul’s math is complicated, so instead of computing the results, we will use precomputed tables.
// Table for b=1: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255};
// Table for b=2: {0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 27, 25, 31, 29, 19, 17, 23, 21, 11, 9, 15, 13, 3, 1, 7, 5, 59, 57, 63, 61, 51, 49, 55, 53, 43, 41, 47, 45, 35, 33, 39, 37, 91, 89, 95, 93, 83, 81, 87, 85, 75, 73, 79, 77, 67, 65, 71, 69, 123, 121, 127, 125, 115, 113, 119, 117, 107, 105, 111, 109, 99, 97, 103, 101, 155, 153, 159, 157, 147, 145, 151, 149, 139, 137, 143, 141, 131, 129, 135, 133, 187, 185, 191, 189, 179, 177, 183, 181, 171, 169, 175, 173, 163, 161, 167, 165, 219, 217, 223, 221, 211, 209, 215, 213, 203, 201, 207, 205, 195, 193, 199, 197, 251, 249, 255, 253, 243, 241, 247, 245, 235, 233, 239, 237, 227, 225, 231, 229};
// Table for b=3: {0, 3, 6, 5, 12, 15, 10, 9, 24, 27, 30, 29, 20, 23, 18, 17, 48, 51, 54, 53, 60, 63, 58, 57, 40, 43, 46, 45, 36, 39, 34, 33, 96, 99, 102, 101, 108, 111, 106, 105, 120, 123, 126, 125, 116, 119, 114, 113, 80, 83, 86, 85, 92, 95, 90, 89, 72, 75, 78, 77, 68, 71, 66, 65, 192, 195, 198, 197, 204, 207, 202, 201, 216, 219, 222, 221, 212, 215, 210, 209, 240, 243, 246, 245, 252, 255, 250, 249, 232, 235, 238, 237, 228, 231, 226, 225, 160, 163, 166, 165, 172, 175, 170, 169, 184, 187, 190, 189, 180, 183, 178, 177, 144, 147, 150, 149, 156, 159, 154, 153, 136, 139, 142, 141, 132, 135, 130, 129, 155, 152, 157, 158, 151, 148, 145, 146, 131, 128, 133, 134, 143, 140, 137, 138, 171, 168, 173, 174, 167, 164, 161, 162, 179, 176, 181, 182, 191, 188, 185, 186, 251, 248, 253, 254, 247, 244, 241, 242, 227, 224, 229, 230, 239, 236, 233, 234, 203, 200, 205, 206, 199, 196, 193, 194, 211, 208, 213, 214, 223, 220, 217, 218, 91, 88, 93, 94, 87, 84, 81, 82, 67, 64, 69, 70, 79, 76, 73, 74, 107, 104, 109, 110, 103, 100, 97, 98, 115, 112, 117, 118, 127, 124, 121, 122, 59, 56, 61, 62, 55, 52, 49, 50, 35, 32, 37, 38, 47, 44, 41, 42, 11, 8, 13, 14, 7, 4, 1, 2, 19, 16, 21, 22, 31, 28, 25, 26};
Step 10) Implement gfmul using the tables above.
uint8_t gfmul(uint8_t a, uint8_t b) {
// a takes on values 0..255
// b takes on values 1, 2, 3
// result is a value 0..255
// get the correct constant from the correct table above.
}
Step 11) Implement the following.
void mixColumns( uint8_t* state ) {
// Allocate a new 4x4 block, that will be the output
// To calculate output's element at (row 0, col 0):
// get row 0 of mulmatrix,
// get col 0 of input state,
// apply gfmul, then xor the outputs together
// insert in newly allocated output block
// Repeat for all elements in the output
// Don't forget to free all the allocations.
// Don't return anything.
}
useRoundKey function
Step 12) Implement the following. Both state and roundKey are 4×4 blocks.
void useRoundKey( uint8_t* state, uint8_t* roundKey) {
// Take 0th element of state, and 0th element of roundKey
// Xor them together
// Put back in state.
// Repeat for all 16 elements
// Don't allocate anything new
}
Key expansion
Now the last piece of the pipeline is to determine the 11 round keys. First let’s see the procedure visually.
The 11 round keys are all 4×4 blocks, and in C we’ll represent them as flat arrays of 16 bytes. We denote the columns as w0, w1, … w43.
Round key 0 is a direct copy of the cipher key.
For columns w5, w6, w7 etc. we apply the following formula. ^ operator is xor in this context.
For columns w4, w8, w12 etc. we apply the following formula. ^ is xor, ror is rotateUpCol, and rcon is a special column whose topmost element is given by the table 1, 2, 4, 8, 16, 32, 64, 128, 27, 54 and all the other columns are zeros.
Now we first getNthColOfRoundKey to get a given w column, and then getNthRoundKey. We define getNthColOfRoundKey recursively, since it’s best formulated as a recursive function. Though, this is not strictly necessary.
Step 13) Implement the following.
// The function returns a column. That means, a flat array of 4 elements.
uint8_t* getNthColOfRoundKey( int nthCol, uint8_t* cipherKey ) {
uint8_t rcon[] = {1, 2, 4, 8, 16, 32, 64, 128, 27, 54};
if (/* w0, w1, w2, w3 */) {
// return the corresponding column from cipherKey
// using getCol
}
else if (/* w5, w6, w7, ... */) {
// malloc a new column
// call getNthColOfRoundKey twice to get columns w(n-1) and w(n-4)
// apply xor and fill in newly allocated column
// return it
}
else /* w4, w8, w12, ... */ {
// malloc a new column
// call getNthColOfRoundKey twice to get columns w(n-1) and w(n-4)
// rotate w(n-1)
// apply xor and fill in newly allocated column
// apply xor with correct element from rcon table
// return it
}
}
Step 14) Implement the following by using the previous function.
uint8_t* getNthRoundKey( int n, uint8_t* cipherKey) {
// Allocate a new storage for 4x4 block
// Fill it in by calling the previous function
// Return it
}
Step 15) Implement AES128 function as shown above. block and cipherKey are 4×4 blocks.
// Don't return anything; operate directly on block.
// Free all the allocations when you're done with them.
void AES128( uint8_t* block, uint8_t* cipherKey) { }
Turning cipher key into a stream
To encrypt a message, we generate a 64-bit random number called nonce. We split the input message into 16-byte blocks. Then apply the following same procedure to each block. The red part is accomplished by getKeystream and the orange part by encrypt. Lastly, we attach the nonce to the end of the file.
To decrypt it, we extract the nonce from the end of the file, then do getKeystream and call decrypt on all 16-byte blocks as shown.
Step 16) Implement a function to extract nth bit from a 64-bit number.
// For example, say number is 3. It's "000...0011" in binary. 0th and 1th bit are "1" and the rest are "0".
// getNthByte(3, 0) should return 1
// getNthByte(3, 1) should return 1
// getNthByte(3, 2) should return 0
uint8_t getNthByte( uint64_t number, int n) { }
// Hint: you can use & and >> operators
Step 17) Implement the following function. cipherKey is a 4×4 block, nthBlock is an integer, nonce is an array of 8 bytes.
uint8_t* getKeystream( uint8_t* cipherKey, uint64_t nthBlock, uint8_t* nonce) {
// Malloc storage for a 16-byte array that we'll call "keystream"
// Copy the nonce to the first 8 bytes of the keystream
// Turn "nthBlock" into an array of 8 bytes
// Copy the nthBlock array to the last 8 bytes of the keystream
// Apply AES128 on the keystream as data, and cipherKey as the key.
// Return the keystream
}
Step 18) Implement the following function. plainText is an array of bytes. Its length is specified in lengthInBytes parameter. cipherKey is a 4×4 block. It can also be thought of as an array of 16 bytes.
uint8_t* encrypt(uint8_t* plainText, uint64_t lengthInBytes, uint8_t* cipherKey) {
// Allocate an array of 8 bytes. Call it nonce.
// Fill it up with random bytes.
// Malloc storage of the length of plainText + 8. Call it encryptedText
// Divide plainText in blocks of 16 bytes.
// For all blocks,
// generate a keystream
// encrypt it using xor as shown above
// put the result in encryptedText
//
// Copy nonce to last 8 bytes of encryptedText
// Return encryptedText
}
Step 19) Implement the following function. encryptedText is an array of bytes. Its length is specified in lengthInBytes parameter. cipherKey is a 4×4 block.
uint8_t* decrypt(uint8_t* encryptedText, uint64_t lengthInBytes, uint8_t* cipherKey) {
// Allocate an array of 8 bytes. Call it nonce.
// Extract last 8 bytes and copy into nonce.
// Malloc storage of the length of encryptedText - 8. Call it decryptedText.
// Divide encryptedText in blocks of 16 bytes.
// For all blocks,
// generate a keystream
// decrypt it using xor as shown above
// put the result in decryptedText
//
// Return decryptedText
}
Step 20) (Optional) Adapt your program so that it accepts a plaintext file and outputs an encrypted file.







