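1. Introduction
AES is a symmetric block cipher. It comes in three variants:
- AES-128: 128-bit (16-byte) key, 10 rounds
- AES-192: 192-bit (24-byte) key, 12 rounds
- AES-256: 256-bit (32-byte) key, 14 rounds
A longer key gives a larger security margin, at the cost of more rounds and therefore more work per block. Each variant can be combined with one of several modes of operation:
- ECB (Electronic Codebook): every block is encrypted independently
- CBC (Cipher Block Chaining): each plaintext block is XORed with the previous ciphertext block before it is encrypted
- CFB (Cipher Feedback): the previous ciphertext block is encrypted and the result is XORed with the plaintext to produce the next ciphertext block
- OFB (Output Feedback): similar to CFB, but the feedback is the previous cipher output (the keystream), not the ciphertext itself
- CTR (Counter Mode): a unique counter value is encrypted for each block and the result is XORed with the plaintext block
- GCM (Galois/Counter Mode): provides both encryption and integrity checking (authenticated encryption); suited to scenarios that need authentication, such as network traffic
- XTS (written XTS-AES or AES-XTS): designed for disk encryption; a second key is used to derive a per-sector tweak. It provides confidentiality only and does not authenticate the data
- AES-CMAC: generates a message authentication code to verify integrity; it authenticates data but does not encrypt it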
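2. How it works
This section covers AES-128 in ECB mode, the most basic combination. AES-128 runs 10 rounds, and the key used in each round is called a round key. Before the first round there is an initial AddRoundKey step that uses the user's key, so 11 round keys are needed in total.
Each round key is 16 bytes, so the expanded key is (1 + 10) * 16 = 176 bytes long.
The user still supplies only one 16-byte key. The Key Expansion algorithm derives the other 10 round keys from it: 10 * 16 = 160 generated bytes, plus the original 16, gives 176 bytes.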
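The same arithmetic generalizes to the other key sizes. The sketch below is only an illustration of the sizes quoted above; the names `aesRounds` and `expandedKeyBytes` are made up for this example and do not appear in the article's code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper names, used only for this illustration.
constexpr std::size_t kBlockBytes = 16;   // the AES block is always 16 bytes

// Number of rounds for a key of keyBits bits: Nk + 6, where Nk = keyBits / 32.
constexpr std::size_t aesRounds(std::size_t keyBits)
{
    return keyBits / 32 + 6;
}

// Expanded key size: one 16-byte round key per round, plus the initial one.
constexpr std::size_t expandedKeyBytes(std::size_t keyBits)
{
    return (aesRounds(keyBits) + 1) * kBlockBytes;
}

static_assert(expandedKeyBytes(128) == 176, "AES-128 expands to 176 bytes");
```

For AES-192 and AES-256 the same formula gives 13 and 15 round keys respectively.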
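AES-128 encryption and decryption need the following:
- Two 16 * 16 tables of 256 bytes each: the S-Box and the inverse S-Box (fixed by the AES standard)
- Two 4 * 4 matrices of 16 bytes each: the MixColumns matrix and its inverse (fixed by the AES standard)
- One 16-byte key (chosen by the user)
- The plaintext to encrypt, or the ciphertext to decrypt (supplied by the user)
Note that AES is a block cipher and processes exactly 16 bytes at a time. The block size is fixed at 16 bytes for every AES variant and does not depend on the key length. To encrypt more than 16 bytes, split the input into 16-byte blocks and encrypt each block separately (this is what ECB does).
What if the last block is shorter than 16 bytes? For example, 38 bytes split into 16 + 16 + 6, which means three block encryptions, but the last block has only 6 bytes. The answer is padding:
- Zero padding: fill the rest with 0x00 (not recommended: after decryption the padding cannot be told apart from trailing zero bytes in the data)
- PKCS#7 padding: fill the rest with bytes whose value is the number of padding bytes. If the last block has 6 bytes, the remaining 10 bytes are each set to 0x0A. If the input length is already a multiple of 16, a full block of 16 bytes of value 0x10 is appended, so the padding can always be removed unambiguously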
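To make the PKCS#7 rule concrete, here is a minimal sketch of padding and unpadding. It uses `std::vector` rather than the Windows types used elsewhere in this article, and the names `pkcs7Pad` and `pkcs7Unpad` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kBlockSize = 16;

// Append PKCS#7 padding: n bytes, each with value n, where 1 <= n <= 16.
std::vector<std::uint8_t> pkcs7Pad(const std::vector<std::uint8_t>& data)
{
    const std::size_t padLen = kBlockSize - data.size() % kBlockSize;
    std::vector<std::uint8_t> out(data);
    out.insert(out.end(), padLen, static_cast<std::uint8_t>(padLen));
    return out;
}

// Strip PKCS#7 padding in place; returns false if the padding is malformed.
bool pkcs7Unpad(std::vector<std::uint8_t>& data)
{
    if (data.empty() || data.size() % kBlockSize != 0)
    {
        return false;
    }
    const std::uint8_t padLen = data.back();
    if (padLen == 0 || padLen > kBlockSize)
    {
        return false;
    }
    for (std::size_t i = data.size() - padLen; i < data.size(); ++i)
    {
        if (data[i] != padLen)
        {
            return false;
        }
    }
    data.resize(data.size() - padLen);
    return true;
}
```

With the 38-byte example above, `pkcs7Pad` returns 48 bytes whose last 10 bytes are all 0x0A; a 16-byte input grows to 32 bytes.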
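One round of AES-128 encryption consists of these steps:
- SubBytes: each byte of the state is used as an index into the S-Box and replaced by the value found there
- ShiftRows: the 16-byte state is viewed as a 4 * 4 matrix and its four rows are rotated left by 0, 1, 2 and 3 bytes
- MixColumns: each column of the shifted state is multiplied by a fixed 4 * 4 matrix over GF(2^8)
- AddRoundKey: the state produced by MixColumns is XORed with the round key
Each step is explained in detail below, starting with SubBytes.
AES uses a 256-byte S-Box (bSBox in the code) and a 256-byte inverse S-Box (bInvSBox). They are related as follows:
y = bSBox[x]
z = bInvSBox[y]
z == x
Looking up x in bSBox gives y, and looking up y in bInvSBox gives z, which equals x. In other words, bInvSBox[y] returns the original x.
The purpose is confusion: during encryption a byte x is replaced by a seemingly unrelated value y = bSBox[x], and during decryption x is recovered as bInvSBox[y].
Both tables are fixed by the AES standard (FIPS 197):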
#define SBOX_SIZE (256)

BYTE bSBox[SBOX_SIZE] = {
    0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5, 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76,
    0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0, 0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0,
    0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc, 0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15,
    0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a, 0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75,
    0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0, 0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84,
    0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b, 0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf,
    0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85, 0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8,
    0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5, 0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2,
    0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17, 0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73,
    0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88, 0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb,
    0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c, 0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79,
    0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9, 0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08,
    0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6, 0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a,
    0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e, 0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e,
    0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94, 0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf,
    0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68, 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16
};

BYTE bInvSBox[SBOX_SIZE] = {
    0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38, 0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb,
    0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87, 0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb,
    0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d, 0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e,
    0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2, 0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25,
    0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16, 0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92,
    0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda, 0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84,
    0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a, 0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06,
    0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02, 0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b,
    0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea, 0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73,
    0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85, 0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e,
    0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89, 0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b,
    0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20, 0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4,
    0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31, 0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f,
    0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d, 0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef,
    0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0, 0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61,
    0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26, 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d
};
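The table values are not arbitrary. In FIPS 197 each S-Box entry is defined as the multiplicative inverse of the input byte in GF(2^8), followed by a fixed affine transformation. The sketch below recomputes single entries from that definition so they can be compared with the table; the function names (`gfMul`, `gfInverse`, `rotl8`, `sboxEntry`) are made up for this example, and the brute-force inverse is chosen for clarity, not speed.

```cpp
#include <cassert>
#include <cstdint>

// Multiply two elements of GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B).
std::uint8_t gfMul(std::uint8_t a, std::uint8_t b)
{
    std::uint8_t product = 0;
    while (b != 0)
    {
        if (b & 1)
        {
            product ^= a;
        }
        const bool carry = (a & 0x80) != 0;
        a = static_cast<std::uint8_t>(a << 1);
        if (carry)
        {
            a ^= 0x1B;
        }
        b >>= 1;
    }
    return product;
}

// Multiplicative inverse found by brute force; 0 maps to 0 by convention.
std::uint8_t gfInverse(std::uint8_t x)
{
    for (int candidate = 1; candidate < 256; ++candidate)
    {
        if (gfMul(x, static_cast<std::uint8_t>(candidate)) == 1)
        {
            return static_cast<std::uint8_t>(candidate);
        }
    }
    return 0;
}

// Rotate a byte left by n bits (1 <= n <= 7).
std::uint8_t rotl8(std::uint8_t v, int n)
{
    return static_cast<std::uint8_t>((v << n) | (v >> (8 - n)));
}

// S-Box entry: inverse in GF(2^8), then the affine transformation.
std::uint8_t sboxEntry(std::uint8_t x)
{
    const std::uint8_t b = gfInverse(x);
    return static_cast<std::uint8_t>(
        b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63);
}
```

For instance, `sboxEntry(0x00)` yields 0x63 and `sboxEntry(0x01)` yields 0x7c, matching the first two entries of bSBox. Real implementations use the precomputed table instead.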
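Here is SubBytes in code. This helper substitutes one DWORD (4 bytes) at a time; the principle is the same for any number of bytes.
DWORD SBoxExchange(PBYTE pbSBox, DWORD dwNum)
{
    // Treat the DWORD as 4 separate bytes and substitute each one
    PBYTE pbPtr = (PBYTE)&dwNum;

    for (int i = 0; i < 4; ++i)
    {
        *(pbPtr + i) = pbSBox[*(pbPtr + i)];
    }

    return(dwNum);
}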
Next is ShiftRows.
Treat the 16 bytes of the state as a 4 * 4 array stored in column-major order (byte i sits in row i % 4, column i / 4), then rotate each row left by its row index.
Here is the implementation:
// pState points to a 16-byte state stored in column-major order
VOID ShiftRows(BYTE* pState)
{
    BYTE bTmp = 0;

    // Row 0 (indices 0, 4, 8, 12) is not shifted

    // Row 1 (indices 1, 5, 9, 13): rotate left by 1 byte
    bTmp = *(pState + 1);
    *(pState + 1) = *(pState + 5);
    *(pState + 5) = *(pState + 9);
    *(pState + 9) = *(pState + 13);
    *(pState + 13) = bTmp;

    // Row 2 (indices 2, 6, 10, 14): rotate left by 2 bytes
    bTmp = *(pState + 2);
    *(pState + 2) = *(pState + 10);
    *(pState + 10) = bTmp;
    bTmp = *(pState + 6);
    *(pState + 6) = *(pState + 14);
    *(pState + 14) = bTmp;

    // Row 3 (indices 3, 7, 11, 15): rotate left by 3 bytes
    bTmp = *(pState + 3);
    *(pState + 3) = *(pState + 15);
    *(pState + 15) = *(pState + 11);
    *(pState + 11) = *(pState + 7);
    *(pState + 7) = bTmp;
}
There is also a more compact way to write it, based on a formula. Use whichever you prefer:
// formula: state'[r][c] = state[r][(c + r) % 4], with r, c in [0, 3]
// The state is column-major, so element [r][c] lives at index r + 4 * c.
// pbMatrix is the input, pbShiftedMatrix the output; they must not overlap.
VOID
RowShift(__in PBYTE pbMatrix, __out PBYTE pbShiftedMatrix, __in size_t nMatrixSize = 16)
{
    if (!pbMatrix || !pbShiftedMatrix || 16 != nMatrixSize)
    {
        return;
    }

    for (int r = 0; r < 4; ++r)
    {
        for (int c = 0; c < 4; ++c)
        {
            pbShiftedMatrix[r + 4 * c] = pbMatrix[r + 4 * ((c + r) % 4)];
        }
    }
}
That covers the forward ShiftRows; there is also an inverse. The forward version is used for encryption and the inverse for decryption. The logic is identical except that the rows rotate to the right.
Here is the implementation:
VOID InvShiftRow(BYTE *pState)
{
    if (!pState)
    {
        return;
    }

    BYTE bTmp = 0;

    // Row 1 (indices 1, 5, 9, 13): rotate right by 1 byte
    bTmp = pState[13];
    pState[13] = pState[9];
    pState[9] = pState[5];
    pState[5] = pState[1];
    pState[1] = bTmp;

    // Row 2 (indices 2, 6, 10, 14): rotate right by 2 bytes
    bTmp = pState[14];
    pState[14] = pState[6];
    pState[6] = bTmp;
    bTmp = pState[10];
    pState[10] = pState[2];
    pState[2] = bTmp;

    // Row 3 (indices 3, 7, 11, 15): rotate right by 3 bytes
    bTmp = pState[15];
    pState[15] = pState[3];
    pState[3] = pState[7];
    pState[7] = pState[11];
    pState[11] = bTmp;
}
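It has a compact version as well:
// formula: state'[r][c] = state[r][(c - r) mod 4], with r, c in [0, 3]
// pbReverseMatrix is the output, pbMatrix the input; they must not overlap.
VOID ReverseRowShift(PBYTE pbReverseMatrix, PBYTE pbMatrix, size_t nMatrixSize = 16)
{
    if (!pbMatrix || !pbReverseMatrix || 16 != nMatrixSize)
    {
        return;
    }

    for (int r = 0; r < 4; ++r)
    {
        for (int c = 0; c < 4; ++c)
        {
            pbReverseMatrix[r + 4 * c] = pbMatrix[r + 4 * ((c + 4 - r) % 4)];
        }
    }
}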
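A quick way to convince yourself that the index arithmetic is right is to run the permutation on the bytes 0 to 15 and look at where each one lands. The sketch below does that with `std::array` instead of raw pointers; the names `State`, `shiftRows` and `invShiftRows` are chosen for this example only.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Column-major state: element [row][col] lives at index row + 4 * col.
using State = std::array<std::uint8_t, 16>;

// Rotate row r left by r positions.
State shiftRows(const State& in)
{
    State out{};
    for (int row = 0; row < 4; ++row)
    {
        for (int col = 0; col < 4; ++col)
        {
            out[row + 4 * col] = in[row + 4 * ((col + row) % 4)];
        }
    }
    return out;
}

// Rotate row r right by r positions, undoing shiftRows.
State invShiftRows(const State& in)
{
    State out{};
    for (int row = 0; row < 4; ++row)
    {
        for (int col = 0; col < 4; ++col)
        {
            out[row + 4 * col] = in[row + 4 * ((col + 4 - row) % 4)];
        }
    }
    return out;
}
```

Feeding in the bytes 0, 1, ..., 15 gives 0, 5, 10, 15, 4, 9, 14, 3, 8, 13, 2, 7, 12, 1, 6, 11, and applying `invShiftRows` to that result restores the original order.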
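Next is MixColumns. It multiplies a fixed 4 * 4 matrix defined by the AES standard by the state matrix (the 16-byte block in its current, partially encrypted form), with all arithmetic done in GF(2^8).
The matrix used for encryption (MixColumns):
| 02 03 01 01 |
| 01 02 03 01 |
| 01 01 02 03 |
| 03 01 01 02 |
The matrix used for decryption (InvMixColumns):
| 0E 0B 0D 09 |
| 09 0E 0B 0D |
| 0D 09 0E 0B |
| 0B 0D 09 0E |
The two matrices are inverses of each other. Think of them as a lock and a key whose roles can be swapped: if a door (the state) is locked with one of them (multiplied by one matrix), it can be unlocked with the other (multiplied by the other matrix), and the reverse also holds.
The computation follows the ordinary rules of matrix multiplication, which are not repeated here; only the element-wise addition and multiplication are replaced by their GF(2^8) counterparts.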
BYTE mul2(BYTE x)
{
    return(((x >> 7) * 0x1B) ^ (x << 1));
}

BYTE mul03(BYTE x)
{
    // 0x03(3) = 2^1 ^ 1
    return(mul2(x) ^ x);
}

/*
 * MixColumns
 * [02 03 01 01]   [s0 s4 s8  s12]
 * [01 02 03 01] . [s1 s5 s9  s13]
 * [01 01 02 03]   [s2 s6 s10 s14]
 * [03 01 01 02]   [s3 s7 s11 s15]
 */
// pbState and pOutput must not overlap
VOID MixColumns(BYTE* pbState, BYTE* pOutput)
{
    for (int i = 0; i < 4; ++i)
    {
        // The four bytes of column i
        BYTE s0 = pbState[i * 4];
        BYTE s1 = pbState[i * 4 + 1];
        BYTE s2 = pbState[i * 4 + 2];
        BYTE s3 = pbState[i * 4 + 3];

        pOutput[i * 4] = mul2(s0) ^ mul03(s1) ^ s2 ^ s3;
        pOutput[i * 4 + 1] = s0 ^ mul2(s1) ^ mul03(s2) ^ s3;
        pOutput[i * 4 + 2] = s0 ^ s1 ^ mul2(s2) ^ mul03(s3);
        pOutput[i * 4 + 3] = mul03(s0) ^ s1 ^ s2 ^ mul2(s3);
    }
}
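Let's look at what mul2 does. Here is an equivalent version that is easier to read:
BYTE mul2(BYTE x)
{
    return((x & 0x80) ? (0x1B ^ (x << 1)) : (x << 1));
}
BYTE is a Windows typedef for unsigned char, with a range of 0 to 255. At its core mul2 is a left shift by one bit, i.e. a multiplication by 2.
(x << 1) and (x * 2) are equivalent
But when the top bit is 1, the shift overflows the byte. AES requires all arithmetic to stay inside the finite field GF(2^8). Leaving the mathematical terminology aside, the rule in plain words is:
If the top bit of x is 1, XOR the shifted value with 0x1B (the low byte of the AES reduction polynomial 0x11B)
So (x & 0x80) simply tests the top bit
If it is 1 the result is ((x << 1) ^ 0x1B); if it is 0 the result is just (x << 1).
To summarize: multiply x by 2, and if the top bit of x was 1 the shift overflowed, so XOR the result with 0x1B
Then what is mul03?
BYTE mul03(BYTE x)
{
    // 0x03(3) = 2^1 ^ 1
    return(mul2(x) ^ x);
}
It multiplies x by 3, with 3 written as a sum of powers of 2: (2^1 + 2^0) * x
In GF(2^8) every addition is an XOR.
So the product becomes 2^1 * x ⊕ 1 * x, where (2^1 * x) is mul2(x)
That gives (mul2(x) ⊕ x), written (mul2(x) ^ x) in code. All the other multipliers are built the same way.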
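The same decomposition works for any constant, not just 2 and 3. The sketch below generalizes it into a loop and checks it against the worked example in FIPS 197, where {57} * {13} = {fe}. The names `xtime` and `gfMul` are hypothetical; `xtime` performs the same operation as mul2 above.

```cpp
#include <cassert>
#include <cstdint>

// Multiply by 2 in GF(2^8); the same operation as mul2 in the article.
std::uint8_t xtime(std::uint8_t x)
{
    return static_cast<std::uint8_t>((x << 1) ^ ((x & 0x80) ? 0x1B : 0x00));
}

// Multiply a by an arbitrary constant k by splitting k into powers of two:
// every set bit of k contributes a * 2^bit, and the contributions are XORed.
std::uint8_t gfMul(std::uint8_t a, std::uint8_t k)
{
    std::uint8_t result = 0;
    while (k != 0)
    {
        if (k & 1)
        {
            result ^= a;   // this power of two is present in k
        }
        a = xtime(a);      // move on to the next power of two
        k >>= 1;
    }
    return result;
}
```

Here 0x13 = 0x10 ^ 0x02 ^ 0x01, so `gfMul(0x57, 0x13)` XORs 0x07, 0xae and 0x57 together. A loop like this is handy for checking the hand-written mul03, mul09, mul0b, mul0d and mul0e helpers.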
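The MixColumns function simply mimics matrix multiplication. Let A be the fixed matrix and B the state matrix.
To encrypt, multiply A by B to get C. Each entry of the first column of C is obtained by multiplying the entries of one row of A with the entries of the first column of B, pair by pair, and adding the products (inside GF(2^8) addition becomes XOR, ⊕). The other columns are computed the same way.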
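Written out, the first column of C looks as follows. The symbols $b_{r,0}$ and $c_{r,0}$ (row r of the first column of B and C) are notation introduced here for illustration; in the code above they correspond to s0..s3 and pOutput[0..3].

```latex
\begin{aligned}
c_{0,0} &= (02 \cdot b_{0,0}) \oplus (03 \cdot b_{1,0}) \oplus b_{2,0} \oplus b_{3,0} \\
c_{1,0} &= b_{0,0} \oplus (02 \cdot b_{1,0}) \oplus (03 \cdot b_{2,0}) \oplus b_{3,0} \\
c_{2,0} &= b_{0,0} \oplus b_{1,0} \oplus (02 \cdot b_{2,0}) \oplus (03 \cdot b_{3,0}) \\
c_{3,0} &= (03 \cdot b_{0,0}) \oplus b_{1,0} \oplus b_{2,0} \oplus (02 \cdot b_{3,0})
\end{aligned}
```

As a worked example, the column (db, 13, 53, 45) gives $c_{0,0}$ = ad ⊕ 35 ⊕ 53 ⊕ 45 = 8e, since 02 · db = ad and 03 · 13 = 35; the full result is (8e, 4d, a1, bc).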
Since there is a forward MixColumns there is also an inverse one: forward for encryption, inverse for decryption. It is the same computation with the matrix replaced by its inverse:
// Same helper as before, repeated so this snippet is self-contained
BYTE mul2(BYTE x)
{
    return(((x >> 7) * 0x1B) ^ (x << 1));
}

BYTE mul0e(BYTE x)
{
    // 0x0e(14) = 2^3 + 2^2 + 2^1
    return(mul2(mul2(mul2(x))) ^ mul2(mul2(x)) ^ mul2(x));
}

BYTE mul0b(BYTE x)
{
    // 0x0b(11) = 2^3 + 2^1 + 1
    return(mul2(mul2(mul2(x))) ^ mul2(x) ^ x);
}

BYTE mul0d(BYTE x)
{
    // 0x0d(13) = 2^3 + 2^2 + 1
    return(mul2(mul2(mul2(x))) ^ mul2(mul2(x)) ^ x);
}

BYTE mul09(BYTE x)
{
    // 0x09(9) = 2^3 + 1
    return(mul2(mul2(mul2(x))) ^ x);
}

/*
 * Inverse MixColumns
 * [0e 0b 0d 09]   [s0 s4 s8  s12]
 * [09 0e 0b 0d] . [s1 s5 s9  s13]
 * [0d 09 0e 0b]   [s2 s6 s10 s14]
 * [0b 0d 09 0e]   [s3 s7 s11 s15]
 */
// pbState and pOutput must not overlap
VOID InvMixColumns(BYTE* pbState, BYTE *pOutput)
{
    for (int i = 0; i < 4; ++i)
    {
        // The four bytes of column i
        BYTE s0 = pbState[i * 4];
        BYTE s1 = pbState[i * 4 + 1];
        BYTE s2 = pbState[i * 4 + 2];
        BYTE s3 = pbState[i * 4 + 3];

        pOutput[i * 4] = mul0e(s0) ^ mul0b(s1) ^ mul0d(s2) ^ mul09(s3);
        pOutput[i * 4 + 1] = mul09(s0) ^ mul0e(s1) ^ mul0b(s2) ^ mul0d(s3);
        pOutput[i * 4 + 2] = mul0d(s0) ^ mul09(s1) ^ mul0e(s2) ^ mul0b(s3);
        pOutput[i * 4 + 3] = mul0b(s0) ^ mul0d(s1) ^ mul09(s2) ^ mul0e(s3);
    }
}
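It is the same computation with the fixed matrix replaced by its inverse.
The first column of the result is computed row by row exactly as for MixColumns, using the inverse matrix; the remaining columns follow the same pattern.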
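To check that the two matrices really undo each other, the sketch below multiplies a single column by each matrix in turn. Both matrices are circulant (every row is the previous row rotated right by one), so one helper can serve both. All names here (`Column`, `gfMul`, `mulCirculant`, `mixColumn`, `invMixColumn`) are made up for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Column = std::array<std::uint8_t, 4>;

// Multiply a by the constant k in GF(2^8), reducing by 0x11B.
std::uint8_t gfMul(std::uint8_t a, std::uint8_t k)
{
    std::uint8_t result = 0;
    while (k != 0)
    {
        if (k & 1)
        {
            result ^= a;
        }
        a = static_cast<std::uint8_t>((a << 1) ^ ((a & 0x80) ? 0x1B : 0x00));
        k >>= 1;
    }
    return result;
}

// Multiply one column by the circulant matrix whose first row is row0.
// Entry [r][c] of such a matrix is row0[(c - r) mod 4].
Column mulCirculant(const Column& row0, const Column& col)
{
    Column out{};
    for (int r = 0; r < 4; ++r)
    {
        for (int c = 0; c < 4; ++c)
        {
            out[r] ^= gfMul(col[c], row0[(c + 4 - r) % 4]);
        }
    }
    return out;
}

const Column kMixRow0    = {0x02, 0x03, 0x01, 0x01};
const Column kInvMixRow0 = {0x0E, 0x0B, 0x0D, 0x09};

Column mixColumn(const Column& col)    { return mulCirculant(kMixRow0, col); }
Column invMixColumn(const Column& col) { return mulCirculant(kInvMixRow0, col); }
```

Applying `mixColumn` to the column (db, 13, 53, 45) gives (8e, 4d, a1, bc), and `invMixColumn` maps that back to the original column.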
The last step is AddRoundKey. It is the simplest one: XOR the state produced by the previous step with the round key:
// AddRoundKey
for (i = 0; i < AES_BLOCK_SIZE; ++i, ++pbRoundKey)
{
    *(pbCipherText + i) ^= *pbRoundKey;
}
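3. Encryption flow
With every step covered, here is the whole process from start to finish.
- The user supplies a 16-byte key and the plaintext K; the plaintext is padded with PKCS#7 to a multiple of 16 bytes
- Key Expansion generates 160 bytes; together with the initial key this gives 11 round keys (176 bytes)
- Initial step (round 0): XOR the 16-byte plaintext block with the initial 16-byte key (AddRoundKey). The result is called the state matrix. This step consists of AddRoundKey only
- Rounds 1 to 9: four steps in order: SubBytes, ShiftRows, MixColumns, AddRoundKey
- Round 10 (the final round): three steps in order: SubBytes, ShiftRows, AddRoundKey. Note that there is no MixColumns here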
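The flow above can be condensed into a short, portable sketch that encrypts a single block. It is not the article's Windows code: all names are hypothetical, the S-Box is computed from its definition only to keep the listing short (real code uses the lookup table), and the key expansion works byte by byte following the rules described in the next section.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Byte  = std::uint8_t;
using Block = std::array<Byte, 16>;    // column-major state
using Keys  = std::array<Byte, 176>;   // 11 round keys of 16 bytes

Byte xtime(Byte x) { return static_cast<Byte>((x << 1) ^ ((x & 0x80) ? 0x1B : 0x00)); }

Byte gfMul(Byte a, Byte k)
{
    Byte r = 0;
    for (; k != 0; k >>= 1, a = xtime(a))
        if (k & 1) r ^= a;
    return r;
}

// S-Box entry from its definition: inverse in GF(2^8), then the affine map.
Byte sbox(Byte x)
{
    Byte inv = 0;
    for (int c = 1; c < 256 && inv == 0; ++c)
        if (gfMul(x, static_cast<Byte>(c)) == 1) inv = static_cast<Byte>(c);
    Byte out = 0x63;
    for (int n = 0; n < 5; ++n)   // inv XOR its left rotations by 1..4 bits
        out ^= static_cast<Byte>((inv << n) | (inv >> (8 - n)));
    return out;
}

Keys expandKey(const Block& key)
{
    Keys w{};
    for (int i = 0; i < 16; ++i) w[i] = key[i];
    Byte rcon = 0x01;
    for (int i = 16; i < 176; i += 4)
    {
        Byte t[4] = { w[i - 4], w[i - 3], w[i - 2], w[i - 1] };   // previous word
        if (i % 16 == 0)   // first word of a round key: apply g()
        {
            const Byte t0 = t[0];
            t[0] = static_cast<Byte>(sbox(t[1]) ^ rcon);
            t[1] = sbox(t[2]);
            t[2] = sbox(t[3]);
            t[3] = sbox(t0);
            rcon = xtime(rcon);
        }
        for (int b = 0; b < 4; ++b) w[i + b] = static_cast<Byte>(w[i + b - 16] ^ t[b]);
    }
    return w;
}

void addRoundKey(Block& s, const Keys& w, int round)
{
    for (int i = 0; i < 16; ++i) s[i] ^= w[16 * round + i];
}

void subBytes(Block& s) { for (Byte& b : s) b = sbox(b); }

void shiftRows(Block& s)
{
    const Block in = s;
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            s[r + 4 * c] = in[r + 4 * ((c + r) % 4)];
}

void mixColumns(Block& s)
{
    for (int c = 0; c < 4; ++c)
    {
        const Byte a0 = s[4 * c], a1 = s[4 * c + 1], a2 = s[4 * c + 2], a3 = s[4 * c + 3];
        s[4 * c]     = static_cast<Byte>(gfMul(a0, 2) ^ gfMul(a1, 3) ^ a2 ^ a3);
        s[4 * c + 1] = static_cast<Byte>(a0 ^ gfMul(a1, 2) ^ gfMul(a2, 3) ^ a3);
        s[4 * c + 2] = static_cast<Byte>(a0 ^ a1 ^ gfMul(a2, 2) ^ gfMul(a3, 3));
        s[4 * c + 3] = static_cast<Byte>(gfMul(a0, 3) ^ a1 ^ a2 ^ gfMul(a3, 2));
    }
}

Block encryptBlock(const Block& plain, const Block& key)
{
    const Keys w = expandKey(key);
    Block s = plain;
    addRoundKey(s, w, 0);                      // initial AddRoundKey
    for (int round = 1; round <= 9; ++round)   // rounds 1-9: all four steps
    {
        subBytes(s);
        shiftRows(s);
        mixColumns(s);
        addRoundKey(s, w, round);
    }
    subBytes(s);                               // round 10: no MixColumns
    shiftRows(s);
    addRoundKey(s, w, 10);
    return s;
}
```

A convenient check is the example vector from FIPS 197 Appendix C.1: key 00 01 02 ... 0f and plaintext 00 11 22 ... ff should encrypt to 69 c4 e0 d8 6a 7b 04 30 d8 cd b7 80 70 b4 c5 5a.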
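4. Key Expansion Algorithm
The algorithm derives multiple round keys from a fixed-length initial key (the seed key), one for every stage of AES. For AES-128 the initial key is 16 bytes and serves as round key 0; encryption needs 10 more round keys, and this algorithm generates them from the initial key.
Write the initial key as k0, k1, k2, ..., k15: 16 bytes, 128 bits.
Split these 16 bytes into four 32-bit DWORDs of 4 bytes each, for example: w[0] = k0 || k1 || k2 || k3
The four DWORDs are written w[0], w[1], w[2], w[3]
Starting from w[4], each new 32-bit word is generated from the previous DWORD (w[3] in the case of w[4]) according to fixed rules
The rules are:
if (i % 4 == 0) w[i] = w[i - 4] ^ g(w[i - 1])
if (i % 4 != 0) w[i] = w[i - 4] ^ w[i - 1]
If that looks confusing, here it is in plain words. A round key is 16 bytes, i.e. four 4-byte words: w[0], w[1], w[2], w[3]. To generate the next round key (16 bytes), start with its first DWORD, w[4]. Here i is 4, which is divisible by 4, so w[4] = w[0] ^ g(w[3]). Continuing the derivation:
4 % 4 == 0, the index is divisible by 4, so g() is used: w[4] = w[0] ^ g(w[3])
5 % 4 != 0, the index is not divisible by 4: w[5] = w[1] ^ w[4]
6 % 4 != 0, the index is not divisible by 4: w[6] = w[2] ^ w[5]
7 % 4 != 0, the index is not divisible by 4: w[7] = w[3] ^ w[6]
To summarize: a new DWORD is always the XOR of the previous DWORD and the DWORD at the same position in the previous round key. If the current index is divisible by 4, the previous DWORD is first passed through g()
For example, w[4] is produced by XORing the previous DWORD, w[3], with w[0], the DWORD at the same position in the previous round key. Because the index 4 is divisible by 4, w[3] goes through g() first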
With the principle covered, here is the code:
// Round constants. The key bytes are read through a DWORD pointer on a
// little-endian CPU (x86/x64/ARM Windows), so the constant sits in the low
// byte of the DWORD: that is the first byte of the word in memory, which is
// the byte FIPS 197 XORs the constant into.
const DWORD dwRoundConst[10] = {
    0x00000001, 0x00000002, 0x00000004, 0x00000008,
    0x00000010, 0x00000020, 0x00000040, 0x00000080,
    0x0000001B, 0x00000036
};

// RotWord: bytes [a0, a1, a2, a3] in memory become [a1, a2, a3, a0].
// On a little-endian CPU this is a numeric rotate right by 8 bits.
DWORD Ror8Bits(DWORD dwWord)
{
    return((dwWord >> 8) | (dwWord << 24));
}

// SubWord: substitute each of the 4 bytes through the S-Box
DWORD SBoxExchange(PBYTE pbSBox, DWORD dwNum)
{
    PBYTE pbPtr = (PBYTE)&dwNum;

    for (int i = 0; i < 4; ++i)
    {
        *(pbPtr + i) = pbSBox[*(pbPtr + i)];
    }

    return(dwNum);
}

// key is 128 bits => 16 bytes
// roundkey is 176 bytes (1 + 10) * 16 = 176 bytes
VOID KeyScheduleAlgo(PBYTE pbKey, PBYTE pbRoundKey)
{
    if (!pbKey || !pbRoundKey)
    {
        return;
    }

    // No.0 key, also the initial key
    for (int i = 0; i < AES_BLOCK_SIZE; ++i)
    {
        pbRoundKey[i] = pbKey[i];
    }

    // View the 176-byte buffer as 44 DWORDs: w[0] .. w[43]
    PDWORD pdwWords = (PDWORD)pbRoundKey;

    for (int i = 4; i < 4 * (AES_TOTAL_ROUNDS + 1); ++i)
    {
        DWORD dwTemp = pdwWords[i - 1];

        if (i % 4 == 0)
        {
            // g(): RotWord, SubWord, then XOR with the round constant
            dwTemp = Ror8Bits(dwTemp);
            dwTemp = SBoxExchange(bSBox, dwTemp);
            dwTemp ^= dwRoundConst[i / 4 - 1];
        }

        pdwWords[i] = pdwWords[i - 4] ^ dwTemp;
    }
}
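Now a closer look at g(), the function applied when the index of the DWORD being generated is divisible by 4. What exactly does it do?
- First it copies w[i-1], the DWORD just before the one being generated, into a temporary DWORD, say wt, and rotates wt left by one byte. Writing the word with its first byte on the left, wt = 0x11223344 becomes 0x22334411. This is a rotation of the bytes in memory order; when the word is held in a 32-bit integer on a little-endian CPU, it corresponds to a numeric right rotation by 8 bits
- Next, wt consists of 4 bytes, and each of them goes through the SubBytes substitution.
- Finally, wt is XORed with the round constant for the round key being generated; the constant only affects the first byte of wt. Ten round keys are generated, and for the first one the round constant at index 0 is used. That completes g()
Once g() has processed w[i-1], the result is XORed with w[i-4], the DWORD at the same position in the previous round key. Only indices that are multiples of 4 are special; for all other indices g() is skipped and w[i-1] is XORed with w[i-4] directly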
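The byte rotation in the first step is easy to get backwards once the four bytes are packed into a 32-bit integer. The sketch below contrasts the two views; it assumes the bytes are loaded in little-endian order, as a DWORD pointer cast does on x86/x64 Windows, and the helper names (`rotWordBytes`, `loadLE`, `rotr8`) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// RotWord on bytes in memory order: [a0, a1, a2, a3] -> [a1, a2, a3, a0].
void rotWordBytes(std::uint8_t w[4])
{
    const std::uint8_t first = w[0];
    w[0] = w[1];
    w[1] = w[2];
    w[2] = w[3];
    w[3] = first;
}

// Pack 4 bytes into an integer the way a little-endian CPU loads them:
// the first byte in memory becomes the lowest 8 bits.
std::uint32_t loadLE(const std::uint8_t b[4])
{
    return static_cast<std::uint32_t>(b[0])
         | (static_cast<std::uint32_t>(b[1]) << 8)
         | (static_cast<std::uint32_t>(b[2]) << 16)
         | (static_cast<std::uint32_t>(b[3]) << 24);
}

// Numeric rotate right by 8 bits.
std::uint32_t rotr8(std::uint32_t v)
{
    return (v >> 8) | (v << 24);
}
```

With the word 09 cf 4f 3c from the FIPS 197 key expansion example, the little-endian integer is 0x3c4fcf09; rotating it right by 8 bits gives 0x093c4fcf, which is exactly the bytes cf 4f 3c 09 in memory. For the same reason, the round constant has to be XORed into the lowest 8 bits of such an integer to land on the first byte.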
Here is the complete code:
#include <windows.h>
#include <cstdio>
#include <cstdlib>
#include <cstring>

#define AES_BLOCK_SIZE (16)
#define AES_TOTAL_ROUNDS (10)
#define SBOX_SIZE (256)

BYTE bSBox[SBOX_SIZE] = {
    0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5, 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76,
    0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0, 0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0,
    0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc, 0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15,
    0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a, 0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75,
    0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0, 0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84,
    0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b, 0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf,
    0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85, 0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8,
    0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5, 0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2,
    0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17, 0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73,
    0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88, 0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb,
    0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c, 0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79,
    0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9, 0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08,
    0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6, 0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a,
    0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e, 0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e,
    0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94, 0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf,
    0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68, 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16
};

BYTE bInvSBox[SBOX_SIZE] = {
    0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38, 0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb,
    0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87, 0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb,
    0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d, 0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e,
    0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2, 0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25,
    0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16, 0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92,
    0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda, 0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84,
    0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a, 0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06,
    0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02, 0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b,
    0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea, 0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73,
    0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85, 0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e,
    0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89, 0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b,
    0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20, 0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4,
    0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31, 0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f,
    0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d, 0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef,
    0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0, 0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61,
    0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26, 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d
};

// Round constants. The key bytes are read through a DWORD pointer on a
// little-endian CPU, so the constant sits in the low byte of the DWORD,
// which is the first byte of the word in memory (as FIPS 197 requires).
const DWORD dwRoundConst[AES_TOTAL_ROUNDS] = {
    0x00000001, 0x00000002, 0x00000004, 0x00000008,
    0x00000010, 0x00000020, 0x00000040, 0x00000080,
    0x0000001B, 0x00000036
};

// RotWord: bytes [a0, a1, a2, a3] in memory become [a1, a2, a3, a0].
// On a little-endian CPU this is a numeric rotate right by 8 bits.
DWORD Ror8Bits(DWORD dwWord)
{
    return((dwWord >> 8) | (dwWord << 24));
}

// SubWord: substitute each of the 4 bytes through the S-Box
DWORD SBoxExchange(PBYTE pbSBox, DWORD dwNum)
{
    PBYTE pbPtr = (PBYTE)&dwNum;

    for (int i = 0; i < 4; ++i)
    {
        *(pbPtr + i) = pbSBox[*(pbPtr + i)];
    }

    return(dwNum);
}

// key is 128 bits => 16 bytes
// roundkey is 176 bytes (1 + 10) * 16 = 176 bytes
VOID KeyScheduleAlgo(PBYTE pbKey, PBYTE pbRoundKey)
{
    if (!pbKey || !pbRoundKey)
    {
        return;
    }

    // No.0 key, also the initial key
    for (int i = 0; i < AES_BLOCK_SIZE; ++i)
    {
        pbRoundKey[i] = pbKey[i];
    }

    // View the 176-byte buffer as 44 DWORDs: w[0] .. w[43]
    PDWORD pdwWords = (PDWORD)pbRoundKey;

    for (int i = 4; i < 4 * (AES_TOTAL_ROUNDS + 1); ++i)
    {
        DWORD dwTemp = pdwWords[i - 1];

        if (i % 4 == 0)
        {
            // g(): rotate, substitute, then XOR with the round constant
            dwTemp = Ror8Bits(dwTemp);
            dwTemp = SBoxExchange(bSBox, dwTemp);
            dwTemp ^= dwRoundConst[i / 4 - 1];
        }

        pdwWords[i] = pdwWords[i - 4] ^ dwTemp;
    }
}

// pState points to a 16-byte state stored in column-major order
VOID ShiftRows(BYTE* pState)
{
    BYTE bTmp = 0;

    // Row 1 (indices 1, 5, 9, 13): rotate left by 1 byte
    bTmp = *(pState + 1);
    *(pState + 1) = *(pState + 5);
    *(pState + 5) = *(pState + 9);
    *(pState + 9) = *(pState + 13);
    *(pState + 13) = bTmp;

    // Row 2 (indices 2, 6, 10, 14): rotate left by 2 bytes
    bTmp = *(pState + 2);
    *(pState + 2) = *(pState + 10);
    *(pState + 10) = bTmp;
    bTmp = *(pState + 6);
    *(pState + 6) = *(pState + 14);
    *(pState + 14) = bTmp;

    // Row 3 (indices 3, 7, 11, 15): rotate left by 3 bytes
    bTmp = *(pState + 3);
    *(pState + 3) = *(pState + 15);
    *(pState + 15) = *(pState + 11);
    *(pState + 11) = *(pState + 7);
    *(pState + 7) = bTmp;
}

VOID InvShiftRow(BYTE *pState)
{
    if (!pState)
    {
        return;
    }

    BYTE bTmp = 0;

    // Row 1 (indices 1, 5, 9, 13): rotate right by 1 byte
    bTmp = pState[13];
    pState[13] = pState[9];
    pState[9] = pState[5];
    pState[5] = pState[1];
    pState[1] = bTmp;

    // Row 2 (indices 2, 6, 10, 14): rotate right by 2 bytes
    bTmp = pState[14];
    pState[14] = pState[6];
    pState[6] = bTmp;
    bTmp = pState[10];
    pState[10] = pState[2];
    pState[2] = bTmp;

    // Row 3 (indices 3, 7, 11, 15): rotate right by 3 bytes
    bTmp = pState[15];
    pState[15] = pState[3];
    pState[3] = pState[7];
    pState[7] = pState[11];
    pState[11] = bTmp;
}

// Multiply by 2 in GF(2^8)
BYTE mul2(BYTE x)
{
    return(((x >> 7) * 0x1B) ^ (x << 1));
}

BYTE mul0e(BYTE x)
{
    // 0x0e(14) = 2^3 + 2^2 + 2^1
    return(mul2(mul2(mul2(x))) ^ mul2(mul2(x)) ^ mul2(x));
}

BYTE mul0b(BYTE x)
{
    // 0x0b(11) = 2^3 + 2^1 + 1
    return(mul2(mul2(mul2(x))) ^ mul2(x) ^ x);
}

BYTE mul0d(BYTE x)
{
    // 0x0d(13) = 2^3 + 2^2 + 1
    return(mul2(mul2(mul2(x))) ^ mul2(mul2(x)) ^ x);
}

BYTE mul09(BYTE x)
{
    // 0x09(9) = 2^3 + 1
    return(mul2(mul2(mul2(x))) ^ x);
}

/*
 * Inverse MixColumns
 * [0e 0b 0d 09]   [s0 s4 s8  s12]
 * [09 0e 0b 0d] . [s1 s5 s9  s13]
 * [0d 09 0e 0b]   [s2 s6 s10 s14]
 * [0b 0d 09 0e]   [s3 s7 s11 s15]
 */
VOID InvMixColumns(BYTE* pbState, BYTE *pOutput)
{
    for (int i = 0; i < 4; ++i)
    {
        BYTE s0 = pbState[i * 4];
        BYTE s1 = pbState[i * 4 + 1];
        BYTE s2 = pbState[i * 4 + 2];
        BYTE s3 = pbState[i * 4 + 3];

        pOutput[i * 4] = mul0e(s0) ^ mul0b(s1) ^ mul0d(s2) ^ mul09(s3);
        pOutput[i * 4 + 1] = mul09(s0) ^ mul0e(s1) ^ mul0b(s2) ^ mul0d(s3);
        pOutput[i * 4 + 2] = mul0d(s0) ^ mul09(s1) ^ mul0e(s2) ^ mul0b(s3);
        pOutput[i * 4 + 3] = mul0b(s0) ^ mul0d(s1) ^ mul09(s2) ^ mul0e(s3);
    }
}

BYTE mul03(BYTE x)
{
    // 0x03(3) = 2^1 ^ 1
    return(mul2(x) ^ x);
}

/*
 * MixColumns
 * [02 03 01 01]   [s0 s4 s8  s12]
 * [01 02 03 01] . [s1 s5 s9  s13]
 * [01 01 02 03]   [s2 s6 s10 s14]
 * [03 01 01 02]   [s3 s7 s11 s15]
 */
VOID MixColumns(BYTE* pbState, BYTE* pOutput)
{
    for (int i = 0; i < 4; ++i)
    {
        BYTE s0 = pbState[i * 4];
        BYTE s1 = pbState[i * 4 + 1];
        BYTE s2 = pbState[i * 4 + 2];
        BYTE s3 = pbState[i * 4 + 3];

        pOutput[i * 4] = mul2(s0) ^ mul03(s1) ^ s2 ^ s3;
        pOutput[i * 4 + 1] = s0 ^ mul2(s1) ^ mul03(s2) ^ s3;
        pOutput[i * 4 + 2] = s0 ^ s1 ^ mul2(s2) ^ mul03(s3);
        pOutput[i * 4 + 3] = mul03(s0) ^ s1 ^ s2 ^ mul2(s3);
    }
}

// Decrypts one 16-byte block. pbRoundKey points to the 176-byte expanded key.
VOID AES128Decrypt(const BYTE* pbRoundKey, const BYTE* pbCipherText, BYTE* pbPlainText)
{
    BYTE i = 0, j = 0;
    BYTE tmp[AES_BLOCK_SIZE] = { 0 };

    // 176 bytes in total; point at the beginning of the last 16 bytes
    pbRoundKey += 160;

    // Undo the final encryption round: AddRoundKey first
    for (i = 0; i < AES_BLOCK_SIZE; ++i)
    {
        *(pbPlainText + i) = *(pbCipherText + i) ^ *(pbRoundKey + i);
    }
    pbRoundKey -= 16;

    // Inverse ShiftRows
    InvShiftRow(pbPlainText);

    // Inverse SubBytes
    for (i = 0; i < AES_BLOCK_SIZE; ++i)
    {
        *(pbPlainText + i) = bInvSBox[*(pbPlainText + i)];
    }

    // 9 rounds
    for (j = 1; j < AES_TOTAL_ROUNDS; ++j)
    {
        // AddRoundKey
        for (i = 0; i < AES_BLOCK_SIZE; ++i)
        {
            *(tmp + i) = *(pbPlainText + i) ^ *(pbRoundKey + i);
        }

        // Inverse MixColumns
        InvMixColumns(tmp, pbPlainText);

        // Inverse ShiftRows
        InvShiftRow(pbPlainText);

        // Inverse SubBytes
        for (i = 0; i < AES_BLOCK_SIZE; ++i)
        {
            *(pbPlainText + i) = bInvSBox[*(pbPlainText + i)];
        }

        pbRoundKey -= 16;
    }

    // last AddRoundKey (round key 0)
    for (i = 0; i < AES_BLOCK_SIZE; ++i)
    {
        *(pbPlainText + i) ^= *(pbRoundKey + i);
    }
}

// Encrypts one 16-byte block. pbRoundKey points to the 176-byte expanded key.
VOID AES128Encrypt(const BYTE* pbRoundKey, const BYTE* pbPlainText, BYTE* pbCipherText)
{
    int i = 0, j = 0;
    BYTE bTmp[AES_BLOCK_SIZE] = { 0 };

    // first AddRoundKey
    for (i = 0; i < AES_BLOCK_SIZE; ++i, ++pbRoundKey)
    {
        *(pbCipherText + i) = *(pbPlainText + i) ^ *pbRoundKey;
    }

    // rounds 1 to 9
    for (j = 1; j < AES_TOTAL_ROUNDS; ++j)
    {
        // SubBytes
        for (i = 0; i < AES_BLOCK_SIZE; ++i)
        {
            *(bTmp + i) = bSBox[*(pbCipherText + i)];
        }

        // ShiftRows
        ShiftRows(bTmp);

        // MixColumns
        MixColumns(bTmp, pbCipherText);

        // AddRoundKey
        for (i = 0; i < AES_BLOCK_SIZE; ++i, ++pbRoundKey)
        {
            *(pbCipherText + i) ^= *pbRoundKey;
        }
    }

    // last round: SubBytes, ShiftRows, AddRoundKey (no MixColumns)
    for (i = 0; i < AES_BLOCK_SIZE; ++i)
    {
        *(pbCipherText + i) = bSBox[*(pbCipherText + i)];
    }

    ShiftRows(pbCipherText);

    for (i = 0; i < AES_BLOCK_SIZE; ++i, ++pbRoundKey)
    {
        *(pbCipherText + i) ^= *pbRoundKey;
    }
}

// PKCS#7 padding: append n bytes of value n, where n = 16 - (len % 16).
// When len is already a multiple of 16, a full block of 16 bytes is appended.
// The caller owns the returned buffer and must free() it.
PBYTE PKCSPadding(PBYTE pbContent, DWORD dwContentLen, PDWORD pdwPaddingLen)
{
    if (!pbContent || !dwContentLen)
    {
        return(nullptr);
    }

    DWORD dwPaddedLen = (dwContentLen / AES_BLOCK_SIZE + 1) * AES_BLOCK_SIZE;
    BYTE bPadValue = (BYTE)(dwPaddedLen - dwContentLen);    // 1..16

    if (pdwPaddingLen)
    {
        *pdwPaddingLen = dwPaddedLen;
    }

    PBYTE pbNewBuf = (PBYTE)::malloc(dwPaddedLen);
    if (!pbNewBuf)
    {
        return(nullptr);
    }

    ::memcpy_s(pbNewBuf, dwPaddedLen, pbContent, dwContentLen);

    for (DWORD dwIdx = dwContentLen; dwIdx < dwPaddedLen; ++dwIdx)
    {
        pbNewBuf[dwIdx] = bPadValue;
    }

    return(pbNewBuf);
}

// Encrypts dwBufLen bytes in place, block by block (ECB).
// dwBufLen must be a multiple of AES_BLOCK_SIZE.
BOOL AES128Enc(PBYTE pbBuf, DWORD dwBufLen, PBYTE pbScheduleKey)
{
    if (!pbBuf || !dwBufLen || (dwBufLen % AES_BLOCK_SIZE) || !pbScheduleKey)
    {
        return(FALSE);
    }

    BYTE bCipherBuf[AES_BLOCK_SIZE];
    DWORD dwBufCnt = dwBufLen / AES_BLOCK_SIZE;

    for (DWORD i = 0; i < dwBufCnt; ++i)
    {
        AES128Encrypt(pbScheduleKey, (pbBuf + i * AES_BLOCK_SIZE), bCipherBuf);
        ::memcpy_s(pbBuf + i * AES_BLOCK_SIZE, AES_BLOCK_SIZE, bCipherBuf, AES_BLOCK_SIZE);
    }

    return(TRUE);
}

// Decrypts dwBufLen bytes in place, block by block (ECB).
// dwBufLen must be a multiple of AES_BLOCK_SIZE.
BOOL AES128Dec(PBYTE pbBuf, DWORD dwBufLen, PBYTE pbScheduleKey)
{
    if (!pbBuf || !dwBufLen || (dwBufLen % AES_BLOCK_SIZE) || !pbScheduleKey)
    {
        return(FALSE);
    }

    BYTE bPlainBuf[AES_BLOCK_SIZE];
    DWORD dwBufCnt = dwBufLen / AES_BLOCK_SIZE;

    for (DWORD i = 0; i < dwBufCnt; ++i)
    {
        AES128Decrypt(pbScheduleKey, (pbBuf + i * AES_BLOCK_SIZE), bPlainBuf);
        ::memcpy_s(pbBuf + i * AES_BLOCK_SIZE, AES_BLOCK_SIZE, bPlainBuf, AES_BLOCK_SIZE);
    }

    return(TRUE);
}

VOID PrintHex(PBYTE pbContent, DWORD dwContentLen)
{
    for (DWORD i = 0; i < dwContentLen; ++i)
    {
        printf("0x%02X ", pbContent[i]);
    }
}

int main()
{
    BYTE bScheduleKey[176] = { 0 };
    BYTE bInitialKey[16] = { 0x19, 0x84, 0x5C, 0xEB, 0x72, 0x90, 0x4F, 0x36, 0xDA, 0xC2, 0x0D, 0x7B, 0xA8, 0x53, 0xE6, 0x31 };
    CHAR szContent[] = "Do you wanna use AES-128 to encrypt your traffic? Very good to encrypt";
    DWORD dwPaddingLen = 0;

    // sizeof includes the terminating NUL, so the decrypted buffer can be printed with %s
    PBYTE pbNewBuf = PKCSPadding((PBYTE)szContent, sizeof(szContent), &dwPaddingLen);
    if (!pbNewBuf)
    {
        return(-1);
    }

    // Generate round keys
    KeyScheduleAlgo(bInitialKey, bScheduleKey);
    printf("Key: \r\n");
    PrintHex(bScheduleKey, sizeof(bScheduleKey));
    printf("\r\n\r\n");

    // Encrypt plain text (in place)
    AES128Enc(pbNewBuf, dwPaddingLen, bScheduleKey);
    printf("Encrypted: \r\n");
    PrintHex(pbNewBuf, dwPaddingLen);    // print every ciphertext block, padding included
    printf("\r\n\r\n");

    // Decrypt cipher text (in place)
    AES128Dec(pbNewBuf, dwPaddingLen, bScheduleKey);
    printf("Decrypted: \r\n");
    printf("%s\r\n", pbNewBuf);

    free(pbNewBuf);
    system("pause");
    return(0);
}
(End)