Image Hashing: QDCT

Background of the field

Relevant properties

The FQDCT and IQDCT formulas below come from the paper "Partial Encryption of Color Image Using Quaternion Discrete Cosine Transform":
$FQDCT_q(p,s) = \alpha(p)\alpha(s)\sum_{x=0}^{X-1}\sum_{y=0}^{Y-1}\mu_q f_q(x,y)N(p,s,x,y)\\ IQDCT_q(x,y) = -\sum_{p=0}^{X-1}\sum_{s=0}^{Y-1}\alpha(p)\alpha(s)\mu_q C_q(p,s)N(p,s,x,y)\\ \mu_q = \frac{1}{\sqrt{3}}i+\frac{1}{\sqrt{3}}j+\frac{1}{\sqrt{3}}k \text{ (as given in the paper)}\\ N(p,s,x,y) = \cos\left(\frac{\pi(2x+1)p}{2X}\right)\cos\left(\frac{\pi(2y+1)s}{2Y}\right)\\ \alpha(p) = \begin{cases} \frac{1}{\sqrt{X}}, & p=0\\ \sqrt{\frac{2}{X}}, & p\neq0 \end{cases} \quad \alpha(s) = \begin{cases} \frac{1}{\sqrt{Y}}, & s=0\\ \sqrt{\frac{2}{Y}}, & s\neq0 \end{cases}\\ \text{where } C_q=FQDCT_q(f_q(x,y))\\ C_q(p,s) = C_r(p,s)+C_i(p,s)i+C_j(p,s)j+C_k(p,s)k \text{ (the real-valued coefficients of the imaginary parts are given in Equation 13 of the paper)}\\ \text{These expansions follow from: } ij=-ji=k,\ jk=-kj=i,\ ki=-ik=j$
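The inverse transform reuses the same kernel as the forward one because the scaled cosine basis is orthonormal. The following is a minimal sketch of that property in pure Python, not code from the paper. It assumes the second cosine depends on $s$ and $Y$, and that $\alpha(s)$ is normalized by $Y$. The helper names `alpha`, `kernel`, and `basis_inner_product` are made up for this illustration.

```python
import math


def alpha(index: int, size: int) -> float:
    """Normalization factor: 1/sqrt(size) for index 0, sqrt(2/size) otherwise."""
    return math.sqrt(1.0 / size) if index == 0 else math.sqrt(2.0 / size)


def kernel(p: int, s: int, x: int, y: int, X: int, Y: int) -> float:
    """Real cosine kernel N(p, s, x, y) for an X-by-Y block."""
    return (math.cos(math.pi * (2 * x + 1) * p / (2 * X))
            * math.cos(math.pi * (2 * y + 1) * s / (2 * Y)))


def basis_inner_product(p1, s1, p2, s2, X, Y):
    """Inner product of two scaled basis images, summed over the whole block."""
    total = 0.0
    for x in range(X):
        for y in range(Y):
            first = alpha(p1, X) * alpha(s1, Y) * kernel(p1, s1, x, y, X, Y)
            second = alpha(p2, X) * alpha(s2, Y) * kernel(p2, s2, x, y, X, Y)
            total += first * second
    return total


X, Y = 4, 4
same = basis_inner_product(1, 2, 1, 2, X, Y)       # a basis image with itself
different = basis_inner_product(1, 2, 3, 0, X, Y)  # two distinct basis images
print(abs(same - 1.0) < 1e-9, abs(different) < 1e-9)
```

A basis image paired with itself gives 1 and two distinct basis images give 0, up to floating-point error. Combined with $\mu_q\mu_q=-1$, this is why the inverse formula carries a leading minus sign.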

Paper information

Author: Xiaomei Xing
Journal: KSII Transactions on Internet and Information Systems (Zone 3)
Title: A Novel Perceptual Hashing for Color Images Using a Full Quaternion Representation

Purpose, experimental steps, and conclusion

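Purpose: use the QDCT to generate image hashes. The main reason for using quaternions is to fuse the information from multiple channels so that richer features are extracted.
Experimental steps:
- Preprocessing: bilinear interpolation (resize to 512 * 512), then Gaussian low-pass filtering (3 * 3 kernel).
- Feature extraction: split the image into blocks (32 * 32). Apply the QDCT to each block to obtain a quaternion matrix of the same size, i.e., one real coefficient and three imaginary coefficients per position. The final feature matrix of each block is the quaternion amplitude (the square root of the sum of squares of the real coefficient and the three imaginary coefficients).
- Hash generation: compute the $L_2$ norm (i.e., the distance between vectors) over elements 2 to 33 of the rows and columns of each block's feature matrix, which yields 256 distance values. The hash of each image is then computed with the formula below.
  $h_i = \begin{cases} 0, & d_i < T\\ 1, & \text{otherwise} \end{cases}\\ \text{where } T \text{ is the median of the sorted } d \text{ values}$
- Image similarity: the Hamming distance between the hashes of two images decides whether they are similar. A distance below the threshold means similar; otherwise they are not similar.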
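The last two steps above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions rather than the paper's implementation: the distance values are made-up toy numbers, the helper names `binarize`, `hamming_distance`, and `is_similar` are hypothetical, and the similarity threshold of 3 is arbitrary.

```python
from statistics import median


def binarize(distances):
    """Map each distance to 0 if it is below the median T, otherwise to 1."""
    threshold = median(distances)
    return [0 if d < threshold else 1 for d in distances]


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def is_similar(hash_a, hash_b, max_distance):
    """Two images count as similar when the distance is below the threshold."""
    return hamming_distance(hash_a, hash_b) < max_distance


# Toy distance values for two images (made up for this example).
d_original = [0.9, 2.4, 1.1, 3.0, 0.2, 1.8]
d_modified = [1.0, 2.2, 1.7, 3.1, 0.3, 1.2]

h_original = binarize(d_original)  # [0, 1, 0, 1, 0, 1]
h_modified = binarize(d_modified)  # [0, 1, 1, 1, 0, 0]
print(hamming_distance(h_original, h_modified))          # 2
print(is_similar(h_original, h_modified, max_distance=3))  # True
```

Because the threshold is the median, roughly half of the bits are 0 and half are 1, which keeps the hash balanced regardless of the absolute scale of the distances.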
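Conclusion:

Using quaternions effectively captures the information in every channel, which substantially improves both the robustness and the uniqueness of the image hash.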

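Here is my own explanation of how quaternions relate to the QDCT and QDFT. As I understand it, every pixel can be represented by a quaternion $f_q$, and after blocking, each block can be viewed as a quaternion matrix. Applying the DFT or DCT to a block yields another quaternion matrix $C_q$.
$f_q(x,y) = f_{blk}+f_r(x,y)i+f_g(x,y)j+f_b(x,y)k\\ C_q(p,s) = C_r(p,s)+C_i(p,s)i+C_j(p,s)j+C_k(p,s)k$
What we ultimately need is $C_q$. How its four coefficients are obtained is described in detail in the paper cited at the beginning of this article: it comes down to adding and subtracting the three channels and then applying the DCT or DFT. Seen this way, the QDFT and QDCT are not so mysterious after all.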
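To make the "adding and subtracting the three channels" remark concrete, here is a small sketch that multiplies a single pixel by $\mu_q$ using the Hamilton product and compares the result with the closed form $\mu_q f_q = -\frac{r+g+b}{\sqrt{3}} + \frac{b-g+f_{blk}}{\sqrt{3}}i + \frac{r-b+f_{blk}}{\sqrt{3}}j + \frac{g-r+f_{blk}}{\sqrt{3}}k$. The function names and pixel values are invented for this illustration. Because the cosine kernel is real and the DCT is linear, applying an ordinary DCT to each of these four parts over a block gives the four coefficient matrices of $C_q$.

```python
import math


def hamilton(a, b):
    """Hamilton product of two quaternions given as (real, i, j, k) tuples."""
    ar, ai, aj, ak = a
    br, bi, bj, bk = b
    return (
        ar * br - ai * bi - aj * bj - ak * bk,
        ar * bi + ai * br + aj * bk - ak * bj,
        ar * bj - ai * bk + aj * br + ak * bi,
        ar * bk + ai * bj - aj * bi + ak * br,
    )


INV_SQRT3 = 1.0 / math.sqrt(3.0)
MU = (0.0, INV_SQRT3, INV_SQRT3, INV_SQRT3)  # unit pure quaternion mu_q


def premultiply_by_mu(f_blk, r, g, b):
    """Compute mu_q * f_q for one pixel with the full quaternion product."""
    return hamilton(MU, (f_blk, r, g, b))


def closed_form(f_blk, r, g, b):
    """The same product written as channel sums and differences."""
    return (
        -(r + g + b) * INV_SQRT3,
        (b - g + f_blk) * INV_SQRT3,
        (r - b + f_blk) * INV_SQRT3,
        (g - r + f_blk) * INV_SQRT3,
    )


pixel = (5.0, 120.0, 64.0, 200.0)  # (f_blk, r, g, b), arbitrary example values
product = premultiply_by_mu(*pixel)
expected = closed_form(*pixel)
print(all(math.isclose(u, v, abs_tol=1e-9) for u, v in zip(product, expected)))
```

Setting `f_blk` to 0 recovers the pure-quaternion case, where the imaginary parts reduce to plain channel differences.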

import cv2
import imageio
import numpy as np


def image_hash(img_path):
    """Compute the binary hash of the image stored at img_path."""
    img = processing(img_path)
    C_r_list = image_feature(img)
    h_i = gen_hashing(C_r_list)
    return h_i


def center_crop_resize(img):
    """Crop the largest centered square and resize it to 512 x 512 (bilinear)."""
    x = img.shape[0] // 2  # half of the height
    y = img.shape[1] // 2  # half of the width
    Min = min(x, y)
    cropped_image = img[x - Min:x + Min, y - Min:y + Min]
    return cv2.resize(cropped_image, (512, 512), interpolation=cv2.INTER_LINEAR)


def processing(img_path):
    """
    input: path of the image
    output: preprocessed image of shape (512, 512, 3), converted to YCrCb
    """
    img = cv2.imread(img_path)
    if img is None:
        # cv2.imread cannot decode some formats (for example GIF), so fall
        # back to imageio and keep the first frame and its first 3 channels.
        frames = imageio.mimread(img_path)
        img = np.ascontiguousarray(np.array(frames[0])[:, :, 0:3])
        # imageio returns RGB while the rest of the pipeline expects BGR.
        img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
    img = center_crop_resize(img)
    # out = cv2.GaussianBlur(img, (3, 3), 1.3)  # OpenCV's built-in Gaussian blur
    kernel = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 16
    out = cv2.filter2D(img, -1, kernel=kernel)  # 2-D filtering with a 3 x 3 kernel
    out = cv2.cvtColor(out, cv2.COLOR_BGR2YCrCb)
    return out


def image_feature(img):
    """
    img: array of shape (512, 512, 3)
    return: array of shape (x, 64, 64), one amplitude matrix per block
    """
    C_r_list = []
    for i in range(0, 512, 64):
        for j in range(0, 512, 64):
            image_block = img[i:i + 64, j:j + 64, :]
            # The real part and the three imaginary coefficients are available here.
            C_r, C_i, C_j, C_k = QDCT(image_block)
            C_r_list.append(np.sqrt(C_r**2 + C_i**2 + C_j**2 + C_k**2))
            # C_r_list.append(cv2.dct(np.float32(image_block[:, :, 0])))
    return np.array(C_r_list)


def gen_hashing(feature_matrix):
    """
    Generate the image hash. Unlike the original paper, each row of my P and Q
    matrices represents one image block.
    input: array of shape (x, 64, 64)
    output: array of shape (x,)
    """
    P_matrix = []
    Q_matrix = []
    for block in feature_matrix:
        block = np.array(block)
        P_matrix.append(block[0, 1:33])  # elements 2 to 33 of the first row
        Q_matrix.append(block[1:33, 0])  # elements 2 to 33 of the first column
    P_matrix = np.array(P_matrix)
    Q_matrix = np.array(Q_matrix)
    # Standardize every column across blocks (sample standard deviation).
    P_matrix_1 = (P_matrix - np.mean(P_matrix, axis=0)) / np.std(P_matrix, axis=0, ddof=1)
    Q_matrix_1 = (Q_matrix - np.mean(Q_matrix, axis=0)) / np.std(Q_matrix, axis=0, ddof=1)
    d_i = np.sqrt(np.sum((P_matrix_1 - Q_matrix_1)**2, axis=1))
    median = np.median(d_i)
    h_i = np.where(d_i < median, 0, 1)
    return h_i


def QDCT(img):
    """
    img: array of shape (64, 64, 3)
    return: the real part C_r and the three imaginary coefficients C_i, C_j, C_k
    """
    Y = cv2.cvtColor(img, cv2.COLOR_RGB2YUV)[:, :, 0]
    V_blk = np.sum((Y - np.mean(Y))**2) / (img.shape[0]**2)  # luminance variance
    # Cast to float before combining channels so uint8 values cannot wrap around.
    c0, c1, c2 = (img[:, :, n].astype(np.float32) for n in range(3))
    scale = 1 / np.sqrt(3)
    C_r = cv2.dct(np.float32((c0 + c1 + c2) * -scale))
    C_i = cv2.dct(np.float32((c2 - c1 + V_blk) * scale))
    C_j = cv2.dct(np.float32((c0 - c2 + V_blk) * scale))
    C_k = cv2.dct(np.float32((c1 - c0 + V_blk) * scale))
    # Without the V_blk offset: C_i = DCT(c2 - c1) / sqrt(3), likewise C_j and C_k.
    return C_r, C_i, C_j, C_k


def dist_img(h1, h2):
    """Hamming distance between two binary hashes."""
    return np.sum(np.abs(h1 - h2))