这篇文章比较长,请耐心看
- 空间域滤波基础
- 线性滤波
- 可分离滤波器核
- 空间域滤波和频率域滤波的一些重要比较
- 如何构建空间滤波器
- 第一种卷积方法(公式法)
- 第二种卷积的方法(可分离核)
- 第三种方法(img2col)
- 这是分离核与公式法的结果对比
- 下面是各种卷积的运算时间的比较
- 书上的一些习题
先上基础,后面有许多的程序代码
空间域滤波基础
滤波是指通过、修改或抑制图像的规定频率分量
-
通过低频的滤波器称为低通滤波器
-
通过模糊图像来平滑图像
-
通过高频的滤波器称为高通滤波器
-
线性空间滤波器
-
空间滤波通过把每个像素的值替换为该像素及其邻域的函数值来修改图像,如果 对图像像素执行的运算是线性的,则称为线性空间滤波器
空间域是通过卷积运算而达到滤波的效果。 而频率域滤波,是通过傅里叶变换到频率域,利用乘法运行达到滤波的效果
- 卷积是空间域滤波的基础,它等效于频率域中的乘法,反之亦然;
- 空间域中振幅为A的冲激,是频率域中值为A的一个常数,反之亦然
线性滤波
线性空间滤波器在图像fff和滤波器核www之间执行乘积和运算。
- 核
- 是一个阵列,其大小定义了运算的邻域
- 也称为滤波器核,核,模板,窗口
滤波器的响应g(x,y)g(x,y)g(x,y)是核系数和和核所覆盖图像像素的乘积之和:
g(x,y)=w(−1,−1)f(x−1,y−1)+w(−1,0)f(x−1,y)+⋯+w(0,0)f(x,y)+⋯+w(1,1)f(x+1,y+1)(3.30)g(x, y) = w(-1, -1)f(x-1, y-1) + w(-1, 0)f(x-1, y) + \dots + w(0,0)f(x,y)+ \dots + w(1, 1)f(x+1, y+1)\tag{3.30}g(x,y)=w(−1,−1)f(x−1,y−1)+w(−1,0)f(x−1,y)+⋯+w(0,0)f(x,y)+⋯+w(1,1)f(x+1,y+1)(3.30)
坐标xxx和yyy变化时,核的中心逐个像素地移动,并在移动过程中生产滤波后的图像ggg
m×nm \times{n}m×n的核对大小为M×NM \times{N}M×N的图像的线性空间滤波可以表示为:
g(x,y)=∑s=−aa∑t=−bbw(s,t)f(x+s,y+t)(3.31)g(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s,t)f(x+s, y+t) \tag{3.31}g(x,y)=s=−a∑at=−b∑bw(s,t)f(x+s,y+t)(3.31)
m=2a+1,n=2b+1m = 2a + 1, n = 2b + 1m=2a+1,n=2b+1
离散冲激强度(振幅)
δ(x−x0,y−y0)={A,x0=xandy0=y0,others(3.33)\delta(x - x_0, y - y_0) = \begin{cases} A, & x_0 = x and y_0 = y \\0, & \text{others} \end{cases} \tag{3.33}δ(x−x0,y−y0)={A,0,x0=xandy0=yothers(3.33)
式(3.31)(相关)重写如下:
(w⋆f)(x,y)=∑s=−aa∑t=−bbw(s,t)f(x+s,y+t)(3.34)(w\star{f})(x,y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s,t)f(x+s, y+t) \tag{3.34}(w⋆f)(x,y)=s=−a∑at=−b∑bw(s,t)f(x+s,y+t)(3.34)
注:式中的黑色星,在书上是空星星,LaTex找了半天也没有找到这个字符对应的。
- 卷积的定义:
(w⋆f)(x,y)=∑s=−aa∑t=−bbw(s,t)f(x−s,y−t)(3.35)(w\star{f})(x,y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s,t)f(x - s, y - t) \tag{3.35}(w⋆f)(x,y)=s=−a∑at=−b∑bw(s,t)f(x−s,y−t)(3.35)
线性空间滤波和空间卷积是同义的
- 卷积和相关的一些基本性质
性质 | 卷积 | 相关 |
---|---|---|
交换律 | f⋆g=g⋆ff \star{g} = g \star{f}f⋆g=g⋆f | - |
结合律 | f⋆(g⋆h)=(f⋆g)⋆hf \star{(g \star{h})} = (f \star{g}) \star{h}f⋆(g⋆h)=(f⋆g)⋆h | - |
分配律 | f⋆(g+h)=(f⋆g)+(f⋆h)f \star{(g + h)} = (f \star{g}) + (f \star{h})f⋆(g+h)=(f⋆g)+(f⋆h) | f⋆(g+h)=(f⋆g)+(f⋆h)f \star{(g + h)} = (f \star{g}) + (f \star{h})f⋆(g+h)=(f⋆g)+(f⋆h) |
图像有时是被顺序地、分段地滤波(即卷积)的,每个阶段会使用不同的核,对于QQQ个阶段同的滤波,图像fff首先用核w1w_1w1滤波,结果再用核w2w_2w2滤波,以此类推,由于总面积满足交换律,因此这多阶段的滤波可在音效滤波运算w⋆fw \star{f}w⋆f中完成。
w=w1⋆w2⋆w3⋆…⋆wQ(3.38)w = w_1 \star{w_2} \star{w_3} \star{\ldots} \star{w_{Q}} \tag{3.38}w=w1⋆w2⋆w3⋆…⋆wQ(3.38)
可分离滤波器核
若二维函数G(x,y)G(x, y)G(x,y)可写成一维函数G1(x)G_{1}(x)G1(x)和G2(x)G_{2}(x)G2(x)的乘积,即G(x,y)=G1(x)G2(y)G(x, y) = G_{1}(x) G_{2}(y)G(x,y)=G1(x)G2(y),则它是可分离的。
空间滤波器核是一个矩阵,而可分离核是一个能够表示为两个向量的外积的矩阵。
w=[111111]=crT=[11][111]w = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} = cr^{T} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \end{bmatrix}w=[111111]=crT=[11][111]
大小为m×nm \times{n}m×n的可分离核表示为两个向量vvv和www的外积;
w=vwT(3.41)w = vw^T \tag{3.41}w=vwT(3.41)
vvv和www分别是是大小为m×1m\times{1}m×1和n×1n\times{1}n×1的向量
m×mm\times{m}m×m方形核可以写成
w=vvT(3.42)w = vv^T \tag{3.42}w=vvT(3.42)
如果一个核www,可分解为两个更简单的核并且满足w=w1⋆w2w = w_1 \star{w_2}w=w1⋆w2,则有
w⋆f=(w1⋆w2)⋆f=(w2⋆w1)⋆f=w2⋆(w1⋆f)=w1⋆(w2⋆f)(3.43)w \star{f}= (w_1 \star{w_2}) \star{f} = (w_2 \star{w_1}) \star{f} = w_2 \star{(w_1} \star{f}) = w_1 \star{(w_2} \star{f}) \tag{3.43}w⋆f=(w1⋆w2)⋆f=(w2⋆w1)⋆f=w2⋆(w1⋆f)=w1⋆(w2⋆f)(3.43)
对于大小为M×NM \times{N}M×N的图像和大小为m×nm \times{n}m×n的核,需要MNmnMNmnMNmn将乘法和加法运算。如果是可分离核,分别运算,则需要MN(m+n)MN(m+n)MN(m+n)次乘法和加法运算。因此使用可分离核执行卷积运算相对于不可分离核执行卷积的计算优势定义为:
C=MNmnMN(m+n)=mnm+n(3.44)C = \frac{MNmn}{MN(m + n)} = \frac{mn}{m+n} \tag{3.44}C=MN(m+n)MNmn=m+nmn(3.44)
怎样确定核是否可分离
- 只需要确定其秩是否为1
- 因为分离的向量的秩都是为1的
确定某个核的矩阵秩为1后,就很容易求出两个向量vvv和www,因为它们的外积vwTvw^TvwT等于这个核,求可分离核的步骤:
- 在核中找到任何一个非零元素,并将其值表示为EEE。
- 形成向量ccc和rrr,它们分别等于核中包含步骤1中找到的元素的那一行和那一列。
- 参考式(3.41),令v=cv = cv=c和wT=r/Ew^T = r/EwT=r/E。
我们的目的是找到两个一维核w1w_1w1和w2w_2w2,以便实现一维卷积。根据前面的表示方法,有w1=c=v,w2=r/E=wTw_1 = c = v, w_2 = r/E = w^Tw1=c=v,w2=r/E=wT。对于圆对称的核,通过核的中心的列描述整个核,即w=vvT/cw = vv^T /cw=vvT/c,其中ccc是中心系数的值,则一维分量是w1=vw_1 = vw1=v和w2=vT/cw_2 = v^T /cw2=vT/c。
空间域滤波和频率域滤波的一些重要比较
- 卷积是空间域滤波的基础,它等效于频度域的乘法,反之亦然。
- 空间域中振幅为AAA的冲激,是频率域中值为AAA的一个常数,反之亦然。
如何构建空间滤波器
-
根据其数学性质
-
计算邻域像素平均值的滤波器会模糊图像,类似于积分
-
计算图像局部导数滤波器会锐化图像。
-
对形状具有所需性质的二维空间函数进行取样
-
高斯函数 的样本可以构建加权平均(低通)滤波器
-
设计具有规定频率响应的空间滤波器
-
一维滤波器值可以表示为向量vvv,而二维可分离核可用式(3.42)得到
-
近似于圆对称函数的二维核
程序部分
程序里也许多注释,如有不明,也可以给我留言或者发信息
def kernel_seperate(kernel):"""get kernel seperate vector if separableparam: kernel: input kerel of the filterreturn two separable vector c, r"""rank = np.linalg.matrix_rank(kernel)if rank == 1:#-----------------Numpy------------------c = np.min(kernel, axis=1)r = np.min(kernel, axis=0)#------------------Loop------------------# r = w[0, :]# c = []# for i in range(w.shape[0]):# temp = int((w[i, :] / r)[0])# c.append(temp)else:print('Not separable')# c = np.array(c).reshape(-1, 1)
# r = np.array(r).reshape(-1, 1)return c, r
第一种卷积方法(公式法)
根据公式用循环写的,方法的命名为算术平均,实际上是基础的一个离散卷积运算。
这种方法就是效率比较低,但是大核也可以运算,不会爆内存
def arithmentic_mean(image, kernel):""":param image: input image:param kernel: input kernel:return: image after convolution"""img_h = image.shape[0]img_w = image.shape[1]m = kernel.shape[0]n = kernel.shape[1]# paddingpadding_h = int((m - 1)/2)padding_w = int((n - 1)/2)image_pad = np.pad(image.copy(), [(padding_h, padding_h), (padding_w, padding_w)], mode="constant", constant_values=0)image_convol = image.copy().astype(np.float)for i in range(padding_h, img_h + padding_h):for j in range(padding_w, img_w + padding_w):temp = np.sum(image_pad[i-padding_h:i+padding_h+1, j-padding_w:j+padding_w+1] * kernel)image_convol[i - padding_h][j - padding_w] = temp # 1/(m * n) * tempreturn image_convol, image_pad
第二种卷积的方法(可分离核)
如果核不可分离的话,不可以用此方法
此种方法效率很高,而且大的核也不会爆内存,就是需要核可分离
def separate_kernel_conv2D(img, kernel):"""separable kernel convolution 2D, if 2D kernel not separable, then canot use this fuction. in the fuction. first need toseperate the 2D kernel into two 1D vectors.param: img: input image you want to implement 2d convolution with 2d kernelparam: kernel: 2D separable kernel, such as Gaussian kernel, Box Kernelreturn image after 2D convolution"""m, n = kernel.shape[:2]pad_h = int((m - 1) / 2)pad_w = int((n - 1) / 2)w_1, w_2 = kernel_seperate(kernel)convovle = np.vectorize(np.convolve, signature='(x),(y)->(k)')r_2 = convovle(img, w_2)r_r = np.rot90(r_2)r_1 = convovle(r_r, w_1)r_1 = r_1.Tr_1 = r_1[:, ::-1]img_dst = img.copy().astype(np.float)img_dst = r_1[pad_h:-pad_h, pad_w:-pad_w]return img_dst
第三种方法(img2col)
此方效率虽然比第一种高,但比第二种低,但是需要大量的内存才能实现,尤其是大核,很容易爆内存错误
def img2col(image, kernel_shape, padding='same', mode='constant', strides=1):"""image to col fuctionparam: image: input image you want to change to colparam: kernel_shape: input shape of the kernel, normally is a tuple, like (3, 3)param: padding: string, valid values 'valid' and 'same', 'valid' then not padding, output shape will be smaller than inputimage, 'same' padding the input image, output image will be same shape as input imageparam: mode: padding mode, please refere to numpy.pad. valid string values are 'constant', 'edge', 'linear_ramp', 'maximum', 'mean', 'median', 'minimum', 'reflect', 'symmetric', 'wrap'param: strides: strides of the slidding window, default is 1"""height, width = image.shape[:2]kernel_h, kernel_w = kernel_shape[:2]# paddingif padding == 'same':'''padding here only consider convolution, not strides '''result_shape = np.array([height , width])pad_h = (kernel_h - 1) // 2pad_w = (kernel_w - 1) // 2image_new = np.pad(image, [(pad_h, pad_h), (pad_w, pad_w)], mode=mode)elif padding == 'valid':result_shape = np.array([(height - kernel_h + 1), (width - kernel_w + 1)])image_new = image image2col = np.zeros((kernel_h*kernel_w, (result_shape[0]//strides) * (result_shape[1]//strides)))count = 0for h in range(0, result_shape[0], strides):for w in range(0, result_shape[1], strides):cur_item = image_new[h:h+kernel_h, w:w+kernel_w].reshape(-1)image2col[:, count] = cur_item count += 1return image2col, result_shape // stridesdef my_conv2d(image, kernel, padding="same", mode='constant', strides=1):"""convolution implement using img2colparam: image: input image you want to change to implement convolutionparam: kernel: kernelparam: padding: string, valid values 'valid' and 'same', 'valid' then not padding, output shape will be smaller than inputimage, 'same' padding the input image, output image will be same shape as input imageparam: mode: padding mode, please refere to numpy.pad. valid string values are 'constant', 'edge', 'linear_ramp', 'maximum', 'mean', 'median', 'minimum', 'reflect', 'symmetric', 'wrap'param: strides: strides of the slidding window, default is 1"""image2col, result_shape = img2col(image, kernel.shape, padding=padding, mode=mode, strides=strides)kernel_flatten = kernel.reshape(1, -1)convolve = np.matmul(kernel_flatten, image2col)return convolve.reshape(result_shape[0], result_shape[1])
这是为了实现效果而写的方法
def visualize_input(img, ax):"""add annotation to the image, values of each pixelparam: img: input imageparam: ax: axes of the matplotlib"""ax.imshow(img, cmap='gray', vmin=0, vmax=255)width, height = img.shapethresh = 100#img.max()/2.5for x in range(width):for y in range(height):ax.annotate(str(round(img[x][y],2)), xy=(y,x),horizontalalignment='center',verticalalignment='center',color='white' if img[x][y]<thresh else 'black')def visualize_kernel(img, ax):"""add annotation to kernel w(0, 0)param: img: input imageparam: ax: axes of the matplotlib"""ax.imshow(img, cmap='gray', vmin=0, vmax=255), ax.set_xticks([]), ax.set_yticks([])width, height = img.shapethresh = img.max()/2.5for x in range(width):for y in range(height):ax.annotate('w'+ str((x - 1, y - 1)), xy=(y,x),horizontalalignment='center',verticalalignment='center',color='white' if img[x][y]<thresh else 'black')def visualize_image(img, ax):"""add annotation to kernel with f(x, y)param: img: input imageparam: ax: axes of the matplotlib"""ax.imshow(img, cmap='gray', vmin=0, vmax=255), ax.set_xticks([]), ax.set_yticks([])width, height = img.shapethresh = img.max()/2.5for x in range(width):for y in range(height):ax.annotate('f'+ str(('x' + str(x - 1), 'y' + str(y - 1))), xy=(y,x),horizontalalignment='center',verticalalignment='center',color='white' if img[x][y]<thresh else 'black')
# 线性空间滤波的原理图
height, width = 10, 10
img_ori = get_check(height, width, lower=230, upper=255)grid_size = (3, 3)
m = grid_size[0]
n = grid_size[1]img_ori[height//2-m:height//2, width//2-n:width//2] = img_ori[height//2-m:height//2, width//2-n:width//2] - 100
img_dst = img_ori[height//2-m:height//2, width//2-n:width//2]fig = plt.figure(figsize=(15, 6))plt.subplot(1, 3, 1), plt.imshow(img_ori, cmap='gray', vmin=0, vmax=255), plt.title(f'Convolution')
plt.xticks([]), plt.yticks([])ax2 = fig.add_subplot(1, 3, 2)
ax2.set_title('Kernel'), visualize_kernel(img_dst, ax2)ax3 = fig.add_subplot(1, 3, 3)
ax3.set_title('Image'), visualize_image(img_dst, ax3)plt.tight_layout()
plt.show()
def visualize_show_annot(img_show, img_annot, ax):"""add annotation to the image, values of each pixelparam: img: input imageparam: ax: axes of the matplotlib"""height, width = img_annot.shapeimg_show = img_show[:height, :width]ax.imshow(img_show, cmap='gray', vmin=0, vmax=255)thresh = 100 #img_show.max()/2.5for x in range(height):for y in range(width):ax.annotate(str(round(img_annot[x][y],2)), xy=(y,x),horizontalalignment='center',verticalalignment='center',color='white' if img_annot[x][y]<thresh else 'black')
# 线性空间滤波的原理图,为了好看,原图像显示为灰色
height, width = 5, 5
img_show = np.ones([height, width], dtype=np.uint8) * 180
img_ori = np.zeros([height, width], dtype=np.uint8)
img_ori[2, 2] = 1kernel = np.arange(1, 10).reshape(3, 3)
img_dst, img_pad = arithmentic_mean(img_ori, kernel)fig = plt.figure(figsize=(10, 10))
# fig.subplots_adjust(left=None, bottom=None, right=None, top=None,wspace=0.1, hspace=0.1)
ax1 = fig.add_subplot(2, 2, 1)
ax1.set_title('Original f'), visualize_show_annot(img_show, img_ori, ax1), plt.xticks([]), plt.yticks([])height, width = 7, 7
img_show = np.ones([height, width], dtype=np.uint8) * 180
ax2 = fig.add_subplot(2, 2, 2)
ax2.set_title('Padded f'), visualize_show_annot(img_show, img_pad, ax2), plt.xticks([]), plt.yticks([])ax3 = fig.add_subplot(2, 2, 3)
ax3.set_title('Kernel'), visualize_show_annot(img_show, kernel, ax3), plt.xticks([]), plt.yticks([])ax4 = fig.add_subplot(2, 2, 4)
ax4.set_title('Result'), visualize_show_annot(img_show, img_dst, ax4), plt.xticks([]), plt.yticks([])plt.tight_layout()
plt.show()
# 旋转kernel得到的结果
height, width = 5, 5
img_show = np.ones([height, width], dtype=np.uint8) * 190
img_ori = np.zeros([height, width], dtype=np.uint8)
img_ori[2, 2] = 1kernel = np.arange(1, 10)[::-1].reshape(3, 3)
img_dst, img_pad = arithmentic_mean(img_ori, kernel)fig = plt.figure(figsize=(10, 10))
# fig.subplots_adjust(left=None, bottom=None, right=None, top=None,wspace=0.1, hspace=0.1)
ax1 = fig.add_subplot(2, 2, 1)
ax1.set_title('Original f'), visualize_show_annot(img_show, img_ori, ax1), plt.xticks([]), plt.yticks([])height, width = 7, 7
img_show = np.ones([height, width], dtype=np.uint8) * 180
ax2 = fig.add_subplot(2, 2, 2)
ax2.set_title('Padded f'), visualize_show_annot(img_show, img_pad, ax2), plt.xticks([]), plt.yticks([])ax3 = fig.add_subplot(2, 2, 3)
ax3.set_title('Kernel'), visualize_show_annot(img_show, kernel, ax3), plt.xticks([]), plt.yticks([])ax4 = fig.add_subplot(2, 2, 4)
ax4.set_title('Result'), visualize_show_annot(img_show, img_dst, ax4), plt.xticks([]), plt.yticks([])plt.tight_layout()
plt.show()
这是分离核与公式法的结果对比
# 可分离卷积核与全核卷积结果一样(上面的结果)
height, width = 5, 5
img_show = np.ones([height, width], dtype=np.uint8) * 180
img_ori = np.zeros([height, width], dtype=np.uint8)
img_ori[1:4, 2] = 1# Separate kernel
w = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])
w_1, w_2 = kernel_seperate(w)
w_1 = w_1.reshape(-1, 1)
w_2 = w_2.reshape(-1, 1)
img_w_1, img_pad_1 = arithmentic_mean(img_ori, w_1)
img_w_2, img_pad_2 = arithmentic_mean(img_w_1, w_2.T)fig = plt.figure(figsize=(15, 10))
ax1 = fig.add_subplot(2, 3, 1)
ax1.set_title('Original f'), visualize_show_annot(img_show, img_ori, ax1), plt.xticks([]), plt.yticks([])height, width = 7, 7
img_show = np.ones([height, width], dtype=np.uint8) * 180
ax2 = fig.add_subplot(2, 3, 2)
ax2.set_title('img_pad_1'), visualize_show_annot(img_show, img_pad_1, ax2), plt.xticks([]), plt.yticks([])ax3 = fig.add_subplot(2, 3, 3)
ax3.set_title('img_pad_2'), visualize_show_annot(img_show, img_pad_2, ax3), plt.xticks([]), plt.yticks([])ax4 = fig.add_subplot(2, 3, 4)
ax4.set_title('img_w_1'), visualize_show_annot(img_show, img_w_1, ax4), plt.xticks([]), plt.yticks([])ax5 = fig.add_subplot(2, 3, 5)
ax5.set_title('img_w_2'), visualize_show_annot(img_show, img_w_2, ax5), plt.xticks([]), plt.yticks([])plt.tight_layout()
plt.show()
# 可分离卷积核与全核卷积结果一样(上面的结果)
# import time
height, width = 5, 5
# img_show = np.ones([height, width], dtype=np.uint8) * 180
# img_ori = np.zeros([height, width], dtype=np.uint8)
# img_ori[1:4, 2] = 1
img_ori = np.arange(1, height*width+1).reshape(height, width)# Separate kernel
w = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])
w_1, w_2 = kernel_seperate(w)
w_1 = w_1.reshape(-1, 1)
w_2 = w_2.reshape(-1, 1)
img_w_1, img_pad_1 = arithmentic_mean(img_ori, w_1)
img_w_2, img_pad_2 = arithmentic_mean(img_w_1, w_2.T)fig = plt.figure(figsize=(15, 10))
ax1 = fig.add_subplot(2, 3, 1)
ax1.set_title('Original f'), visualize_show_annot(img_show, img_ori, ax1), plt.xticks([]), plt.yticks([])height, width = 7, 7
img_show = np.ones([height, width], dtype=np.uint8) * 180
ax2 = fig.add_subplot(2, 3, 2)
ax2.set_title('img_pad_1'), visualize_show_annot(img_show, img_pad_1, ax2), plt.xticks([]), plt.yticks([])ax3 = fig.add_subplot(2, 3, 3)
ax3.set_title('img_pad_2'), visualize_show_annot(img_show, img_pad_2, ax3), plt.xticks([]), plt.yticks([])ax4 = fig.add_subplot(2, 3, 4)
ax4.set_title('img_w_1'), visualize_show_annot(img_show, img_w_1, ax4), plt.xticks([]), plt.yticks([])ax5 = fig.add_subplot(2, 3, 5)
ax5.set_title('img_w_2'), visualize_show_annot(img_show, img_w_2, ax5), plt.xticks([]), plt.yticks([])img_img2col = my_conv2d(img_ori, w, padding='same', mode='constant').astype(int)
ax6 = fig.add_subplot(2, 3, 6)
ax6.set_title('img_img2col'), visualize_show_annot(img_show, img_img2col, ax6), plt.xticks([]), plt.yticks([])plt.tight_layout()
plt.show()
下面是各种卷积的运算时间的比较
可以看出可分离核,11x11大小的核,numpy的运算效率也是很高的。
# 全核卷积与分离核单独做卷积,没有优化,所以分离核时间比是全核卷积的一倍
# 这里只做展示,结果不重要
import timeheight, width = 512, 512
img_show = np.ones([height, width], dtype=np.uint8) * 180
img_ori = np.zeros([height, width], dtype=np.uint8)
img_ori[1:4, 2] = 1# Kernel
# w = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])
w = np.ones([11, 11])# Loop convolution comsum more time, but not need very big memory
start_time = time.time()
img_dst = arithmentic_mean(img_ori, w)
elapse = time.time() - start_time
print(f'Full kernel covolution elapse -> {elapse}s')# img2col, bit faster than loop, but need very more memory, if kernel is too big, cannot carry on
start_time = time.time()
img_dst = my_conv2d(img_ori, w)
elapse = time.time() - start_time
print(f'Img2col covolution elapse -> {elapse}s')# Separable Kernel convolution optimized by Numpy, run very fast, but the kernel has to be separable
start_time = time.time()
img_dst = separate_kernel_conv2D(img_ori, w)
elapse = time.time() - start_time
print(f'Seperate kernel covolution optimized elapse -> {elapse}s')# Separable kernel not optimized
start_time = time.time()
w_1, w_2 = kernel_seperate(w)
w_1 = w_1.reshape(-1, 1)
w_2 = w_2.reshape(-1, 1)
img_w_1, img_pad_1 = arithmentic_mean(img_ori, w_1)
img_w_2, img_pad_2 = arithmentic_mean(img_w_1, w_2.T)
elapse = time.time() - start_time
print(f'separate kernel covolution elapse -> {elapse}s')# opencv run very fast, but not actual convolution
start_time = time.time()
img_dst = cv2.filter2D(img_ori, -1, w)
elapse = time.time() - start_time
print(f'Opencv elapse -> {elapse}s')
Full kernel covolution elapse -> 1.2127759456634521s
Img2col covolution elapse -> 0.584395170211792s
Seperate kernel covolution optimized elapse -> 0.009973526000976562s
separate kernel covolution elapse -> 2.176215887069702s
Opencv elapse -> 0.004025936126708984s
在接下来的文章也会有这几种卷积方法效果的对比
书上的一些习题
# 习题3.20
w = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])
print(w)
print()
rank = np.linalg.matrix_rank(w)
print(f"Kernel's rank is -> {rank}\n")
c, r = kernel_seperate(w)
print('Two vector are below:')
print(f'vector c -> {c}, vector r -> {r}\n')
print(f'cv.T -> \n{c.reshape(-1, 1) * r.reshape(-1, 1).T}')
[[1 2 1][2 4 2][1 2 1]]Kernel's rank is -> 1Two vector are below:
vector c -> [1 2 1], vector r -> [1 2 1]cv.T ->
[[1 2 1][2 4 2][1 2 1]]
# 习题3.20
height, width = 5, 5
img_show = np.ones([height, width], dtype=np.uint8) * 180
img_ori = np.zeros([height, width], dtype=np.uint8)
img_ori[1:4, 2] = 1w = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])
kernel = w
img_dst, img_pad = arithmentic_mean(img_ori, kernel)fig = plt.figure(figsize=(10, 10))
# fig.subplots_adjust(left=None, bottom=None, right=None, top=None,wspace=0.1, hspace=0.1)
ax1 = fig.add_subplot(2, 2, 1)
ax1.set_title('Original f'), visualize_show_annot(img_show, img_ori, ax1), plt.xticks([]), plt.yticks([])height, width = 7, 7
img_show = np.ones([height, width], dtype=np.uint8) * 180
ax2 = fig.add_subplot(2, 2, 2)
ax2.set_title('Padded f'), visualize_show_annot(img_show, img_pad, ax2), plt.xticks([]), plt.yticks([])ax3 = fig.add_subplot(2, 2, 3)
ax3.set_title('Kernel'), visualize_show_annot(img_show, kernel, ax3), plt.xticks([]), plt.yticks([])ax4 = fig.add_subplot(2, 2, 4)
ax4.set_title('Result'), visualize_show_annot(img_show, img_dst, ax4), plt.xticks([]), plt.yticks([])plt.tight_layout()
plt.show()
# 习题3.22
# (a)
v = np.array([[1], [2], [1]])
w_T = np.array([2, 1, 1, 3])
w = v * w_T.T
print(w)
rank = np.linalg.matrix_rank(w)
print(f"Kernel's rank is -> {rank}")
# rank = 1,所以是可分离的
[[2 1 1 3][4 2 2 6][2 1 1 3]]
Kernel's rank is -> 1
# 习题3.22
# (b)
w = np.array([[1, 3, 1], [2, 6, 2]])
print(w)
rank = np.linalg.matrix_rank(w)
print(f"Kernel's rank is -> {rank}")
w_1, w_2 = kernel_seperate(w)
w_r = w_1.reshape(-1, 1) * w_2.reshape(-1, 1).T
print(w == w_r)
[[1 3 1][2 6 2]]
Kernel's rank is -> 1
[[ True True True][ True True True]]
# 习题3.22
# (c)
w = np.array([[2, 1, 1, 3],[4, 2, 2, 6],[2, 1, 1, 3]])
print(w)
rank = np.linalg.matrix_rank(w)
print(f"Kernel's rank is -> {rank}")
w_1, w_2 = kernel_seperate(w)
w_1 = w_1.reshape(-1, 1)
w_2 = w_2.reshape(-1, 1)
w_r = w_1 * w_2.T
print(w == w_r)
[[2 1 1 3][4 2 2 6][2 1 1 3]]
Kernel's rank is -> 1
[[ True True True True][ True True True True][ True True True True]]