Washed Clean, No Makeup Needed: Removing Image and Video Watermarks with AI Using ProPainter (Python 3.10)

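Video and image inpainting is a challenging AI vision task: it fills in missing or damaged regions of a video or image sequence while keeping the result spatially and temporally coherent. The technique is widely used for video completion, object removal, and video restoration. In recent years, two approaches have stood out in video inpainting: flow-based propagation and spatiotemporal Transformers. Both work reasonably well, but each has limitations, such as spatial misalignment, a limited temporal range, and high computational cost.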

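Put plainly, you can use AI to remove a watermark or repair an unclear video, but the result may not be coherent, and viewers can tell at a glance that part of the video or image is still missing. On top of that, the compute cost is more than most people can afford.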

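This time we use the ProPainter framework to remove video watermarks. It introduces a new method called dual-domain propagation, together with an efficient mask-guided video Transformer. Combined, these components improve inpainting quality while keeping computation efficient and costs lower, so ordinary users can also handle complex watermark removal. As the old line goes: a lotus rising from clear water, naturally free of adornment.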

Installing and configuring ProPainter

As usual, start by cloning the project:

git clone https://github.com/sczhou/ProPainter.git

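The project depends on CUDA; make sure the CUDA version in your local environment is 9.2 or newer.

Run the following command to check the local CUDA version: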

nvcc --version

Output:

PS C:\Users\zcxey> nvcc --version  
nvcc: NVIDIA (R) Cuda compiler driver  
Copyright (c) 2005-2022 NVIDIA Corporation  
Built on Tue_Mar__8_18:36:24_Pacific_Standard_Time_2022  
Cuda compilation tools, release 11.6, V11.6.124  
Build cuda_11.6.r11.6/compiler.31057947_0

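At the time of writing, my version is 11.6. For setting up CUDA and cuDNN on your machine, please see my earlier article "Nice Voice, Good Looks: Giving an AI Voice Model Moving Pictures with PaddleGAN (Python 3.10)". To save space, I won't repeat it here.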
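The version check above can also be scripted. Here is a minimal sketch, assuming nvcc is on the PATH and prints a "release X.Y" fragment as in the output shown earlier. The helper names parse_cuda_release, meets_minimum, and local_cuda_release are made up for illustration, and the 9.2 threshold is simply the requirement mentioned above.

```python
import re
import shutil
import subprocess


def parse_cuda_release(nvcc_output):
    """Return (major, minor) parsed from `nvcc --version` text, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_minimum(release, minimum=(9, 2)):
    """True if a parsed release tuple is at least `minimum`."""
    return release is not None and release >= minimum


def local_cuda_release():
    """Run `nvcc --version` if nvcc is installed; return None otherwise."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
    return parse_cuda_release(result.stdout)


if __name__ == "__main__":
    release = local_cuda_release()
    status = "OK" if meets_minimum(release) else "missing or too old"
    print("CUDA release:", release, status)
```

Comparing tuples rather than strings avoids the classic trap where "11.6" sorts before "9.2" alphabetically.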

Then enter the project directory:

cd ProPainter

Install the dependencies:

pip3 install -r requirements.txt

Next, download ProPainter's pretrained models: https://github.com/sczhou/ProPainter/releases/tag/v0.1.0

Put them in the project's weights directory. With the models in place, the directory looks like this:

weights
  |- ProPainter.pth
  |- recurrent_flow_completion.pth
  |- raft-things.pth
  |- i3d_rgb_imagenet.pt (for evaluating VFID metric)
  |- README.md

With that, ProPainter is set up.
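Before running anything, it can be worth confirming that the weight files actually landed where the listing says. The following is a rough sketch, assuming it is run from the project root. The file names come from the listing above, while the helper missing_weights is hypothetical and not part of ProPainter. I treat i3d_rgb_imagenet.pt as optional because the listing marks it as only needed for the VFID metric.

```python
from pathlib import Path

# File names taken from the weights listing above; the VFID-only file
# i3d_rgb_imagenet.pt is deliberately left out as optional.
REQUIRED_WEIGHTS = (
    "ProPainter.pth",
    "recurrent_flow_completion.pth",
    "raft-things.pth",
)


def missing_weights(weights_dir="weights"):
    """Return the required weight files that are absent from weights_dir."""
    root = Path(weights_dir)
    return [name for name in REQUIRED_WEIGHTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_weights()
    print("All weights found." if not missing else f"Missing: {missing}")
```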

Object removal

ProPainter thoughtfully ships a few examples with the project, so we can run the command straight from the project root:

python3 inference_propainter.py

Program output:

E:\work\ProPainter>python inference_propainter.py  
Pretrained flow completion model has loaded...  
Pretrained ProPainter has loaded...  
Network [InpaintGenerator] was created. Total number of parameters: 39.4 million. To see the architecture, do print(network).
Processing: bmx-trees [80 frames]...
100%|██████████████████████████████████████████████████████████████████████████████████| 16/16 [00:10<00:00,  1.52it/s]
All results are saved in results\bmx-trees

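ProPainter automatically demonstrates object removal on an 80-frame video and writes the output to the project's results folder.

As you can see, the script removed both the child riding the bicycle and the bicycle itself.

In practice, you put the mask of the object to remove and the original frames into the project's inputs folder, and the pretrained model then removes the object and fills in the region according to the mask.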

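Generating a mask

To prevent abuse, the project author removed the watermark example, so let's walk through watermark removal ourselves. To start, I have a watermarked video frame or image.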

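As you can see, the watermark is huge and covers part of the sofa, the table, and the bed in the original picture. The first step is therefore to generate a mask of the watermark, so the program can easily identify its outline.

First install the OpenCV library: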

pip3 install opencv-python

Then write some code to extract the logo and produce the mask:

import cv2

# Load the watermarked frame and the standalone logo image.
room = cv2.imread('D:/Downloads/room.png')
logo = cv2.imread('D:/Downloads/logo.png')
if room is None or logo is None:
    raise FileNotFoundError('Could not read room.png or logo.png')

# Resize the logo to the shape of the room image.
logo = cv2.resize(logo, (room.shape[1], room.shape[0]))

# Apply an Otsu threshold to the blue channel of the logo image.
ret, logo_mask = cv2.threshold(logo[:, :, 0], 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

# Preview the mask, then save it once a key is pressed.
cv2.imshow('logo_mask', logo_mask)
cv2.waitKey()
cv2.imwrite('D:/Downloads/logo_mask.png', logo_mask)

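Running the script displays the resulting mask and saves it as logo_mask.png.

Of course, if you would rather not do this in code, Photoshop works too: select the content, invert the selection, fill with black, invert the selection again, and fill with white.

The final result is the same as the OpenCV output.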

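Removing the watermark

We now have the original frame and the watermark mask. Create a test directory under the project's inputs directory, then create img and mask directories inside it, and put the original frames and the watermark mask into them respectively: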

├─inputs  
│  ├─test  
│  │  ├─img  
│  │  └─mask

Note that because the project is built for video, it needs at least two frames; with only one frame, it raises an error.
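For a single still image, one way to meet the two-frame minimum is to duplicate the image. Below is a minimal sketch of that idea using only the standard library. The helper prepare_inputs and the zero-padded file names are my own assumptions, not something ProPainter prescribes, and the mask is copied once per frame just to be safe, since this article does not establish whether a single mask would be enough.

```python
import shutil
from pathlib import Path


def prepare_inputs(image_path, mask_path, root="inputs/test", copies=2):
    """Copy one image and its mask `copies` times into root/img and root/mask."""
    if copies < 2:
        raise ValueError("at least two frames are needed")
    image_path, mask_path = Path(image_path), Path(mask_path)
    img_dir = Path(root) / "img"
    mask_dir = Path(root) / "mask"
    img_dir.mkdir(parents=True, exist_ok=True)
    mask_dir.mkdir(parents=True, exist_ok=True)
    for i in range(copies):
        # Zero-padded names keep the frames sorted (the naming is an assumption).
        shutil.copyfile(image_path, img_dir / f"{i:05d}{image_path.suffix}")
        shutil.copyfile(mask_path, mask_dir / f"{i:05d}{mask_path.suffix}")
    return img_dir, mask_dir
```

Calling prepare_inputs('D:/Downloads/room.png', 'D:/Downloads/logo_mask.png') from the project root would fill inputs/test/img and inputs/test/mask with two identical frames each.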

Run the command:

python3 inference_propainter.py --video inputs/test/img --mask inputs/test/mask

The program returns:

E:\work\ProPainter>python inference_propainter.py --video inputs/test/img --mask inputs/test/mask  
Pretrained flow completion model has loaded...  
Pretrained ProPainter has loaded...  
Network [InpaintGenerator] was created. Total number of parameters: 39.4 million. To see the architecture, do print(network).
Processing: img [2 frames]...
100%|████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:54<00:00, 54.30s/it]  
IMAGEIO FFMPEG_WRITER WARNING: input image is not divisible by macro_block_size=16, resizing from (1227, 697) to (1232, 704) to ensure video compatibility with most codecs and players. To prevent resizing, make your input image divisible by the macro_block_size or set the macro_block_size to 1 (risking incompatibility).  
[swscaler @ 0000025d0a1b5900] Warning: data is not aligned! This can lead to a speed loss  
IMAGEIO FFMPEG_WRITER WARNING: input image is not divisible by macro_block_size=16, resizing from (1227, 697) to (1232, 704) to ensure video compatibility with most codecs and players. To prevent resizing, make your input image divisible by the macro_block_size or set the macro_block_size to 1 (risking incompatibility).  
[swscaler @ 000001b30eb858c0] Warning: data is not aligned! This can lead to a speed loss
All results are saved in results\img

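As you can see, the program wrote the two processed frames to the project's results/img directory, with the watermark removed.

The removal result is quite impressive.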
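A side note on the IMAGEIO FFMPEG_WRITER warning in the log: the sizes it reports, (1227, 697) becoming (1232, 704), are consistent with rounding each dimension up to the next multiple of macro_block_size=16. The small sketch below reproduces just that arithmetic. The function names are hypothetical, and this is my reading of the log message rather than imageio's actual code.

```python
def round_up_to_multiple(value, block=16):
    """Return the smallest multiple of `block` that is >= value."""
    return -(-value // block) * block


def padded_size(width, height, block=16):
    """Round both dimensions up to a multiple of `block`."""
    return round_up_to_multiple(width, block), round_up_to_multiple(height, block)


print(padded_size(1227, 697))  # (1232, 704), the sizes reported in the warning
```

If the input dimensions are already multiples of 16, the function returns them unchanged, which matches the warning's advice about making the input divisible by the macro block size.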

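Of course, we only processed two frames here. A video of around 10 minutes usually needs a lot of GPU memory. The following options can help with local out-of-memory errors: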

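Decrease --neighbor_length (default 10) to reduce the number of local neighbor frames.
Increase --ref_stride (default 10) to reduce the number of global reference frames.
Set --resize_ratio (default 1.0) to scale down the video being processed.
Specify --width and --height to set a smaller video size.
Set --fp16 to use fp16 (half precision) during inference.
Decrease the sub-video frame count --subvideo_length (default 80), which effectively decouples GPU memory cost from video length.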
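To show how these options combine on one command line, here is a small sketch that assembles the invocation as an argument list. The flag names are the ones listed above; the helper build_command is hypothetical, and the values 5, 20, and 40 are only illustrative and would need tuning for your own GPU.

```python
import shlex


def build_command(video, mask, neighbor_length=None, ref_stride=None,
                  resize_ratio=None, width=None, height=None,
                  subvideo_length=None, fp16=False):
    """Assemble the inference command, adding only the options that were set."""
    cmd = ["python3", "inference_propainter.py", "--video", video, "--mask", mask]
    options = {
        "--neighbor_length": neighbor_length,
        "--ref_stride": ref_stride,
        "--resize_ratio": resize_ratio,
        "--width": width,
        "--height": height,
        "--subvideo_length": subvideo_length,
    }
    for flag, value in options.items():
        if value is not None:
            cmd += [flag, str(value)]
    if fp16:
        cmd.append("--fp16")  # boolean switch, takes no value
    return cmd


# Illustrative values only: fewer local frames, sparser references,
# shorter sub-videos, and half precision.
cmd = build_command("inputs/test/img", "inputs/test/mask",
                    neighbor_length=5, ref_stride=20,
                    subvideo_length=40, fp16=True)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd) from the project root, which avoids shell quoting problems with paths that contain spaces.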