A summary of problems encountered and how I solved them.
Code: GitHub - baaivision/vid2vid-zero: Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models
1. AttributeError: 'UNet2DConditionModel' object has no attribute 'encoder'
Apparently the pretrained model's structure doesn't match what the code expects. I had lazily reused the sd-v1-5 checkpoint from my animatediff setup, and sure enough that didn't work, so off to download sd-v1-4 properly.
URL: https://huggingface.co/CompVis/stable-diffusion-v1-4/tree/main
A long download × N.
2. HFValidationError

```
File "/opt/conda/envs/vid2vid/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 158, in validate_repo_id
    raise HFValidationError(
huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/data/vid2vid-zero/checkpoints/stable-diffusion-v1-4'. Use `repo_type` argument if needed.
```
At first I assumed the file path was written wrong, but after checking it several times I could rule that out, so I looked up the file named in the traceback:
File "/opt/conda/envs/vid2vid/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 158, in validate_repo_id
Line 158 of _validators.py reads:
```python
if repo_id.count("/") > 1:
    raise HFValidationError(
        "Repo id must be in the form 'repo_name' or 'namespace/repo_name':"
        f" '{repo_id}'. Use `repo_type` argument if needed."
    )
```
In other words, anything entered in the "Path to off-the-shelf model" box that contains more than one "/" triggers this error.
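That slash rule can be reproduced with a tiny check (a sketch mirroring only the quoted condition, not the real huggingface_hub validator, which also checks length and characters):

```python
def looks_like_repo_id(repo_id: str) -> bool:
    # Mirrors the quoted check in validate_repo_id: a repo id may
    # contain at most one '/', so absolute filesystem paths are rejected.
    return repo_id.count("/") <= 1

print(looks_like_repo_id("CompVis/stable-diffusion-v1-4"))                         # True
print(looks_like_repo_id("/data/vid2vid-zero/checkpoints/stable-diffusion-v1-4"))  # False
```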
Also, every time I click Start, the console prints:
https://huggingface.co/xxx
where xxx is whatever was entered in "Path to off-the-shelf model"; in this case that becomes
https://huggingface.co//data/vid2vid-zero/checkpoints/stable-diffusion-v1-4
which is clearly a broken URL. The code expects the input in a form that resolves to a model on the Hugging Face Hub; the example input CompVis/stable-diffusion-v1-4 links to the online model:
https://huggingface.co/CompVis/stable-diffusion-v1-4
The code has to be changed so the program looks for a local model first instead of going straight to the Hugging Face Hub (which fails here with ConnectTimeoutError).
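The local-first idea could be sketched as a small helper (hypothetical, not the repo's actual code; it assumes checkpoints are laid out under a `checkpoints/` directory mirroring the repo id):

```python
import pathlib

def resolve_model(base_model_id: str, checkpoint_dir: str = "checkpoints") -> str:
    # Prefer a local copy under checkpoint_dir; fall back to the
    # Hub repo id only if nothing is found on disk.
    local_dir = pathlib.Path(checkpoint_dir) / base_model_id
    if local_dir.exists():
        # from_pretrained() accepts a local directory path directly,
        # so no network access is needed in this branch.
        return local_dir.as_posix()
    return base_model_id  # 'namespace/repo_name' for an online fetch

print(resolve_model("CompVis/stable-diffusion-v1-4"))
```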
After some hacky edits, the download_base_model method in runner.py:
```python
def download_base_model(self, base_model_id: str, token=None) -> str:
    # Path where the model files should live
    model_dir = self.checkpoint_dir / base_model_id
    org_name = base_model_id.split('/')[0]
    org_dir = self.checkpoint_dir / org_name
    # If the model files don't exist yet, create an empty org directory
    if not model_dir.exists():
        org_dir.mkdir(exist_ok=True)
    # Print the model's link on the Hugging Face Hub
    print(f'https://huggingface.co/{base_model_id}')
    print(token)
    print(org_dir)
    # Without a token, clone the model via Git Large File Storage (LFS)
    if token is None:
        subprocess.run(shlex.split('git lfs install'), cwd=org_dir)
        subprocess.run(shlex.split(f'git lfs clone https://huggingface.co/{base_model_id}'),
                       cwd=org_dir)
        return model_dir.as_posix()
    # Otherwise, download a model snapshot from the Hugging Face Hub
    # to a temporary path and return that path
    else:
        temp_path = huggingface_hub.snapshot_download(base_model_id, use_auth_token=token)
        print(temp_path, org_dir)
        # Move the model files from the temporary path to the target path
        # subprocess.run(shlex.split(f'mv {temp_path} {model_dir.as_posix()}'))
        # return model_dir.as_posix()
        return temp_path
```
was changed to:
```python
class Runner:
    def __init__(self, hf_token: str | None = None):
        self.hf_token = hf_token
        self.checkpoint_dir = pathlib.Path('checkpoints')
        self.checkpoint_dir.mkdir(exist_ok=True)

    def download_base_model(self, base_model_id: str, token=None) -> str:
        model_dir = self.checkpoint_dir / base_model_id
        org_name = base_model_id.split('/')[0]
        org_dir = self.checkpoint_dir / org_name
        if not model_dir.exists():
            org_dir.mkdir(exist_ok=True)
        # Load the local model files instead of going to the Hub
        local_model_path = '/data/vid2vid-zero/checkpoints/stable-diffusion-v1-4'
        return local_model_path
```
app.py also needs a small change.
With that, the problem is solved, at least for now.
3. Missing xformers
Attempt 1: install from the official repo, GitHub - facebookresearch/xformers: Hackable and optimized Transformers building blocks, supporting a composable construction.
conda install xformers -c xformers
Failure: by default this pulls the latest xformers==0.0.23, which requires PyTorch 2.1.1, while my setup is:
Environment: pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.6
so it threw a pile of incompatibility errors.
Attempt 2: followed the tutorial "Linux安装xFormers教程" on CSDN.
Solved!
There is actually one more wrinkle: the minimum GPU compute capability xformers supports is (7, 0), while my GPU is (6, 1). I'm not sure what problems this will cause, but nothing has errored so far, so I'm leaving it alone for now.
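The comparison is just element-wise tuple ordering; in practice the values would come from torch.cuda.get_device_capability() (a minimal sketch of the check, with the (7, 0) minimum taken from above):

```python
def meets_xformers_minimum(capability: tuple, minimum: tuple = (7, 0)) -> bool:
    # Python compares tuples element-wise, so (6, 1) < (7, 0)
    # even though 1 > 0 in the second position.
    return capability >= minimum

print(meets_xformers_minimum((7, 5)))  # True
print(meets_xformers_minimum((6, 1)))  # False
```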
4. OSError: Unable to load weights from checkpoint file
OSError: Unable to load weights from checkpoint file for '/data/vid2vid-zero/checkpoints/stable-diffusion-v1-4/unet/diffusion_pytorch_model.bin' at '/data/vid2vid-zero/checkpoints/stable-diffusion-v1-4/unet/diffusion_pytorch_model.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
Cause: the file can't be found, or it is corrupted.
A quick look showed the connection to the server had dropped partway through the copy, so only half of the .bin file had been uploaded. Deleting it and re-uploading fixed it.
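To catch a truncated transfer before the loader chokes on it, one option is to compare a checksum (or just the byte size) of the uploaded file against the source copy (a generic sketch, not part of the repo's code):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    # Hash a large .bin file in 1 MiB chunks; a half-uploaded file
    # yields a different digest (and size) than the source file.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare against the digest computed on the machine the file came from:
# print(sha256_of("checkpoints/stable-diffusion-v1-4/unet/diffusion_pytorch_model.bin"))
```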
Reference: 解决huggingface中模型无法自动下载或者下载过慢的问题 (pytorch, COHREZ, Huawei Cloud Developer Community)