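Bert-vits2 v2.2 was recently released on schedule. The main change in v2.2 is that the emotion model has been replaced with the CLAP multimodal model: inference now accepts a text prompt and an audio prompt to guide stylized synthesis, giving the synthesized voice more emotional character. The release also ships a new preprocessing WebUI that makes the workflow more approachable.
For more details, see the Bert-vits2 release page: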
https://github.com/fishaudio/Bert-VITS2/releases/tag/v2.2
Meanwhile, the FastAPI-based inference web UI project has also been adapted to Bert-vits2 v2.2. Its repository is:
https://github.com/jiangyuxiaoxiao/Bert-VITS2-UI
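In this post we use these two projects to clone miko, an English voice model of the Genshin Impact character Yae Miko.
New base model and emotion model for Bert-vits2-v2.2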
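First, clone the official Bert-vits2 project at the v2.2 tag:
git clone -b v2.2 https://github.com/fishaudio/Bert-VITS2.git
Note that this checks out the v2.2 tag rather than the main branch: the official repository is updated constantly, so the main branch may contain bugs.
Enter the project directory:
cd Bert-VITS2
Install the dependencies:
pip3 install -r requirements.txt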
Then download the new base model and emotion model from:
https://openi.pcl.ac.cn/Stardust_minus/Bert-VITS2/modelmanage/show_model
Place the new emotion model, clap-htsat-fused, in the project's emotional directory. The structure looks like this:
E:\work\Bert-VITS2-v22\emotional>tree /f
Folder PATH listing for volume myssd
Volume serial number is 7CE3-15AE
E:.
├───clap-htsat-fused
│       .gitattributes
│       config.json
│       merges.txt
│       preprocessor_config.json
│       pytorch_model.bin
│       README.md
│       special_tokens_map.json
│       tokenizer.json
│       tokenizer_config.json
│       vocab.json
│
└───wav2vec2-large-robust-12-ft-emotion-msp-dim
        .gitattributes
        config.json
        LICENSE
        preprocessor_config.json
        pytorch_model.bin
        README.md
        vocab.json
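Note that wav2vec2-large-robust-12-ft-emotion-msp-dim is the emotion model from Bert-vits2-v2.1 and must be kept as well. For details, see the earlier post on cloning Ma Dugong's voice with Bert-vits2 V210 (Python 3.10); they are not repeated here.
At this point the new models are in place.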
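Before moving on, it can help to confirm that both model folders are complete. The following is a minimal sketch, not part of the official project: the function name missing_model_files is hypothetical, and the choice of which files count as required is an assumption based on the directory listing above.

```python
from pathlib import Path

# Assumption: these files, taken from the directory listing above,
# are the ones each model folder needs. Adjust the lists if your download differs.
EXPECTED = {
    "clap-htsat-fused": [
        "config.json",
        "preprocessor_config.json",
        "pytorch_model.bin",
    ],
    "wav2vec2-large-robust-12-ft-emotion-msp-dim": [
        "config.json",
        "preprocessor_config.json",
        "pytorch_model.bin",
    ],
}


def missing_model_files(emotional_dir):
    """Return 'model/file' entries that do not exist under emotional_dir."""
    root = Path(emotional_dir)
    missing = []
    for model, files in EXPECTED.items():
        for name in files:
            if not (root / model / name).is_file():
                missing.append(f"{model}/{name}")
    return missing


if __name__ == "__main__":
    # Run from the project root, where the emotional directory lives.
    problems = missing_model_files("emotional")
    if problems:
        print("missing: " + ", ".join(problems))
    else:
        print("all expected model files are present")
```

An empty result only means the files exist; it does not verify that the downloads are intact.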
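Training a Bert-vits2-v2.2 model
First download the training set. We use the English voice lines of the Genshin Impact character Yae Miko as the example; the dataset is available at: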
https://github.com/AI-Hobbyist/Genshin_Datasets
Then create the miko character directory:
mkdir miko
Name the voice annotation file esd.list and place it in the miko directory.
Place the sliced audio clips in the raw directory.
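A malformed annotation line is easier to catch before preprocessing than after. The sketch below checks esd.list entries under one assumption that the text above does not state: that each line has four fields separated by "|", in the order audio file, speaker, language, transcript. The function names are hypothetical, and the language codes ZH, JP and EN are inferred from the freeze_*_bert keys in the config.

```python
def parse_annotation_line(line):
    """Split one annotation line into its four assumed fields."""
    # maxsplit=3 keeps any '|' characters that appear inside the transcript.
    parts = line.rstrip("\n").split("|", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {len(parts)}: {line!r}")
    audio, speaker, language, text = parts
    return {"audio": audio, "speaker": speaker, "language": language, "text": text}


def find_bad_lines(lines, speaker="miko", languages=("ZH", "JP", "EN")):
    """Return (line_number, reason) pairs for lines that fail basic checks."""
    bad = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            entry = parse_annotation_line(line)
        except ValueError as exc:
            bad.append((number, str(exc)))
            continue
        if entry["speaker"] != speaker:
            bad.append((number, f"unexpected speaker {entry['speaker']!r}"))
        elif entry["language"] not in languages:
            bad.append((number, f"unexpected language {entry['language']!r}"))
        elif not entry["text"].strip():
            bad.append((number, "empty transcript"))
    return bad


if __name__ == "__main__":
    # Hypothetical sample lines; replace with the contents of miko/esd.list.
    sample = [
        "clip_0001.wav|miko|EN|Well, well, what do we have here?",
        "clip_0002.wav|miko|EN|",
    ]
    for number, reason in find_bad_lines(sample):
        print(f"line {number}: {reason}")
```

If your annotation file uses a different field order, adjust parse_annotation_line accordingly.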
Finally, create the configuration file miko/configs/config.json:
{
  "train": {
    "log_interval": 50,
    "eval_interval": 50,
    "seed": 42,
    "epochs": 1000,
    "learning_rate": 0.0002,
    "betas": [0.8, 0.99],
    "eps": 1e-09,
    "batch_size": 6,
    "fp16_run": false,
    "lr_decay": 0.99995,
    "segment_size": 16384,
    "init_lr_ratio": 1,
    "warmup_epochs": 0,
    "c_mel": 45,
    "c_kl": 1.0,
    "skip_optimizer": false,
    "freeze_ZH_bert": false,
    "freeze_JP_bert": false,
    "freeze_EN_bert": false
  },
  "data": {
    "training_files": "data/miko/train.list",
    "validation_files": "data/miko/val.list",
    "max_wav_value": 32768.0,
    "sampling_rate": 44100,
    "filter_length": 2048,
    "hop_length": 512,
    "win_length": 2048,
    "n_mel_channels": 128,
    "mel_fmin": 0.0,
    "mel_fmax": null,
    "add_blank": true,
    "n_speakers": 1,
    "cleaned_text": true,
    "spk2id": {
      "miko": 0
    }
  },
  "model": {
    "use_spk_conditioned_encoder": true,
    "use_noise_scaled_mas": true,
    "use_mel_posterior_encoder": false,
    "use_duration_discriminator": true,
    "inter_channels": 192,
    "hidden_channels": 192,
    "filter_channels": 768,
    "n_heads": 2,
    "n_layers": 6,
    "kernel_size": 3,
    "p_dropout": 0.1,
    "resblock": "1",
    "resblock_kernel_sizes": [3, 7, 11],
    "resblock_dilation_sizes": [
      [1, 3, 5],
      [1, 3, 5],
      [1, 3, 5]
    ],
    "upsample_rates": [8, 8, 2, 2, 2],
    "upsample_initial_channel": 512,
    "upsample_kernel_sizes": [16, 16, 8, 2, 2],
    "n_layers_q": 3,
    "use_spectral_norm": false,
    "gin_channels": 256
  },
  "version": "2.2"
}
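Note the "version": "2.2" entry here: the version number is the latest, v2.2.
Adjust the other parameters, such as batch_size, as appropriate for your hardware.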
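A quick sanity check on the config can catch a wrong version string or a speaker mapping that does not fit n_speakers. This is a sketch under assumptions: check_config is a hypothetical helper, and the rule that every speaker id must be below n_speakers is my reading of the config, not something the project documents here.

```python
import json


def check_config(config, expected_version="2.2"):
    """Return a list of human-readable problems found in a parsed config."""
    problems = []
    if config.get("version") != expected_version:
        problems.append(
            f"version is {config.get('version')!r}, expected {expected_version!r}"
        )
    data = config.get("data", {})
    spk2id = data.get("spk2id", {})
    n_speakers = data.get("n_speakers", 0)
    if not spk2id:
        problems.append("spk2id is empty")
    ids = list(spk2id.values())
    if len(set(ids)) != len(ids):
        problems.append("spk2id contains duplicate ids")
    # Assumption: ids index a speaker table of size n_speakers.
    if any(i < 0 or i >= n_speakers for i in ids):
        problems.append(f"speaker ids must be in the range 0..{n_speakers - 1}")
    return problems


if __name__ == "__main__":
    # A trimmed-down stand-in for miko/configs/config.json.
    sample = json.loads(
        '{"version": "2.2", "data": {"n_speakers": 1, "spk2id": {"miko": 0}}}'
    )
    issues = check_config(sample)
    print("config looks consistent" if not issues else "\n".join(issues))
```

To check the real file, load it with json.load and pass the resulting dictionary to check_config.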
Then start the preprocessing page:
python3 webui_preprocess.py
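Open http://127.0.0.1:7860/ in a browser.
Follow the steps on the page; the process is simple and convenient.
When preprocessing is finished, run the training command: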
python3 train_ms.py
The trained models are saved in the Data/miko/models directory, which looks like this:
E:\work\Bert-VITS2-v22\Data\miko\models>tree /f
Folder PATH listing for volume myssd
Volume serial number is 7CE3-15AE
E:.
│   DUR_0.pth
│   DUR_100.pth
│   DUR_150.pth
│   DUR_50.pth
│   D_0.pth
│   D_100.pth
│   D_150.pth
│   D_50.pth
│   events.out.tfevents.1702457087.ly.13044.0
│   events.out.tfevents.1702458207.ly.12416.0
│   githash
│   G_0.pth
│   G_100.pth
│   G_150.pth
│   G_50.pth
│   train.log
│
└───eval
        events.out.tfevents.1702457087.ly.13044.1
        events.out.tfevents.1702458207.ly.12416.1
This completes the training stage.
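The checkpoint names in the listing above carry the training step as a suffix, and a plain alphabetical sort puts G_50.pth after G_150.pth. The sketch below picks the newest checkpoint per prefix by comparing the step numerically. The helper name latest_checkpoints is hypothetical, and the G_, D_ and DUR_ naming pattern is taken from the listing rather than from project documentation.

```python
import re
from pathlib import Path

# Matches names such as G_150.pth, D_50.pth and DUR_100.pth from the listing.
CHECKPOINT = re.compile(r"^(G|D|DUR)_(\d+)\.pth$")


def latest_checkpoints(filenames):
    """Map each prefix (G, D, DUR) to the filename with the highest step."""
    best = {}
    for name in filenames:
        match = CHECKPOINT.match(name)
        if not match:
            continue  # skip train.log, tfevents files, and so on
        prefix, step = match.group(1), int(match.group(2))
        if prefix not in best or step > best[prefix][0]:
            best[prefix] = (step, name)
    return {prefix: name for prefix, (step, name) in best.items()}


if __name__ == "__main__":
    models_dir = Path("Data/miko/models")
    names = [p.name for p in models_dir.iterdir()] if models_dir.is_dir() else []
    print(latest_checkpoints(names))
```

With the files shown in the listing, the generator checkpoint selected is G_150.pth.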
Inference with the Bert-vits2-v2.2 model
For inference we use the page from the Bert-vits2-UI project. Clone the web project:
git clone https://github.com/jiangyuxiaoxiao/Bert-VITS2-UI
Place the Web project in the root directory of Bert-vits2-v2.2. The directory structure is as follows:
E:\work\Bert-VITS2-v22_lilith\Web>tree /f
Folder PATH listing for volume myssd
Volume serial number is 7CE3-15AE
E:.
│   index.html
│
├───assets
│       index-21bc6a28.css
│       index-402c0217.js
│
└───img
        helps1.png
        helps2.png
        Hiyori.ico
It contains the main page, the stylesheet, and the JS file, and is based on Hiyori.
Then start the inference page:
python3 server_fastapi.py
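Open http://127.0.0.1:5000/ in a browser.
Load a model and run inference.
You can also run inference through the FastAPI endpoints; in other words, sending an HTTP request returns the synthesized audio. The API schema is as follows: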
{ "openapi": "3.1.0", "info": { "title": "FastAPI", "version": "0.1.0" }, "paths": { "/": { "get": { "summary": "Index", "operationId": "index__get", "responses": { "200": { "description": "Successful Response", "content": { "application/json": { "schema": {} } } } } } }, "/voice": { "post": { "summary": "Voice", "description": "语音接口,若需要上传参考音频请仅使用post请求", "operationId": "voice_voice_post", "parameters": [ { "name": "model_id", "in": "query", "required": true, "schema": { "type": "integer", "description": "模型ID", "title": "Model Id" }, "description": "模型ID" }, { "name": "speaker_name", "in": "query", "required": false, "schema": { "type": "string", "description": "说话人名", "title": "Speaker Name" }, "description": "说话人名" }, { "name": "speaker_id", "in": "query", "required": false, "schema": { "type": "integer", "description": "说话人id,与speaker_name二选一", "title": "Speaker Id" }, "description": "说话人id,与speaker_name二选一" }, { "name": "sdp_ratio", "in": "query", "required": false, "schema": { "type": "number", "description": "SDP/DP混合比", "default": 0.2, "title": "Sdp Ratio" }, "description": "SDP/DP混合比" }, { "name": "noise", "in": "query", "required": false, "schema": { "type": "number", "description": "感情", "default": 0.2, "title": "Noise" }, "description": "感情" }, { "name": "noisew", "in": "query", "required": false, "schema": { "type": "number", "description": "音素长度", "default": 0.9, "title": "Noisew" }, "description": "音素长度" }, { "name": "length", "in": "query", "required": false, "schema": { "type": "number", "description": "语速", "default": 1, "title": "Length" }, "description": "语速" }, { "name": "language", "in": "query", "required": false, "schema": { "type": "string", "description": "语言", "title": "Language" }, "description": "语言" }, { "name": "auto_translate", "in": "query", "required": false, "schema": { "type": "boolean", "description": "自动翻译", "default": false, "title": "Auto Translate" }, "description": "自动翻译" }, { "name": "auto_split", "in": "query", 
"required": false, "schema": { "type": "boolean", "description": "自动切分", "default": false, "title": "Auto Split" }, "description": "自动切分" }, { "name": "emotion", "in": "query", "required": false, "schema": { "anyOf": [ { "type": "integer" }, { "type": "string" }, { "type": "null" } ], "description": "emo", "title": "Emotion" }, "description": "emo" } ], "requestBody": { "required": true, "content": { "multipart/form-data": { "schema": { "$ref": "#/components/schemas/Body_voice_voice_post" } } } }, "responses": { "200": { "description": "Successful Response", "content": { "application/json": { "schema": {} } } }, "422": { "description": "Validation Error", "content": { "application/json": { "schema": { "$ref": "#/components/schemas/HTTPValidationError" } } } } } }, "get": { "summary": "Voice", "description": "语音接口", "operationId": "voice_voice_get", "parameters": [ { "name": "text", "in": "query", "required": true, "schema": { "type": "string", "description": "输入文字", "title": "Text" }, "description": "输入文字" }, { "name": "model_id", "in": "query", "required": true, "schema": { "type": "integer", "description": "模型ID", "title": "Model Id" }, "description": "模型ID" }, { "name": "speaker_name", "in": "query", "required": false, "schema": { "type": "string", "description": "说话人名", "title": "Speaker Name" }, "description": "说话人名" }, { "name": "speaker_id", "in": "query", "required": false, "schema": { "type": "integer", "description": "说话人id,与speaker_name二选一", "title": "Speaker Id" }, "description": "说话人id,与speaker_name二选一" }, { "name": "sdp_ratio", "in": "query", "required": false, "schema": { "type": "number", "description": "SDP/DP混合比", "default": 0.2, "title": "Sdp Ratio" }, "description": "SDP/DP混合比" }, { "name": "noise", "in": "query", "required": false, "schema": { "type": "number", "description": "感情", "default": 0.2, "title": "Noise" }, "description": "感情" }, { "name": "noisew", "in": "query", "required": false, "schema": { "type": "number", "description": "音素长度", 
"default": 0.9, "title": "Noisew" }, "description": "音素长度" }, { "name": "length", "in": "query", "required": false, "schema": { "type": "number", "description": "语速", "default": 1, "title": "Length" }, "description": "语速" }, { "name": "language", "in": "query", "required": false, "schema": { "type": "string", "description": "语言", "title": "Language" }, "description": "语言" }, { "name": "auto_translate", "in": "query", "required": false, "schema": { "type": "boolean", "description": "自动翻译", "default": false, "title": "Auto Translate" }, "description": "自动翻译" }, { "name": "auto_split", "in": "query", "required": false, "schema": { "type": "boolean", "description": "自动切分", "default": false, "title": "Auto Split" }, "description": "自动切分" }, { "name": "emotion", "in": "query", "required": false, "schema": { "anyOf": [ { "type": "integer" }, { "type": "string" }, { "type": "null" } ], "description": "emo", "title": "Emotion" }, "description": "emo" } ], "responses": { "200": { "description": "Successful Response", "content": { "application/json": { "schema": {} } } }, "422": { "description": "Validation Error", "content": { "application/json": { "schema": { "$ref": "#/components/schemas/HTTPValidationError" } } } } } } }, "/models/info": { "get": { "summary": "Get Loaded Models Info", "description": "获取已加载模型信息", "operationId": "get_loaded_models_info_models_info_get", "responses": { "200": { "description": "Successful Response", "content": { "application/json": { "schema": {} } } } } } }, "/models/delete": { "get": { "summary": "Delete Model", "description": "删除指定模型", "operationId": "delete_model_models_delete_get", "parameters": [ { "name": "model_id", "in": "query", "required": true, "schema": { "type": "integer", "description": "删除模型id", "title": "Model Id" }, "description": "删除模型id" } ], "responses": { "200": { "description": "Successful Response", "content": { "application/json": { "schema": {} } } }, "422": { "description": "Validation Error", "content": { 
"application/json": { "schema": { "$ref": "#/components/schemas/HTTPValidationError" } } } } } } }, "/models/add": { "get": { "summary": "Add Model", "description": "添加指定模型:允许重复添加相同路径模型,且不重复占用内存", "operationId": "add_model_models_add_get", "parameters": [ { "name": "model_path", "in": "query", "required": true, "schema": { "type": "string", "description": "添加模型路径", "title": "Model Path" }, "description": "添加模型路径" }, { "name": "config_path", "in": "query", "required": false, "schema": { "type": "string", "description": "添加模型配置文件路径,不填则使用./config.json或../config.json", "title": "Config Path" }, "description": "添加模型配置文件路径,不填则使用./config.json或../config.json" }, { "name": "device", "in": "query", "required": false, "schema": { "type": "string", "description": "推理使用设备", "default": "cuda", "title": "Device" }, "description": "推理使用设备" }, { "name": "language", "in": "query", "required": false, "schema": { "type": "string", "description": "模型默认语言", "default": "ZH", "title": "Language" }, "description": "模型默认语言" } ], "responses": { "200": { "description": "Successful Response", "content": { "application/json": { "schema": {} } } }, "422": { "description": "Validation Error", "content": { "application/json": { "schema": { "$ref": "#/components/schemas/HTTPValidationError" } } } } } } }, "/models/get_unloaded": { "get": { "summary": "Get Unloaded Models Info", "description": "获取未加载模型", "operationId": "get_unloaded_models_info_models_get_unloaded_get", "parameters": [ { "name": "root_dir", "in": "query", "required": false, "schema": { "type": "string", "description": "搜索根目录", "default": "Data", "title": "Root Dir" }, "description": "搜索根目录" } ], "responses": { "200": { "description": "Successful Response", "content": { "application/json": { "schema": {} } } }, "422": { "description": "Validation Error", "content": { "application/json": { "schema": { "$ref": "#/components/schemas/HTTPValidationError" } } } } } } }, "/models/get_local": { "get": { "summary": "Get Local Models Info", 
"description": "获取全部本地模型", "operationId": "get_local_models_info_models_get_local_get", "parameters": [ { "name": "root_dir", "in": "query", "required": false, "schema": { "type": "string", "description": "搜索根目录", "default": "Data", "title": "Root Dir" }, "description": "搜索根目录" } ], "responses": { "200": { "description": "Successful Response", "content": { "application/json": { "schema": {} } } }, "422": { "description": "Validation Error", "content": { "application/json": { "schema": { "$ref": "#/components/schemas/HTTPValidationError" } } } } } } }, "/status": { "get": { "summary": "Get Status", "description": "获取电脑运行状态", "operationId": "get_status_status_get", "responses": { "200": { "description": "Successful Response", "content": { "application/json": { "schema": {} } } } } } }, "/tools/translate": { "get": { "summary": "Translate", "description": "翻译", "operationId": "translate_tools_translate_get", "parameters": [ { "name": "texts", "in": "query", "required": true, "schema": { "type": "string", "description": "待翻译文本", "title": "Texts" }, "description": "待翻译文本" }, { "name": "to_language", "in": "query", "required": true, "schema": { "type": "string", "description": "翻译目标语言", "title": "To Language" }, "description": "翻译目标语言" } ], "responses": { "200": { "description": "Successful Response", "content": { "application/json": { "schema": {} } } }, "422": { "description": "Validation Error", "content": { "application/json": { "schema": { "$ref": "#/components/schemas/HTTPValidationError" } } } } } } }, "/tools/random_example": { "get": { "summary": "Random Example", "description": "获取一个随机音频+文本,用于对比,音频会从本地目录随机选择。", "operationId": "random_example_tools_random_example_get", "parameters": [ { "name": "language", "in": "query", "required": false, "schema": { "type": "string", "description": "指定语言,未指定则随机返回", "title": "Language" }, "description": "指定语言,未指定则随机返回" }, { "name": "root_dir", "in": "query", "required": false, "schema": { "type": "string", "description": 
"搜索根目录", "default": "Data", "title": "Root Dir" }, "description": "搜索根目录" } ], "responses": { "200": { "description": "Successful Response", "content": { "application/json": { "schema": {} } } }, "422": { "description": "Validation Error", "content": { "application/json": { "schema": { "$ref": "#/components/schemas/HTTPValidationError" } } } } } } }, "/tools/get_audio": { "get": { "summary": "Get Audio", "operationId": "get_audio_tools_get_audio_get", "parameters": [ { "name": "path", "in": "query", "required": true, "schema": { "type": "string", "description": "本地音频路径", "title": "Path" }, "description": "本地音频路径" } ], "responses": { "200": { "description": "Successful Response", "content": { "application/json": { "schema": {} } } }, "422": { "description": "Validation Error", "content": { "application/json": { "schema": { "$ref": "#/components/schemas/HTTPValidationError" } } } } } } } }, "components": { "schemas": { "Body_voice_voice_post": { "properties": { "text": { "type": "string", "title": "Text" }, "reference_audio": { "type": "string", "format": "binary", "title": "Reference Audio" } }, "type": "object", "required": [ "text" ], "title": "Body_voice_voice_post" }, "HTTPValidationError": { "properties": { "detail": { "items": { "$ref": "#/components/schemas/ValidationError" }, "type": "array", "title": "Detail" } }, "type": "object", "title": "HTTPValidationError" }, "ValidationError": { "properties": { "loc": { "items": { "anyOf": [ { "type": "string" }, { "type": "integer" } ] }, "type": "array", "title": "Location" }, "msg": { "type": "string", "title": "Message" }, "type": { "type": "string", "title": "Error Type" } }, "type": "object", "required": [ "loc", "msg", "type" ], "title": "ValidationError" } } }
}
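As an illustration of the GET /voice endpoint in the schema above, here is a minimal client sketch that uses only the Python standard library. The parameter names (text, model_id, speaker_name, language, auto_split and so on) come from the schema; the helper names build_voice_url and fetch_voice are hypothetical, and treating the response body as raw audio bytes is an assumption, since the schema does not state the response format.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_voice_url(text, model_id, base_url="http://127.0.0.1:5000", **options):
    """Build a GET /voice URL; options are the optional query parameters."""
    params = {"text": text, "model_id": model_id}
    for key, value in options.items():
        if value is None:
            continue  # leave unset parameters to the server defaults
        if isinstance(value, bool):
            value = "true" if value else "false"
        params[key] = value
    return f"{base_url}/voice?{urlencode(params)}"


def fetch_voice(url, out_path):
    """Send the request and save the response body; returns the byte count."""
    # Assumption: the body is the synthesized audio itself.
    with urlopen(url) as response:
        data = response.read()
    with open(out_path, "wb") as handle:
        handle.write(data)
    return len(data)


if __name__ == "__main__":
    url = build_voice_url(
        "Hello there", 0, speaker_name="miko", language="EN", sdp_ratio=0.2
    )
    print(url)
    # With the server running and a model loaded, uncomment to save the audio:
    # fetch_voice(url, "miko.wav")
```

Uploading a reference audio file requires the POST variant with a multipart/form-data body, which this sketch does not cover.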
Finally, here is an all-in-one package for local training and inference with Bert-vits2-v2.2:
https://pan.baidu.com/s/1OVX9seRwZR6bZ-xsE_nRLg?pwd=v3uc
Shared with everyone; enjoy.