一文详解基于NarrotoAI的短剧短视频自动解说、混剪AI平台搭建

背景

前阵给孩子做电子相册学了点剪辑技术，就想凑个热闹剪剪短剧玩玩，一是学以致用，再者也好奇短剧创作为啥这么火，跟个风。

初步了解情况后，发现我的剪辑技术已经落后了，行家们玩的主要是解说，并且剪辑和解说AI也都会了。真的这么牛吗，最近我也一直在关注各种AI工具，因此花了两个晚上的时间研究体验了一下AI剪辑视频。本着能用免费技术决不充值的原则，在对比了各种AI工具后，选择了NarrotoAI这款体验了一下。体验之后的结果是，我感觉我又行了。

申明：本文纯技术贴。

短剧剪辑不仅是内容创作的热门领域，更是学习AI技术的绝佳实践场景。通过将AI工具深度融入短剧制作的各个环节（如解说台词生成、脚本生成，自动剪辑等），创作者可在完成作品的同时，系统掌握前沿技术，文中涉及大量AI相关工具，全部可以免费获得 。

如果您在复刻过程中遇到问题，请关注并留言交流。

环境要求

环境就用现成的，我手上就有一台huawei matebook pro 笔记本，操作系统win11。

硬件配置：
- CPU：i7-1260P（12核16线程，4.7GHz睿频）可高效处理视频解析与剪辑任务58。
- GPU：Iris Xe显卡（96 EU）支持视频编解码加速，需安装最新驱动以启用硬件加速86。Intel® Iris® Xe Graphics
软件工具：
- Python 3.8+：推荐使用Anaconda管理环境。
- FFmpeg：用于视频处理，需添加到系统环境变量。
- Git：克隆代码仓库。

Anaconda安装

已有公众号文章《人工智能学习必备工具之-Anaconda3安装、配置及优化》https://mp.weixin.qq.com/s/karflR2eWIOmb4NcMIrD3Q进行了详细介绍。

视频工具

安装 ImageMagick

ImageMagick 是一款功能强大的开源图像处理软件套件。具体介绍如下:

特点
- 跨平台性：可在 Linux、Windows、Mac OS 等大多数非专有的操作系统上运行。
- 免费开源：遵守 GPL 许可协议，全部源码开放，可自由使用、复制、修改和发布。
- 语言支持广泛：支持 Perl、C、C++、Python、PHP、Ruby、Java 等编程语言，并提供了相应的接口。
功能
- 格式转换：能将图像在超过 200 种格式之间相互转换，如常见的 JPEG、PNG、GIF、TIFF，以及特殊的 RAW、SVG 等格式。
- 基本变换：可以对图像进行改变尺寸、旋转、裁剪、翻转、修剪等操作。
- 特效处理：具备模糊、锐化、阈值处理、色彩调整等特效功能。
- 动画制作：能够将一组图片制作成 GIF 动画。
- 图像合成与编辑：可将几张图片合成为一张组合图片，还能在图片上添加文字、图形，为图片加边框或框架等。
应用场景
- Web 开发：自动生成缩略图、优化图像格式，提升网页加载速度。
- 电商平台：批量处理商品图像，如裁剪、加水印、统一格式等。
- 数据分析与机器学习：对数据集中的图像进行预处理，如调整大小、去噪等。
- 个人项目：批量整理图片、生成相册或进行日常图像处理。

ImageMagick下载地址：https://imagemagick.org/archive/binaries/ImageMagick-7.1.1-40-Q16-x64-static.exe

安装之后，打开 cmd验证一下能否正常运行：

FFmpeg安装

ffmpeg官方：https://www.ffmpeg.org/download.html。

安装之后验证一下：
在这里插入图片描述

安装NarratoAI

介绍

支持阿里QwenVL大模型，国内网络可用；支持短剧混剪功能，十分钟精彩不断；新增一键合并视频和字幕。

支持阿里QwenVL大模型，国内网络可用
这次升级有了QwenVL大模型的视频理解能力，而且国内网络就能用，还有免费额度哦。
支持短剧混剪功能，十分钟精彩不断
工具现在支持短剧混剪，最长支持解析 10 分钟的视频。
优化时间戳到毫秒级，剪辑超精准
时间戳精确到毫秒了，这对剪辑特别有用。
新增一键合并视频和字幕，素材整理快人一步
新增的合并视频和字幕功能很方便。
脚本上传，创作按部就班
有了脚本上传功能
一键清理缓存，工具运行超流畅
要是工具用久了有点卡，别担心。
一键转录超便捷，文字提取超轻松
这个一键转录功能超实用。
支持 TTS生成失败支持自动重试

NarratoAI下载代码

# 克隆代码仓库
$ git clone https://github.com/linyqh/NarratoAI.git
Cloning into 'NarratoAI'...
remote: Enumerating objects: 1198, done.
remote: Counting objects: 100% (290/290), done.
remote: Compressing objects: 100% (131/131), done.
remote: Total 1198 (delta 178), reused 161 (delta 159), pack-reused 908 (from 2)
Receiving objects: 100% (1198/1198), 7.30 MiB | 3.15 MiB/s, done.
Resolving deltas: 100% (759/759), done.

配置python虚拟环境

直接使用pip安装依赖包，会出一些安装错误，解决起来比较费时间，推荐使用Anaconda3环境

# 创建 python 虚拟环境
conda create -n env_name python=3.10 -y
conda activate env_name
python -V#执行结果如下： 
(base) C:\Users\seane>conda activate env_name
(env_name) C:\Users\seane>python -V
Python 3.10.16(env_name) C:\Users\seane>

安装NarratoAI依赖库

#进入前面下载的代码目录 
cd NarratoAI# 安装依赖，如果出错，请参考后文中问题总结 
pip install -r requirements.txt# 安装 pytorch (无 GPU 的电脑可选)
pip3 install torch torchvision torchaudio

注：针对于intel集成显卡的（huawei matebook pro 2022)的pytorch的安装将在后续的文章中详细介绍。pytorch是一个python的AI工具套件，基于intel集成显卡的pytorch在性能上要优于基于cpu的版本。

安装成后窗口显示如下：

Successfully installed aiohappyeyeballs-2.4.6 aiohttp-3.10.11 aiosignal-1.3.2 altair-5.5.0 anyio-4.8.0 appdirs-1.4.4 attrs-25.1.0 av-12.3.0 azure-cognitiveservices-speech-1.37.0 blinker-1.9.0 brotli-1.1.0 cachetools-5.5.2 certifi-2025.1.31 chardet-5.2.0 charset-normalizer-3.4.1 click-8.1.8 colorama-0.4.6 coloredlogs-15.0.1 ctranslate2-4.5.0 dashscope-1.15.0 decorator-4.4.2 distro-1.9.0 edge-tts-6.1.19 fastapi-0.115.8 faster-whisper-1.0.3 flatbuffers-25.2.10 frozenlist-1.5.0 g4f-0.3.0.10 git-changelog-2.5.3 gitdb-4.0.12 gitpython-3.1.44 google-ai-generativelanguage-0.6.15 google-api-core-2.24.1 google-api-python-client-2.161.0 google-auth-2.38.0 google-auth-httplib2-0.2.0 google.generativeai-0.8.4 googleapis-common-protos-1.68.0 grpcio-1.70.0 grpcio-status-1.70.0 h11-0.14.0 httpcore-1.0.7 httplib2-0.22.0 httpx-0.27.2 huggingface-hub-0.29.1 humanfriendly-10.0 idna-3.10 imageio-2.37.0 imageio_ffmpeg-0.6.0 jiter-0.8.2 joblib-1.4.2 jsonschema-4.23.0 jsonschema-specifications-2024.10.1 loguru-0.7.3 markdown-it-py-3.0.0 mdurl-0.1.2 moviepy-2.0.0.dev2 multidict-6.1.0 narwhals-1.27.1 onnxruntime-1.20.1 openai-1.53.1 opencv-python-4.10.0.84 pandas-2.2.3 pillow-10.3.0 proglog-0.1.10 propcache-0.3.0 proto-plus-1.26.0 protobuf-5.29.3 pyarrow-19.0.1 pyasn1-0.6.1 pyasn1-modules-0.4.1 pycryptodome-3.21.0 pydantic-2.6.4 pydantic-core-2.16.3 pydeck-0.9.1 pydub-0.25.1 pygments-2.19.1 pyparsing-3.2.1 pyreadline3-3.5.4 pysrt-1.1.2 python-dateutil-2.9.0.post0 python-dotenv-1.0.1 python-multipart-0.0.20 pytz-2025.1 pyyaml-6.0.2 redis-5.0.3 referencing-0.36.2 regex-2024.11.6 requests-2.31.0 rich-13.9.4 rpds-py-0.23.1 rsa-4.9 safetensors-0.5.2 scikit-learn-1.5.2 scipy-1.15.2 semver-3.0.4 six-1.17.0 smmap-5.0.2 sniffio-1.3.1 starlette-0.45.3 streamlit-1.40.2 tenacity-9.0.0 threadpoolctl-3.5.0 tiktoken-0.8.0 tokenizers-0.21.0 toml-0.10.2 tomli-2.0.2 tornado-6.4.2 tqdm-4.67.1 transformers-4.47.0 tzdata-2025.1 uritemplate-4.1.1 urllib3-2.2.3 uvicorn-0.27.1 watchdog-5.0.2 win32-setctime-1.2.0 yarl-1.18.3 yt-dlp-2024.11.18

AI模型准备

语音模型Whisper

Whisper 模型用于生成字幕，转录视频，CPU和GPU均可运行，默认CPU；GPU会更快

下载地址：https://huggingface.co/guillaumekln/faster-whisper-large-v2
解压到narotoAI目： NarratoAI/app/models 。

预训练语言模型bert

下载 bert 模型: github.com
解压文件

视频理解模型（Gemini)

Gemini是用于视频理解的大模型 ,是一个在线推荐模型，可通过api访问，需要申请 api key。

注：如果无法访问google网站，可跳转到下一个章节，使用基于qwen的视频模型。

访问Google AI Studio申请API Key。
注册并登陆： Get api key，并保存，留待后面配置使用。

视频理解模型（QwenVL）

登陆https://bailian.console.aliyun.com/，注册账号
实名认证账号中心
开通模型
选择 VL-max latest
申请api key 并保存，留待后面配置使用。

运行及配置修改

首次运行生成配置文件

**注：先配置python虚拟环境后再运行工具。**激活方法参考前面章节。

首次运行NarratoAI后，会在NarratoAI目录下生成配置文件：config.toml。

(pytorch251) d:\code\NarratoAI>streamlit run webui.pyWelcome to Streamlit!If you’d like to receive helpful onboarding emails, news, offers, promotions,and the occasional swag, please enter your email address below. Otherwise,leave this field blank.Email:You can find our privacy policy at https://streamlit.io/privacy-policySummary:- This open source library collects usage statistics.- We cannot see and do not store information contained inside Streamlit apps,such as text, charts, images, etc.- Telemetry data is stored in servers in the United States.- If you'd like to opt out, add the following to %userprofile%/.streamlit/config.toml,creating that file if necessary:[browser]gatherUsageStats = falseYou can now view your Streamlit app in your browser.Local URL: http://localhost:8501Network URL: http://192.168.1.14:85012025-02-23 15:45:16.758 | INFO     | app.config.config:load_config:20 - copy config.example.toml to config.toml
2025-02-23 15:45:16.774 | INFO     | app.config.config:load_config:22 - load config from file: D:\code\NarratoAI/config.toml
2025-02-23 15:45:16.803 | INFO     | app.config.config:<module>:71 - NarratoAI v0.3.9
2025-02-23 15:46:01 | INFO | "./app\utils\utils.py:589": init_resources - 已复制系统字体: simhei.ttf
2025-02-23 15:46:02.242 Examining the path of torch.classes raised: Tried to instantiate class '__path__._path', but it does not exist! Ensure that it is registered via torch::class_

基本配置

Gemini API Key（vision_gemini_api_key）

将申请到API Key填入项目根目录的config.toml文件：

    ########## Vision Gemini API Keyvision_gemini_api_key = "xxxx"

text_openai_api_key

生成文案的大模型 API Key；建议不要再使用 Gemini 模型生成文案。国内有很多免费好用的api接口可用。比如火山方舟的doubao, Deepseek等。

ImageMagick路径指定

ffmpeg路径指定

ffmpeg_path = "D:\\AppGallery\\bin\\ffmpeg.exe"

proxy.http 和 proxy.https

配置字体和BGM

下载字体文件

下载地址：https://zenodo.org/records/13293144/files/STHeitiMedium.ttc

放置目录： NarratoAI\resource\fonts

下载BGM文件
1. 下载地址：https://zenodo.org/records/13293150/files/output000.mp3
2. 放置目录： NarratoAI\resource\songs

再次运行

修改完配置文件后，再次启动NarratoAI

(pytorch251) d:\code\NarratoAI>streamlit run webui.py

浏览器访问：

http://localhost:8501/

QwenVL 模型配置（国内使用强烈推荐）

如果Gemini模型无法访问，建议切换到QwenVL模型。

将之前申请的api key配置到 NarratoAI web界面中，
测试连接，显示 QwenVL 模型可用

视频剪辑实例

视频分析实例一:生成解说脚本

选择脚本模板，并上传视频文件，文件小于200M, 大文件可按提示保存到NarratoAI\resource\videos

上传一个视频后，点击AI生成画面解说脚本。

[{"timestamp": "00:00:38,500-00:00:38,500","picture": "画面中显示两个人在一个室内环境中。左边的人背对着镜头，穿着深色上衣和红色裤子，头发扎成马尾。右边的人面向镜头，穿着同样的深色上衣和红色裤子，双手张开，似乎在做某种动作或表演。背景是一扇白色的门和浅蓝色的墙壁，地板是灰色的瓷砖。画面上方有红色的文字“来到你的面前”，右下角有抖音的标志和一些文字信息。","narration": "俩同款着装室内整活","OST": 2,"new_timestamp": "00:00:00,000-00:00:00,000"}
]

后台信息：

UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 1579: illegal multibyte sequence
2025-02-23 22:58:51 | INFO | "./app\utils\video_processor_v2.py:312": process_video_pipeline -
步骤2: 从压缩视频提取关键帧...
2025-02-23 22:58:51 | INFO | "./app\utils\video_processor_v2.py:234": process_video - 读取视频帧...
读取视频: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7319/7319 [00:06<00:00, 1049.69it/s]
2025-02-23 22:58:58 | INFO | "./app\utils\video_processor_v2.py:246": process_video - 检测场景边界...█████████████████████████████████████████████████████████████████▍| 7289/7319 [00:06<00:00, 373.21it/s]
2025-02-23 22:59:08 | INFO | "./app\utils\video_processor_v2.py:248": process_video - 检测到 1 个场景边界2025-02-23 22:59:17 | INFO | "./app\utils\video_processor_v2.py:290": process_video_pipeline - 步骤1: 压缩视频...                                                                      | 0/1 [00:00<?, ?it/s]
Exception in thread Thread-69 (_readerthread):
Traceback (most recent call last):File "D:\AppGallery\Anaconda3\envs\pytorch251\Lib\threading.py", line 1075, in _bootstrap_innerself.run()File "D:\AppGallery\Anaconda3\envs\pytorch251\Lib\threading.py", line 1012, in runself._target(*self._args, **self._kwargs)File "D:\AppGallery\Anaconda3\envs\pytorch251\Lib\subprocess.py", line 1601, in _readerthreadbuffer.append(fh.read())^^^^^^^^^
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa2 in position 1603: illegal multibyte sequence
2025-02-23 22:59:27 | INFO | "./app\utils\video_processor_v2.py:312": process_video_pipeline -
步骤2: 从压缩视频提取关键帧...
2025-02-23 22:59:28 | INFO | "./app\utils\video_processor_v2.py:234": process_video - 读取视频帧...
读取视频: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3359/3359 [00:11<00:00, 295.50it/s]
2025-02-23 22:59:39 | INFO | "./app\utils\video_processor_v2.py:246": process_video - 检测场景边界...
2025-02-23 22:59:48 | INFO | "./app\utils\video_processor_v2.py:248": process_video - 检测到 1 个场景边界████████████████████████████████████████████████████████████▏ | 3319/3359 [00:10<00:00, 456.36it/s]
提取关键帧: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:56<00:00, 56.25s/it]
保存压缩关键帧: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 16.25it/s]
2025-02-23 23:00:47 | INFO | "./app\utils\video_processor_v2.py:317": process_video_pipeline - ███████████████████████████████████████████████████████████████████████████████| 1/1 [00:56<00:00, 56.23s/it]
步骤3: 提取高清关键帧...                                                                                                                                                              | 0/1 [00:00<?, ?it/s]
提取高清帧: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  2.83it/s]
2025-02-23 23:00:47 | INFO | "./app\utils\video_processor_v2.py:194": extract_frames_by_numbers - 共提取了 1 个不同时间戳的帧
2025-02-23 23:00:47 | INFO | "./app\utils\video_processor_v2.py:325": process_video_pipeline - 处理完成！高清关键帧保存在: .\storage\temp\keyframes\37d1b60812267477b6ac5d5c610b737d[00:00<00:00,  2.84it/s]
2025-02-23 23:00:48 | INFO | "./app\utils\video_processor_v2.py:370": process_video_pipeline - 临时文件已清理
2025-02-23 23:00:48 | DEBUG | "./webui\tools\generate_script_docu.py:106": generate_script_docu - Vision LLM 提供商: qwenvl
2025-02-23 23:00:49 | INFO | "./app\utils\qwenvl_analyzer.py:121": analyze_images - 正在加载图片...
分析进度: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:04<00:00,  4.82s/it]
2025-02-23 23:00:56 | DEBUG | "./webui\tools\generate_script_docu.py:162": generate_script_docu - 批次 0 处理完成，共 1 张图片
2025-02-23 23:00:56 | DEBUG | "./webui\tools\generate_script_docu.py:166": generate_script_docu - 处理时间戳: 00:00:38,500-00:00:38,500███████████████████████████████████████| 1/1 [00:04<00:00,  4.81s/it]
2025-02-23 23:00:56 | DEBUG | "./webui\tools\generate_script_docu.py:211": generate_script_docu - 添加帧内容: 时间范围=00:00:38,500-00:00:38,500, 分析结果长度=149
2025-02-23 23:00:58 | INFO | "./app\utils\script_generator.py:319": __init__ - 文本 LLM 提供商: ep-20241231163508-gh4jr
2025-02-23 23:00:59 | WARNING | "./app\utils\script_generator.py:101": __init__ - 未找到模型 ep-20241231163508-gh4jr 的专用编码器，使用默认编码器
2025-02-23 23:01:23 | DEBUG | "./app\utils\script_generator.py:430": calculate_duration_and_word_count - 时间范围 00:00:38,500-00:00:38,500 的持续时间为 0.000秒, 估算字数: 10
2025-02-23 23:01:24 | INFO | "./app\utils\script_generator.py:443": process_frames - 时间范围: 00:00:38,500-00:00:38,500, 建议字数: 10
2025-02-23 23:01:24 | INFO | "./app\utils\script_generator.py:444": process_frames - 俩同款着装室内整活
2025-02-23 23:01:24 | INFO | "./app\utils\script_generator.py:514": _save_results - 保存脚本成功，总时长: 00:00:00,000
2025-02-23 23:01:24 | INFO | "./webui\tools\generate_script_docu.py:253": generate_script_docu - 脚本生成完成
2025-02-23 23:01:39.776 Examining the path of torch.classes raised: Tried to instantiate class '__path__._path',

视频自动解说案例二:短剧剪辑

跟上一节操作方法一样，上传视频，生成解说脚本，以后按照以下顺序，依次“脚本格式检查” -> “保存脚本” -> “裁剪视频” -> “生成视频”

裁剪视频成功：

生成视频成功：

处理过程日志

2025-02-24 10:04:55.761 | INFO     | __main__:render_generate_button:128 - 开始生成视频2025-02-24 10:04:55.761 | INFO     | app.services.task:start_subclip:162 - ## 开始任务: 6d2dfc04-0de9-4fb3-aedb-6e200c49cead2025-02-24 10:04:55.796 | INFO     | app.services.task:start_subclip:173 - ## 1. 加载视频脚本2025-02-24 10:04:55.825 | DEBUG    | app.services.task:start_subclip:185 - 解说完整脚本: 
瞧瞧这几人各怀心思，围绕赎金展开拉扯啦 瞧这室内几人着装各异 拉扯还在继续 瞧这室内拉扯 各怀心思忙 看这室内众人 神色各有千秋 看这室内众人 神色各有千秋
黑皮女后绿装男 业务网游挖币 众人着装神态超有趣 众人着装神态妙后续更逗 众人着装神态妙后续更逗，且看这几人要弄啥幺蛾子 深色西装男先亮相，俩严肃女登场又有啥戏？ 深绿西装男室内要干啥？2025-02-24 10:04:55.825 | DEBUG    | app.services.task:start_subclip:186 - 解说 OST 列表: 
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2]2025-02-24 10:04:55.825 | DEBUG    | app.services.task:start_subclip:187 - 解说时间戳列表: 
['00:00:03,240-00:00:12,160', '00:00:14,000-00:00:22,079', '00:00:23,120-00:00:29,079', '00:00:30,440-00:00:36,960', '00:00:38,200-00:00:47,920', '00:00:48,759-00:00:52,679', '00:00:54,039-00:01:00,240', '00:01:01,840-00:01:11,079', '00:01:12,640-00:01:22,239', '00:01:24,319-00:01:26,719']2025-02-24 10:04:55.839 | INFO     | app.services.task:start_subclip:201 - ## 2. 根据OST设置生成音频列表2025-02-24 10:04:55.856 | DEBUG    | app.services.task:start_subclip:207 - 需要生成TTS的片段数: 102025-02-24 10:04:55.877 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音频2025-02-24 10:04:57.453 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_00,000-00_00_08,919.mp32025-02-24 10:04:57.453 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音频文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_00,000-00_00_08,919.mp32025-02-24 10:04:57.453 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音频2025-02-24 10:04:59.108 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_08,919-00_00_16,999.mp32025-02-24 10:04:59.108 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音频文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_08,919-00_00_16,999.mp32025-02-24 10:04:59.108 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音频2025-02-24 10:05:10.965 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_16,999-00_00_22,958.mp32025-02-24 10:05:10.965 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音频文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_16,999-00_00_22,958.mp32025-02-24 10:05:10.965 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音频2025-02-24 10:05:12.429 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_22,958-00_00_29,478.mp32025-02-24 10:05:12.445 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音频文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_22,958-00_00_29,478.mp32025-02-24 10:05:12.445 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音频2025-02-24 10:05:14.044 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_29,478-00_00_39,198.mp32025-02-24 10:05:14.060 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音频文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_29,478-00_00_39,198.mp32025-02-24 10:05:14.060 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音频2025-02-24 10:05:15.609 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_39,198-00_00_43,118.mp32025-02-24 10:05:15.610 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音频文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_39,198-00_00_43,118.mp32025-02-24 10:05:15.610 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音频2025-02-24 10:05:17.475 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_43,118-00_00_49,319.mp32025-02-24 10:05:17.486 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音频文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_43,118-00_00_49,319.mp32025-02-24 10:05:17.489 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音频2025-02-24 10:05:19.217 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_49,319-00_00_58,557.mp32025-02-24 10:05:19.217 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音频文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_49,319-00_00_58,557.mp32025-02-24 10:05:19.217 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音频2025-02-24 10:05:20.749 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_58,557-00_01_08,156.mp32025-02-24 10:05:20.749 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音频文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_58,557-00_01_08,156.mp32025-02-24 10:05:20.753 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音频2025-02-24 10:05:22.315 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_01_08,156-00_01_10,556.mp32025-02-24 10:05:22.315 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音频文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_01_08,156-00_01_10,556.mp32025-02-24 10:05:22.318 | INFO     | app.services.task:start_subclip:228 - 合并音频文件: ['D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_00,000-00_00_08,919.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_08,919-00_00_16,999.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_16,999-00_00_22,958.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_22,958-00_00_29,478.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_29,478-00_00_39,198.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_39,198-00_00_43,118.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_43,118-00_00_49,319.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_49,319-00_00_58,557.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_58,557-00_01_08,156.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_01_08,156-00_01_10,556.mp3']2025-02-24 10:05:27.283 | INFO     | app.services.audio_merger:merge_audio_files:73 - 合并后的音频文件已保存: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\final_audio.mp32025-02-24 10:05:27.283 | INFO     | app.services.task:start_subclip:237 - 音频文件合并成功2025-02-24 10:05:27.299 | INFO     | app.services.task:start_subclip:263 - ## 3. 生成字幕、提供程序是: faster-whisper-large-v22025-02-24 10:05:27.299 | INFO     | app.services.subtitle:create:69 - 未检测到 CUDA，使用 CPU 模式2025-02-24 10:05:27.302 | INFO     | app.services.subtitle:create:78 - 使用 CPU 加载模型: ./app/models/faster-whisper-large-v22025-02-24 10:05:38.484 | INFO     | app.services.subtitle:create:86 - 模型加载完成，使用设备: cpu, 计算类型: int82025-02-24 10:05:38.484 | INFO     | app.services.subtitle:create:88 - start, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\subtitle.srt2025-02-24 10:06:00.072 | INFO     | app.services.subtitle:create:101 - 检测到的语言: 'zh', probability: 1.002025-02-24 10:06:44.076 | DEBUG    | app.services.subtitle:recognized:114 - [0.00s -> 1.82s] 瞧瞧这几人各怀心思2025-02-24 10:06:44.077 | DEBUG    | app.services.subtitle:recognized:114 - [2.20s -> 3.84s] 围绕赎金展开拉扯了2025-02-24 10:06:44.078 | DEBUG    | app.services.subtitle:recognized:114 - [9.12s -> 11.00s] 瞧这室内几人着装各异2025-02-24 10:06:44.080 | DEBUG    | app.services.subtitle:recognized:114 - [11.36s -> 12.52s] 拉扯还在继续2025-02-24 10:06:44.081 | DEBUG    | app.services.subtitle:recognized:114 - [17.16s -> 18.36s] 瞧这室内拉扯2025-02-24 10:06:44.082 | DEBUG    | app.services.subtitle:recognized:114 - [18.72s -> 19.70s] 各怀心思忙2025-02-24 10:06:44.084 | DEBUG    | app.services.subtitle:recognized:114 - [23.14s -> 24.28s] 看这室内众人2025-02-24 10:06:44.085 | DEBUG    | app.services.subtitle:recognized:114 - [24.60s -> 25.86s] 神色各有千秋2025-02-24 10:06:44.086 | DEBUG    | app.services.subtitle:recognized:114 - [29.67s -> 30.77s] 看这室内众人2025-02-24 10:06:44.088 | DEBUG    | app.services.subtitle:recognized:114 - [31.11s -> 32.35s] 神色各有千秋2025-02-24 10:06:44.089 | DEBUG    | app.services.subtitle:recognized:114 - [33.03s -> 34.45s] 黑皮女后绿妆男2025-02-24 10:06:44.107 | DEBUG    | app.services.subtitle:recognized:114 - [34.81s -> 35.85s] 业务网游挖币2025-02-24 10:06:44.110 | DEBUG    | app.services.subtitle:recognized:114 - [39.38s -> 41.16s] 众人着装神态超有趣2025-02-24 10:06:44.112 | DEBUG    | app.services.subtitle:recognized:114 - [43.31s -> 44.65s] 众人着装神态妙2025-02-24 10:06:44.113 | DEBUG    | app.services.subtitle:recognized:114 - [44.91s -> 45.65s] 后续更逗2025-02-24 10:07:28.051 | DEBUG    | app.services.subtitle:recognized:114 - [51.67s -> 54.31s] 且看这几人要弄啥幺蛾子2025-02-24 10:07:28.052 | DEBUG    | app.services.subtitle:recognized:114 - [58.77s -> 60.45s] 深色西装男先亮相2025-02-24 10:07:28.052 | DEBUG    | app.services.subtitle:recognized:114 - [60.87s -> 61.99s] 雅颜素女登场2025-02-24 10:07:28.052 | DEBUG    | app.services.subtitle:recognized:114 - [62.19s -> 62.89s] 又有啥戏2025-02-24 10:07:28.052 | DEBUG    | app.services.subtitle:recognized:114 - [68.36s -> 70.42s] 深绿西装男室内要干啥2025-02-24 10:07:28.052 | INFO     | app.services.subtitle:create:164 - complete, elapsed: 87.97 s2025-02-24 10:07:28.052 | INFO     | app.services.subtitle:create:181 - subtitle file created: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\subtitle.srt2025-02-24 10:07:28.067 | INFO     | app.services.task:start_subclip:277 - ## 4. 裁剪视频2025-02-24 10:07:28.083 | INFO     | app.services.task:start_subclip:295 - ## 5. 合并视频: => .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\combined.mp42025-02-24 10:07:28.083 | INFO     | app.services.video:combine_clip_videos:130 - 音频的最大持续时间: 70.55699999999999 s2025-02-24 10:07:28.491 | INFO     | app.services.video:combine_clip_videos:155 - 视频 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-00-03_240-00-00-12_160.mp4 已调整尺寸为 1080 x 19202025-02-24 10:07:28.799 | INFO     | app.services.video:combine_clip_videos:155 - 视频 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-00-14_000-00-00-22_079.mp4 已调整尺寸为 1080 x 19202025-02-24 10:07:29.087 | INFO     | app.services.video:combine_clip_videos:155 - 视频 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-00-23_120-00-00-29_079.mp4 已调整尺寸为 1080 x 19202025-02-24 10:07:29.588 | INFO     | app.services.video:combine_clip_videos:155 - 视频 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-00-30_440-00-00-36_960.mp4 已调整尺寸为 1080 x 19202025-02-24 10:07:30.340 | INFO     | app.services.video:combine_clip_videos:155 - 视频 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-00-38_200-00-00-47_920.mp4 已调整尺寸为 1080 x 19202025-02-24 10:07:31.063 | INFO     | app.services.video:combine_clip_videos:155 - 视频 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-00-48_759-00-00-52_679.mp4 已调整尺寸为 1080 x 19202025-02-24 10:07:31.767 | INFO     | app.services.video:combine_clip_videos:155 - 视频 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-00-54_039-00-01-00_240.mp4 已调整尺寸为 1080 x 19202025-02-24 10:07:32.547 | INFO     | app.services.video:combine_clip_videos:155 - 视频 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-01-01_840-00-01-11_079.mp4 已调整尺寸为 1080 x 19202025-02-24 10:07:32.819 | INFO     | app.services.video:combine_clip_videos:155 - 视频 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-01-12_640-00-01-22_239.mp4 已调整尺寸为 1080 x 19202025-02-24 10:07:33.227 | INFO     | app.services.video:combine_clip_videos:155 - 视频 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-01-24_319-00-01-26_719.mp4 已调整尺寸为 1080 x 19202025-02-24 10:07:33.243 | INFO     | app.services.video:combine_clip_videos:170 - 开始合并视频... (过程中出现 UserWarning: 不必理会)2025-02-24 10:08:26.147 | SUCCESS  | app.services.video:combine_clip_videos:184 - 视频合并完成2025-02-24 10:08:26.147 | INFO     | app.services.task:start_subclip:311 - ## 6. 最后合成: 1 => .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\final-1.mp42025-02-24 10:08:26.157 | WARNING  | app.utils.utils:get_bgm_file:153 - 在目录 .\resource\songs 中没有找到 MP3 或 FLAC 文件2025-02-24 10:08:26.657 | INFO     | app.services.video:generate_video_v3:331 - 读取到 20 条字幕2025-02-24 10:08:31.316 | INFO     | app.services.video:generate_video_v3:387 - 成功创建 20 条字幕剪辑2025-02-24 10:08:31.332 | DEBUG    | app.services.video:generate_video_v3:397 - 音量配置: {'original': 0.7, 'bgm': 0.3, 'narration': 1.0}2025-02-24 10:08:31.483 | INFO     | app.services.video:generate_video_v3:429 - 开始导出视频...
2025-02-24 10:10:58.073 | INFO     | app.services.video:generate_video_v3:436 - 视频已导出到: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\final-1.mp42025-02-24 10:10:58.079 | SUCCESS  | app.services.task:start_subclip:358 - 任务 6d2dfc04-0de9-4fb3-aedb-6e200c49cead 已完成, 生成 1 个视频.2025-02-24 10:10:58.321 | INFO     | __main__:render_generate_button:165 - 视频生成完成2025-02-24 10:10:58.508 | DEBUG    | webui.utils.performance:monitor_memory:12 - Memory usage: 2703.35 MB