Record of the commands that worked
conda create -n python3.7.12 python==3.7.12
conda activate python3.7.12
(python3.7.12) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ python -V
Python 3.7.12
(python3.7.12) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ python3 -m venv venv3712
(python3.7.12) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ source venv3712/bin/activate
(venv3712) (python3.7.12) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ pip install -r requirements_2080.txt
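The approach: create the Python environment with conda, then use Python's venv module to create a virtual environment inside the project directory.
Git information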
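To confirm that the two-layer setup (conda environment plus project-local venv) is actually active, a small check can be run with the interpreter in question. This is a minimal sketch and not part of M4Singer; the function names `in_venv` and `version_matches` are hypothetical helpers introduced only for illustration.

```python
import sys


def in_venv(prefix=None, base_prefix=None):
    """Return True when the interpreter runs inside a venv.

    Inside a venv, sys.prefix points at the venv directory while
    sys.base_prefix points at the interpreter it was created from
    (here that would be the conda environment).
    """
    prefix = sys.prefix if prefix is None else prefix
    if base_prefix is None:
        base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return prefix != base_prefix


def version_matches(expected=(3, 7), actual=None):
    """Return True when the major.minor version equals `expected`."""
    actual = sys.version_info if actual is None else actual
    return tuple(actual[:2]) == tuple(expected)


if __name__ == "__main__":
    print("inside venv:", in_venv())
    print("python 3.7 :", version_matches())
```

Run it with the venv's `python` after `source venv3712/bin/activate`; both lines should report True if the environment matches the one described here.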
(venv3712) (python3.7.12) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ git branch
* master
(venv3712) (python3.7.12) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ git log
commit 45ec525a0834b3f12605120eb36efe992c1f5455 (grafted, HEAD -> master, origin/master, origin/HEAD)
Author: m4singer <18866416692>
Date: Thu Dec 29 18:22:57 2022 +0800
init
(venv3712) (python3.7.12) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ git remote -v
origin https://github.com/M4Singer/M4Singer (fetch)
origin https://github.com/M4Singer/M4Singer (push)
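The GitHub page does not say which Python 3 version is required. Testing showed that the dependencies install cleanly with Python 3.7.12, while every other version I tried failed. The logs are recorded below.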
(venv) (python3.9.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ pip install numpy==1.11
gcc: build/src.linux-x86_64-3.9/numpy/core/src/multiarray/lowlevel_strided_loops.c
gcc: numpy/core/src/multiarray/mapping.c
gcc: numpy/core/src/multiarray/methods.c
gcc: numpy/core/src/multiarray/multiarraymodule.c
gcc: build/src.linux-x86_64-3.9/numpy/core/src/multiarray/nditer_templ.c
gcc: numpy/core/src/multiarray/nditer_api.c
gcc: numpy/core/src/multiarray/nditer_constr.c
gcc: numpy/core/src/multiarray/nditer_pywrap.c
gcc: numpy/core/src/multiarray/number.c
gcc: numpy/core/src/multiarray/numpymemoryview.c
gcc: numpy/core/src/multiarray/numpyos.c
numpy/core/src/multiarray/numpyos.c:18:10: fatal error: xlocale.h: No such file or directory
18 | #include <xlocale.h>
| ^~~~~~~~~~~
compilation terminated.
error: Command "gcc -pthread -B /home/yeqiang/miniconda3/envs/python3.9.18/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/yeqiang/miniconda3/envs/python3.9.18/include -fPIC -O2 -isystem /home/yeqiang/miniconda3/envs/python3.9.18/include -fPIC -DHAVE_NPY_CONFIG_H=1 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE=1 -D_LARGEFILE64_SOURCE=1 -Ibuild/src.linux-x86_64-3.9/numpy/core/src/private -Inumpy/core/include -Ibuild/src.linux-x86_64-3.9/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -I/home/yeqiang/Downloads/ai/M4Singer/code/venv/include -I/home/yeqiang/miniconda3/envs/python3.9.18/include/python3.9 -Ibuild/src.linux-x86_64-3.9/numpy/core/src/private -Ibuild/src.linux-x86_64-3.9/numpy/core/src/private -Ibuild/src.linux-x86_64-3.9/numpy/core/src/private -c numpy/core/src/multiarray/numpyos.c -o build/temp.linux-x86_64-3.9/numpy/core/src/multiarray/numpyos.o" failed with exit status 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> numpy
note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
[notice] A new release of pip is available: 23.0.1 -> 23.2.1
[notice] To update, run: pip install --upgrade pip
(venv) (python3.9.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$
(venv) (python3.9.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ pip install wheel
Looking in indexes: http://mirrors.aliyun.com/pypi/simple/
Collecting wheel
Using cached http://mirrors.aliyun.com/pypi/packages/b8/8b/31273bf66016be6ad22bb7345c37ff350276cfd46e389a0c2ac5da9d9073/wheel-0.41.2-py3-none-any.whl (64 kB)
Installing collected packages: wheel
Successfully installed wheel-0.41.2
(venv) (python3.9.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$
Is the gcc version too new?
(venv) (python3.9.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ sudo apt install gcc-9
(venv) (python3.9.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ export CC=/usr/bin/gcc-9
(venv) (python3.9.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ pip install numpy==1.16.1
(venv) (python3.9.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ export CC=/usr/bin/gcc
(venv) (python3.9.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ pip install -v numpy==1.26.0
In the end, the real problem was that the numpy version I had chosen was too old!
ERROR: Ignored the following versions that require a different python version: 0.52.0 Requires-Python >=3.6,<3.9; 0.52.0rc3 Requires-Python >=3.6,<3.9; 9.1.0 Requires-Python >=3.10
ERROR: Could not find a version that satisfies the requirement torch==1.6.0 (from versions: 1.7.1, 1.8.0, 1.8.1, 1.9.0, 1.9.1, 1.10.0, 1.10.1, 1.10.2, 1.11.0, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 2.0.0, 2.0.1)
ERROR: No matching distribution found for torch==1.6.0
(venv) (python3.9.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ pip install -v torch==1.7.1
Using pip 23.2.1 from /home/yeqiang/Downloads/ai/M4Singer/code/venv/lib/python3.9/site-packages/pip (python 3.9)
Looking in indexes: http://mirrors.aliyun.com/pypi/simple/
Collecting torch==1.7.1
Downloading http://mirrors.aliyun.com/pypi/packages/41/f4/4da4f26a04d93851e481e76ec17fed0d152a1691e8f1142ad763c9f07997/torch-1.7.1-cp39-cp39-manylinux1_x86_64.whl (776.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 776.8/776.8 MB 940.9 kB/s eta 0:00:00
Collecting typing-extensions (from torch==1.7.1)
Downloading http://mirrors.aliyun.com/pypi/packages/24/21/7d397a4b7934ff4028987914ac1044d3b7d52712f30e2ac7a2ae5bc86dd0/typing_extensions-4.8.0-py3-none-any.whl (31 kB)
Requirement already satisfied: numpy in ./venv/lib/python3.9/site-packages (from torch==1.7.1) (1.26.0)
Installing collected packages: typing-extensions, torch
changing mode of /home/yeqiang/Downloads/ai/M4Singer/code/venv/bin/convert-caffe2-to-onnx to 775
changing mode of /home/yeqiang/Downloads/ai/M4Singer/code/venv/bin/convert-onnx-to-caffe2 to 775
Successfully installed torch-1.7.1 typing-extensions-4.8.0
(venv) (python3.9.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$
Update requirements_2080.txt to match: torch==1.7.1
ERROR: Ignored the following versions that require a different python version: 0.52.0 Requires-Python >=3.6,<3.9; 0.52.0rc3 Requires-Python >=3.6,<3.9; 9.1.0 Requires-Python >=3.10
ERROR: Could not find a version that satisfies the requirement torchaudio==0.6.0 (from versions: 0.7.2, 0.8.0, 0.8.1, 0.9.0, 0.9.1, 0.10.0, 0.10.1, 0.10.2, 0.11.0, 0.12.0, 0.12.1, 0.13.0, 0.13.1, 2.0.0, 2.0.1, 2.0.2)
ERROR: No matching distribution found for torchaudio==0.6.0
Update requirements_2080.txt to match: torchaudio==0.7.2
ERROR: Ignored the following versions that require a different python version: 0.52.0 Requires-Python >=3.6,<3.9; 0.52.0rc3 Requires-Python >=3.6,<3.9; 9.1.0 Requires-Python >=3.10
ERROR: Could not find a version that satisfies the requirement torchvision==0.7.0 (from versions: 0.1.6, 0.1.7, 0.1.8, 0.1.9, 0.2.0, 0.2.1, 0.2.2, 0.2.2.post2, 0.2.2.post3, 0.8.2, 0.9.0, 0.9.1, 0.10.0, 0.10.1, 0.11.0, 0.11.1, 0.11.2, 0.11.3, 0.12.0, 0.13.0, 0.13.1, 0.14.0, 0.14.1, 0.15.0, 0.15.1, 0.15.2)
ERROR: No matching distribution found for torchvision==0.7.0
Update requirements_2080.txt to match: torchvision==0.8.2
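The three edits above follow one pattern: the pinned release has no build for this Python, so take the oldest release that pip lists as available and rewrite the pin. The sketch below shows that pattern under simple assumptions: versions are compared by their leading numeric parts only, and `version_key`, `pick_replacement`, and `set_pin` are hypothetical helper names, not pip or M4Singer APIs.

```python
import re


def version_key(version):
    """Leading numeric parts of a version, e.g. '0.2.2.post2' -> (0, 2, 2)."""
    key = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        key.append(int(piece))
    return tuple(key)


def pick_replacement(requested, available):
    """Return the oldest available version that is not older than `requested`."""
    wanted = version_key(requested)
    candidates = [v for v in available if version_key(v) >= wanted]
    return min(candidates, key=version_key) if candidates else None


def set_pin(requirements_text, package, version):
    """Rewrite the exact `package==<old>` line to `package==<version>`."""
    pattern = re.compile(r"^%s==\S+[ \t]*$" % re.escape(package), re.MULTILINE)
    new_line = "%s==%s" % (package, version)
    new_text, count = pattern.subn(lambda match: new_line, requirements_text)
    if count == 0:
        raise ValueError("no exact pin found for %s" % package)
    return new_text


if __name__ == "__main__":
    # Shortened version lists copied from the pip errors in this log.
    torch_available = ["1.7.1", "1.8.0", "1.8.1", "1.9.0"]
    sample = "numpy==1.19.4\ntorch==1.6.0\ntorchaudio==0.6.0\n"  # illustrative excerpt
    chosen = pick_replacement("1.6.0", torch_available)
    print(set_pin(sample, "torch", chosen))
```

Picking the oldest available release is only a heuristic. torch, torchaudio, and torchvision releases are built against each other, so the three pins still have to be changed as a matching set, as was done here with 1.7.1, 0.7.2, and 0.8.2.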
Successfully built alignment audioread blinker Distance et-xmlfile future ipdb jieba librosa miditoolkit music21 nltk numba olefile praat-parselmouth pycwt PyInstaller python-Levenshtein pytorch-lightning pyworld PyYAML resampy scikit-image stopit typing uuid webrtcvad pretty-midi
Failed to build llvmlite scikit-learn
ERROR: Could not build wheels for llvmlite, scikit-learn, which is required to install pyproject.toml-based projects
cwd: /tmp/pip-install-5mct2iow/scikit-learn_c819d945f24a40bfbe3a6c7c94f9f28f/
Building wheel for scikit-learn (setup.py) ... error
ERROR: Failed building wheel for scikit-learn
Running setup.py clean for scikit-learn
Running command python setup.py clean
Partial import of sklearn during the build process.
/tmp/pip-install-5mct2iow/scikit-learn_c819d945f24a40bfbe3a6c7c94f9f28f/setup.py:123: DeprecationWarning:
`numpy.distutils` is deprecated since NumPy 1.23.0, as a result
of the deprecation of `distutils` itself. It will be removed for
Python >= 3.12. For older Python versions it will remain present.
It is recommended to use `setuptools < 60.0` for those Python versions.
For more details, see:
https://numpy.org/devdocs/reference/distutils_status_migration.html
3.8.18
Default compiler: gcc-11
(venv) (python3.9.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ git checkout requirements_2080.txt
Updated 1 path from the index
(venv) (python3.9.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ deactivate
(python3.9.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ conda deactivate
(base) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$
(base) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$
(base) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ conda activate python3.8.18
(python3.8.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ python3 -m venv venv3818
(python3.8.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ source venv3818/bin/activate
(venv3818) (python3.8.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ pip install --upgrade pip
Looking in indexes: http://mirrors.aliyun.com/pypi/simple/
Requirement already satisfied: pip in ./venv3818/lib/python3.8/site-packages (23.0.1)
Collecting pip
Downloading http://mirrors.aliyun.com/pypi/packages/50/c2/e06851e8cc28dcad7c155f4753da8833ac06a5c704c109313b8d5a62968a/pip-23.2.1-py3-none-any.whl (2.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 2.6 MB/s eta 0:00:00
Installing collected packages: pip
Attempting uninstall: pip
Found existing installation: pip 23.0.1
Uninstalling pip-23.0.1:
Successfully uninstalled pip-23.0.1
Successfully installed pip-23.2.1
(venv3818) (python3.8.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$
(venv3818) (python3.8.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ pip install -v -r requirements_2080.txt
#############
AttributeError: 'dict' object has no attribute '__NUMPY_SETUP__'
error: subprocess-exited-with-error
× Preparing metadata (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
full command: /home/yeqiang/Downloads/ai/M4Singer/code/venv3818/bin/python3 /home/yeqiang/Downloads/ai/M4Singer/code/venv3818/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py prepare_metadata_for_build_wheel /tmp/tmp2aqbozzx
cwd: /tmp/pip-install-nque4yqb/pyworld_8b8bd8684bf6448a8623dd78a29a9e66
Preparing metadata (pyproject.toml) ... error
(venv3818) (python3.8.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ pip install numpy==1.24.4 # did not help!
(venv3818) (python3.8.18) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ pip install --upgrade setuptools # did not help!
Looking in indexes: http://mirrors.aliyun.com/pypi/simple/
Requirement already satisfied: setuptools in ./venv3818/lib/python3.8/site-packages (56.0.0)
Collecting setuptools
Using cached http://mirrors.aliyun.com/pypi/packages/bb/26/7945080113158354380a12ce26873dd6c1ebd88d47f5bc24e2c5bb38c16a/setuptools-68.2.2-py3-none-any.whl (807 kB)
Installing collected packages: setuptools
Attempting uninstall: setuptools
Found existing installation: setuptools 56.0.0
Uninstalling setuptools-56.0.0:
Successfully uninstalled setuptools-56.0.0
Successfully installed setuptools-68.2.2
Update requirements_2080.txt to match: numpy==1.24.4 # did not help!
Installing collected packages: numpy
Successfully installed numpy-1.24.4
Installing backend dependencies ... done
Running command Preparing metadata (pyproject.toml)
running dist_info
creating /tmp/pip-modern-metadata-1tbxqd33/pyworld.egg-info
writing /tmp/pip-modern-metadata-1tbxqd33/pyworld.egg-info/PKG-INFO
writing dependency_links to /tmp/pip-modern-metadata-1tbxqd33/pyworld.egg-info/dependency_links.txt
writing requirements to /tmp/pip-modern-metadata-1tbxqd33/pyworld.egg-info/requires.txt
writing top-level names to /tmp/pip-modern-metadata-1tbxqd33/pyworld.egg-info/top_level.txt
writing manifest file '/tmp/pip-modern-metadata-1tbxqd33/pyworld.egg-info/SOURCES.txt'
/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/dist.py:498: SetuptoolsDeprecationWarning: Invalid dash-separated options
!!
********************************************************************************
Usage of dash-separated 'description-file' will not be supported in future
versions. Please use the underscore name 'description_file' instead.
See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details.
********************************************************************************
!!
opt = self.warn_dash_deprecation(opt, section)
Traceback (most recent call last):
File "/home/yeqiang/Downloads/ai/M4Singer/code/venv3818/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
main()
File "/home/yeqiang/Downloads/ai/M4Singer/code/venv3818/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "/home/yeqiang/Downloads/ai/M4Singer/code/venv3818/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 149, in prepare_metadata_for_build_wheel
return hook(metadata_directory, config_settings)
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 396, in prepare_metadata_for_build_wheel
self.run_setup()
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 507, in run_setup
super(_BuildMetaLegacyBackend, self).run_setup(setup_script=setup_script)
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 341, in run_setup
exec(code, locals())
File "<string>", line 43, in <module>
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/__init__.py", line 103, in setup
return distutils.core.setup(**attrs)
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 989, in run_command
super().run_command(command)
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/command/dist_info.py", line 107, in run
self.egg_info.run()
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/command/egg_info.py", line 318, in run
self.find_sources()
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/command/egg_info.py", line 326, in find_sources
mm.run()
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/command/egg_info.py", line 548, in run
self.add_defaults()
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/command/egg_info.py", line 586, in add_defaults
sdist.add_defaults(self)
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/command/sdist.py", line 113, in add_defaults
super().add_defaults()
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/sdist.py", line 251, in add_defaults
self._add_defaults_ext()
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/sdist.py", line 335, in _add_defaults_ext
build_ext = self.get_finalized_command('build_ext')
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 305, in get_finalized_command
cmd_obj.ensure_finalized()
File "/tmp/pip-build-env-w4btlj_8/overlay/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 111, in ensure_finalized
self.finalize_options()
File "<string>", line 29, in finalize_options
AttributeError: 'dict' object has no attribute '__NUMPY_SETUP__'
error: subprocess-exited-with-error
###############
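The last frame of the traceback fails inside `finalize_options` in the package's setup.py while touching `__NUMPY_SETUP__`. That is consistent with a pattern found in many older setup.py files, `__builtins__.__NUMPY_SETUP__ = False`, which only works when `__builtins__` is the module rather than a plain dict. Whether pyworld 0.2.12 contains exactly this line is an assumption; the log does not show the source. The sketch below reproduces only the mechanism, with `run_snippet` as a hypothetical helper.

```python
import builtins

# Assumed to resemble the failing line in setup.py; not copied from pyworld.
SNIPPET = "__builtins__.__NUMPY_SETUP__ = False"


def run_snippet(namespace):
    """Execute SNIPPET with `namespace` as globals; return 'ok' or the error text."""
    try:
        exec(SNIPPET, namespace)
    except AttributeError as exc:
        return str(exc)
    finally:
        # Undo the side effect on the real builtins module, if there was one.
        if hasattr(builtins, "__NUMPY_SETUP__"):
            del builtins.__NUMPY_SETUP__
    return "ok"


if __name__ == "__main__":
    # exec() inserts the builtins *dict* when the namespace has no __builtins__ key,
    # so attribute assignment fails, as in the traceback.
    print(run_snippet({}))
    # With the builtins *module* supplied explicitly, the assignment succeeds.
    print(run_snippet({"__builtins__": builtins}))
```

This would explain why changing the numpy pin or upgrading setuptools made no difference: the failure depends on how the build backend executes setup.py, not on which numpy is installed.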
3.11.5
(base) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ conda activate python3.11.5
(python3.11.5) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ python3 -m venv venv3115
(python3.11.5) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ source venv3115/bin/activate
(venv3115) (python3.11.5) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ git checkout requirements_2080.txt
Updated 1 path from the index
(venv3115) (python3.11.5) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ pip install -v -r requirements_2080.txt
× Building wheel for numpy (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [936 lines of output]
setup.py:67: RuntimeWarning: NumPy 1.19.3 may not yet support Python 3.11.
warnings.warn(
(venv3115) (python3.11.5) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ pip install -v numpy==1.26.0
Using pip 23.2.1 from /home/yeqiang/Downloads/ai/M4Singer/code/venv3115/lib/python3.11/site-packages/pip (python 3.11)
Looking in indexes: http://mirrors.aliyun.com/pypi/simple/
Link requires a different Python (3.11.5 not in: '>=3.7,<3.11'): http://mirrors.aliyun.com/pypi/packages/3a/be/650f9c091ef71cb01d735775d554e068752d3ff63d7943b26316dc401749/numpy-1.21.2.zip#sha256=423216d8afc5923b15df86037c6053bf030d15cc9e3224206ef868c2d63dd6dc (from http://mirrors.aliyun.com/pypi/simple/numpy/) (requires-python:>=3.7,<3.11)
Link requires a different Python (3.11.5 not in: '>=3.7,<3.11'): http://mirrors.aliyun.com/pypi/packages/5f/d6/ad58ded26556eaeaa8c971e08b6466f17c4ac4d786cd3d800e26ce59cc01/numpy-1.21.3.zip#sha256=63571bb7897a584ca3249c86dd01c10bcb5fe4296e3568b2e9c1a55356b6410e (from http://mirrors.aliyun.com/pypi/simple/numpy/) (requires-python:>=3.7,<3.11)
Link requires a different Python (3.11.5 not in: '>=3.7,<3.11'): http://mirrors.aliyun.com/pypi/packages/fb/48/b0708ebd7718a8933f0d3937513ef8ef2f4f04529f1f66ca86d873043921/numpy-1.21.4.zip#sha256=e6c76a87633aa3fa16614b61ccedfae45b91df2767cf097aa9c933932a7ed1e0 (from http://mirrors.aliyun.com/pypi/simple/numpy/) (requires-python:>=3.7,<3.11)
Link requires a different Python (3.11.5 not in: '>=3.7,<3.11'): http://mirrors.aliyun.com/pypi/packages/c2/a8/a924a09492bdfee8c2ec3094d0a13f2799800b4fdc9c890738aeeb12c72e/numpy-1.21.5.zip#sha256=6a5928bc6241264dce5ed509e66f33676fc97f464e7a919edc672fb5532221ee (from http://mirrors.aliyun.com/pypi/simple/numpy/) (requires-python:>=3.7,<3.11)
Link requires a different Python (3.11.5 not in: '>=3.7,<3.11'): http://mirrors.aliyun.com/pypi/packages/45/b7/de7b8e67f2232c26af57c205aaad29fe17754f793404f59c8a730c7a191a/numpy-1.21.6.zip#sha256=ecb55251139706669fdec2ff073c98ef8e9a84473e51e716211b41aa0f18e656 (from http://mirrors.aliyun.com/pypi/simple/numpy/) (requires-python:>=3.7,<3.11)
Collecting numpy==1.26.0
Downloading http://mirrors.aliyun.com/pypi/packages/c4/36/161e2f8110f8c49e59f6107bd6da4257d30aff9f06373d0471811f73dcc5/numpy-1.26.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.2/18.2 MB 5.1 MB/s eta 0:00:00
Installing collected packages: numpy
changing mode of /home/yeqiang/Downloads/ai/M4Singer/code/venv3115/bin/f2py to 775
Successfully installed numpy-1.26.0
Update requirements_2080.txt to match: numpy==1.26.0
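The "Link requires a different Python" and "Requires-Python" messages in these logs come from pip comparing the interpreter version with the range each release declares. The following sketch shows that comparison for the simple comma-separated ranges seen here. It is a simplified stand-in, not pip's implementation: it handles only `>=`, `<=`, `==`, `>`, `<` without wildcards, and `satisfies` is a hypothetical name.

```python
import operator
import re

_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}
_CLAUSE = re.compile(r"^(>=|<=|==|>|<)\s*(\d+(?:\.\d+)*)$")


def _pad(parts, width):
    """Right-pad a list of ints with zeros so 3.11 compares as 3.11.0."""
    return tuple(parts) + (0,) * (width - len(parts))


def satisfies(python_version, requires_python):
    """Return True when `python_version` meets every clause of `requires_python`."""
    have = [int(p) for p in python_version.split(".")]
    for clause in requires_python.split(","):
        match = _CLAUSE.match(clause.strip())
        if match is None:
            raise ValueError("unsupported clause: %r" % clause)
        op, bound_text = match.groups()
        bound = [int(p) for p in bound_text.split(".")]
        width = max(len(have), len(bound))
        if not _OPS[op](_pad(have, width), _pad(bound, width)):
            return False
    return True


if __name__ == "__main__":
    # Ranges copied from the pip output in this log.
    for version in ("3.7.12", "3.9.18", "3.11.5"):
        print(version, satisfies(version, ">=3.6,<3.9"), satisfies(version, ">=3.7,<3.11"))
```

Of the interpreters tried here, only 3.7.12 falls inside both ranges, which matches the observation that it was the one version where the unmodified requirements file installed.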
3.7.12
(venv3712) (python3.7.12) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$
+ pip install -v -r requirements_2080.txt
Non-user install because user site-packages disabled
Created temporary directory: /tmp/pip-ephem-wheel-cache-lnvdxczj
Created temporary directory: /tmp/pip-req-tracker-brz5dnty
Initialized build tracking at /tmp/pip-req-tracker-brz5dnty
Created build tracker: /tmp/pip-req-tracker-brz5dnty
Entered build tracker: /tmp/pip-req-tracker-brz5dnty
Created temporary directory: /tmp/pip-install-s7un1lfq
Requirement already satisfied: absl-py==0.11.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 1)) (0.11.0)
Requirement already satisfied: alignment==1.0.10 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 2)) (1.0.10)
Requirement already satisfied: altgraph==0.17 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 3)) (0.17)
Requirement already satisfied: appdirs==1.4.4 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 4)) (1.4.4)
Requirement already satisfied: async-timeout==3.0.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 5)) (3.0.1)
Requirement already satisfied: audioread==2.1.9 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 6)) (2.1.9)
Requirement already satisfied: backcall==0.2.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 7)) (0.2.0)
Requirement already satisfied: blinker==1.4 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 8)) (1.4)
Requirement already satisfied: brotlipy==0.7.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 9)) (0.7.0)
Requirement already satisfied: cachetools==4.2.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 10)) (4.2.0)
Requirement already satisfied: certifi==2020.12.5 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 11)) (2020.12.5)
Requirement already satisfied: cffi==1.14.4 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 12)) (1.14.4)
Requirement already satisfied: chardet==4.0.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 13)) (4.0.0)
Requirement already satisfied: click==7.1.2 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 14)) (7.1.2)
Requirement already satisfied: cycler==0.10.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 15)) (0.10.0)
Requirement already satisfied: Cython==0.29.21 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 16)) (0.29.21)
Requirement already satisfied: cytoolz==0.11.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 17)) (0.11.0)
Requirement already satisfied: decorator==4.4.2 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 18)) (4.4.2)
Requirement already satisfied: Distance==0.1.3 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 19)) (0.1.3)
Requirement already satisfied: einops==0.3.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 20)) (0.3.0)
Requirement already satisfied: et-xmlfile==1.0.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 21)) (1.0.1)
Requirement already satisfied: fsspec==0.8.4 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 22)) (0.8.4)
Requirement already satisfied: future==0.18.2 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 23)) (0.18.2)
Requirement already satisfied: g2p-en==2.1.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 24)) (2.1.0)
Requirement already satisfied: g2pM==0.1.2.5 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 25)) (0.1.2.5)
Requirement already satisfied: google-auth==1.24.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 26)) (1.24.0)
Requirement already satisfied: google-auth-oauthlib==0.4.2 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 27)) (0.4.2)
Requirement already satisfied: grpcio==1.34.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 28)) (1.34.0)
Requirement already satisfied: h5py==3.1.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 29)) (3.1.0)
Requirement already satisfied: horology==1.1.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 30)) (1.1.0)
Requirement already satisfied: httplib2==0.18.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 31)) (0.18.1)
Requirement already satisfied: idna==2.10 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 32)) (2.10)
Requirement already satisfied: imageio==2.9.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 33)) (2.9.0)
Requirement already satisfied: inflect==5.0.2 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 34)) (5.0.2)
Requirement already satisfied: ipdb==0.13.4 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 35)) (0.13.4)
Requirement already satisfied: ipython==7.19.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 36)) (7.19.0)
Requirement already satisfied: ipython-genutils==0.2.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 37)) (0.2.0)
Requirement already satisfied: jdcal==1.4.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 38)) (1.4.1)
Requirement already satisfied: jedi==0.17.2 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 39)) (0.17.2)
Requirement already satisfied: jieba==0.42.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 40)) (0.42.1)
Requirement already satisfied: jiwer==2.2.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 41)) (2.2.0)
Requirement already satisfied: joblib==1.0.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 42)) (1.0.0)
Requirement already satisfied: kiwisolver==1.3.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 43)) (1.3.1)
Requirement already satisfied: librosa==0.8.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 44)) (0.8.0)
Requirement already satisfied: llvmlite==0.31.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 45)) (0.31.0)
Requirement already satisfied: Markdown==3.3.3 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 46)) (3.3.3)
Requirement already satisfied: matplotlib==3.3.3 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 47)) (3.3.3)
Requirement already satisfied: miditoolkit==0.1.7 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 48)) (0.1.7)
Requirement already satisfied: mido==1.2.9 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 49)) (1.2.9)
Requirement already satisfied: music21==5.7.2 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 50)) (5.7.2)
Requirement already satisfied: networkx==2.5 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 51)) (2.5)
Requirement already satisfied: nltk==3.5 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 52)) (3.5)
Requirement already satisfied: numba==0.48.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 53)) (0.48.0)
Requirement already satisfied: numpy==1.19.4 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 54)) (1.19.4)
Requirement already satisfied: oauth2client==4.1.3 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 55)) (4.1.3)
Requirement already satisfied: oauthlib==3.1.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 56)) (3.1.0)
Requirement already satisfied: olefile==0.46 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 57)) (0.46)
Requirement already satisfied: packaging==20.7 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 58)) (20.7)
Requirement already satisfied: pandas==1.2.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 59)) (1.2.0)
Requirement already satisfied: parso==0.7.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 60)) (0.7.1)
Requirement already satisfied: patsy==0.5.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 61)) (0.5.1)
Requirement already satisfied: pexpect==4.8.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 62)) (4.8.0)
Requirement already satisfied: pickleshare==0.7.5 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 63)) (0.7.5)
Requirement already satisfied: Pillow==8.0.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 64)) (8.0.1)
Requirement already satisfied: pooch==1.3.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 65)) (1.3.0)
Requirement already satisfied: praat-parselmouth==0.3.3 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 66)) (0.3.3)
Requirement already satisfied: prompt-toolkit==3.0.8 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 67)) (3.0.8)
Requirement already satisfied: protobuf==3.13.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 68)) (3.13.0)
Requirement already satisfied: ptyprocess==0.6.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 69)) (0.6.0)
Requirement already satisfied: pyasn1==0.4.8 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 70)) (0.4.8)
Requirement already satisfied: pyasn1-modules==0.2.8 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 71)) (0.2.8)
Requirement already satisfied: pycparser==2.20 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 72)) (2.20)
Requirement already satisfied: pycwt==0.3.0a22 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 73)) (0.3.0a22)
Requirement already satisfied: Pygments==2.7.3 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 74)) (2.7.3)
Requirement already satisfied: PyInstaller==3.6 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 75)) (3.6)
Requirement already satisfied: PyJWT==1.7.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 76)) (1.7.1)
Requirement already satisfied: pyloudnorm==0.1.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 77)) (0.1.0)
Requirement already satisfied: pyparsing==2.4.7 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 78)) (2.4.7)
Requirement already satisfied: pypinyin==0.39.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 79)) (0.39.0)
Requirement already satisfied: PySocks==1.7.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 80)) (1.7.1)
Requirement already satisfied: python-dateutil==2.8.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 81)) (2.8.1)
Requirement already satisfied: python-Levenshtein==0.12.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 82)) (0.12.0)
Requirement already satisfied: pytorch-lightning==0.7.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 83)) (0.7.1)
Requirement already satisfied: pytz==2020.5 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 84)) (2020.5)
Requirement already satisfied: PyWavelets==1.1.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 85)) (1.1.1)
Requirement already satisfied: pyworld==0.2.12 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 86)) (0.2.12)
Requirement already satisfied: PyYAML==5.3.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 87)) (5.3.1)
Requirement already satisfied: regex==2020.11.13 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 88)) (2020.11.13)
Requirement already satisfied: requests==2.25.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 89)) (2.25.1)
Requirement already satisfied: requests-oauthlib==1.3.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 90)) (1.3.0)
Requirement already satisfied: resampy==0.2.2 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 91)) (0.2.2)
Requirement already satisfied: Resemblyzer==0.1.1.dev0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 92)) (0.1.1.dev0)
Requirement already satisfied: rsa==4.6 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 93)) (4.6)
Requirement already satisfied: scikit-image==0.16.2 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 94)) (0.16.2)
Requirement already satisfied: scikit-learn==0.22.2.post1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 95)) (0.22.2.post1)
Requirement already satisfied: scipy==1.5.4 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 96)) (1.5.4)
Requirement already satisfied: six==1.15.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 97)) (1.15.0)
Requirement already satisfied: SoundFile==0.10.3.post1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 98)) (0.10.3.post1)
Requirement already satisfied: stopit==1.1.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 99)) (1.1.1)
Requirement already satisfied: tensorboard==2.4.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 100)) (2.4.0)
Requirement already satisfied: tensorboard-plugin-wit==1.7.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 101)) (1.7.0)
Requirement already satisfied: tensorboardX==2.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 102)) (2.1)
Requirement already satisfied: TextGrid==1.5 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 103)) (1.5)
Requirement already satisfied: threadpoolctl==2.1.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 104)) (2.1.0)
Requirement already satisfied: toolz==0.11.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 105)) (0.11.1)
Requirement already satisfied: torch==1.6.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 106)) (1.6.0)
Requirement already satisfied: torchaudio==0.6.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 107)) (0.6.0)
Requirement already satisfied: torchvision==0.7.0 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 108)) (0.7.0)
Requirement already satisfied: tqdm==4.54.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 109)) (4.54.1)
Requirement already satisfied: traitlets==5.0.5 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 110)) (5.0.5)
Requirement already satisfied: typing==3.7.4.3 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 111)) (3.7.4.3)
Requirement already satisfied: urllib3==1.26.2 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 112)) (1.26.2)
Requirement already satisfied: uuid==1.30 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 113)) (1.30)
Requirement already satisfied: wcwidth==0.2.5 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 114)) (0.2.5)
Requirement already satisfied: webencodings==0.5.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 115)) (0.5.1)
Requirement already satisfied: webrtcvad==2.0.10 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 116)) (2.0.10)
Requirement already satisfied: Werkzeug==1.0.1 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 117)) (1.0.1)
Requirement already satisfied: pretty-midi==0.2.9 in ./venv3712/lib/python3.7/site-packages (from -r requirements_2080.txt (line 118)) (0.2.9)
Requirement already satisfied: setuptools>=40.3.0 in ./venv3712/lib/python3.7/site-packages (from google-auth==1.24.0->-r requirements_2080.txt (line 26)) (47.1.0)
Requirement already satisfied: cached-property; python_version < "3.8" in ./venv3712/lib/python3.7/site-packages (from h5py==3.1.0->-r requirements_2080.txt (line 29)) (1.5.2)
Requirement already satisfied: importlib-metadata; python_version < "3.8" in ./venv3712/lib/python3.7/site-packages (from Markdown==3.3.3->-r requirements_2080.txt (line 46)) (6.7.0)
Requirement already satisfied: wheel>=0.26; python_version >= "3" in ./venv3712/lib/python3.7/site-packages (from tensorboard==2.4.0->-r requirements_2080.txt (line 100)) (0.41.2)
Requirement already satisfied: zipp>=0.5 in ./venv3712/lib/python3.7/site-packages (from importlib-metadata; python_version < "3.8"->Markdown==3.3.3->-r requirements_2080.txt (line 46)) (3.15.0)
Requirement already satisfied: typing-extensions>=3.6.4; python_version < "3.8" in ./venv3712/lib/python3.7/site-packages (from importlib-metadata; python_version < "3.8"->Markdown==3.3.3->-r requirements_2080.txt (line 46)) (4.7.1)
WARNING: You are using pip version 20.1.1; however, version 23.2.1 is available.
You should consider upgrading via the '/home/yeqiang/Downloads/ai/M4Singer/code/venv3712/bin/python3 -m pip install --upgrade pip' command.
Removed build tracker: '/tmp/pip-req-tracker-brz5dnty'
Test run
(venv3712) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ find | grep binarize.py
./data_gen/singing/binarize.py
./data_gen/tts/bin/binarize.py
(venv3712) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ export PYTHONPATH=.
CUDA_VISIBLE_DEVICES=0 python data_gen/tts/bin/binarize.py --config usr/configs/m4singer/base.yaml
| Hparams chains: ['configs/config_base.yaml', 'configs/tts/base.yaml', 'configs/tts/fs2.yaml', 'configs/tts/base_zh.yaml', 'configs/singing/base.yaml', 'usr/configs/base.yaml', 'usr/configs/popcs_ds_beta6.yaml', 'usr/configs/m4singer/base.yaml']
| Hparams:
K_step: 51, accumulate_grad_batches: 1, audio_num_mel_bins: 80, audio_sample_rate: 24000, base_config: ['usr/configs/popcs_ds_beta6.yaml'],
binarization_args: {'shuffle': False, 'with_txt': True, 'with_wav': False, 'with_align': True, 'with_spk_embed': True, 'with_f0': True, 'with_f0cwt': True}, binarizer_cls: data_gen.singing.binarize.M4SingerBinarizer, binary_data_dir: data/binary/m4singer, check_val_every_n_epoch: 10, clip_grad_norm: 1,
content_cond_steps: [], cwt_add_f0_loss: False, cwt_hidden_size: 128, cwt_layers: 2, cwt_loss: l1,
cwt_std_scale: 0.8, datasets: ['m4singer'], debug: False, dec_ffn_kernel_size: 9, dec_layers: 4,
decay_steps: 50000, decoder_type: fft, dict_dir: , diff_decoder_type: wavenet, diff_loss_type: l1,
dilation_cycle_length: 1, dropout: 0.1, ds_workers: 4, dur_enc_hidden_stride_kernel: ['0,2,3', '0,2,3', '0,1,3'], dur_loss: mse,
dur_predictor_kernel: 3, dur_predictor_layers: 5, enc_ffn_kernel_size: 9, enc_layers: 4, encoder_K: 8,
encoder_type: fft, endless_ds: True, ffn_act: gelu, ffn_padding: SAME, fft_size: 512,
fmax: 12000, fmin: 30, fs2_ckpt: , gen_dir_name: , gen_tgt_spk_id: -1,
hidden_size: 256, hop_size: 128, infer: False, keep_bins: 80, lambda_commit: 0.25,
lambda_energy: 0.0, lambda_f0: 1.0, lambda_ph_dur: 1.0, lambda_sent_dur: 1.0, lambda_uv: 1.0,
lambda_word_dur: 1.0, load_ckpt: , log_interval: 100, loud_norm: False, lr: 0.001,
max_beta: 0.06, max_epochs: 1000, max_eval_sentences: 1, max_eval_tokens: 60000, max_frames: 5000,
max_input_tokens: 1550, max_sentences: 12, max_tokens: 40000, max_updates: 160000, mel_loss: ssim:0.5|l1:0.5,
mel_vmax: 1.5, mel_vmin: -6.0, min_level_db: -120, norm_type: gn, num_ckpt_keep: 3,
num_heads: 2, num_sanity_val_steps: 1, num_spk: 20, num_test_samples: 0, num_valid_plots: 10,
optimizer_adam_beta1: 0.9, optimizer_adam_beta2: 0.98, out_wav_norm: False, pe_ckpt: checkpoints/m4singer_pe, pe_enable: True,
pitch_ar: False, pitch_enc_hidden_stride_kernel: ['0,2,5', '0,2,5', '0,2,5'], pitch_extractor: parselmouth, pitch_loss: l1, pitch_norm: log,
pitch_type: frame, pre_align_args: {'use_tone': False, 'forced_align': 'mfa', 'use_sox': True, 'txt_processor': 'zh_g2pM', 'allow_no_txt': False, 'denoise': False}, pre_align_cls: data_gen.singing.pre_align.SingingPreAlign, predictor_dropout: 0.5, predictor_grad: 0.1,
predictor_hidden: -1, predictor_kernel: 5, predictor_layers: 5, prenet_dropout: 0.5, prenet_hidden_size: 256,
pretrain_fs_ckpt: , processed_data_dir: xxx, profile_infer: False, raw_data_dir: data/raw/m4singer, ref_norm_layer: bn,
rel_pos: True, reset_phone_dict: True, residual_channels: 256, residual_layers: 20, save_best: False,
save_ckpt: True, save_codes: ['configs', 'modules', 'tasks', 'utils', 'usr'], save_f0: True, save_gt: True, schedule_type: linear,
seed: 1234, sort_by_len: True, spec_max: [-0.3894500136375427, -0.3796464204788208, -0.2914905250072479, -0.15550297498703003, -0.08502643555402756, 0.10698417574167252, -0.0739326998591423, -0.0541548952460289, 0.15501998364925385, 0.06483431905508041, 0.03054228238761425, -0.013737732544541359, -0.004876468330621719, 0.04368264228105545, 0.13329921662807465, 0.16471388936042786, 0.04605761915445328, -0.05680707097053528, 0.0542571023106575, -0.0076539707370102406, -0.00953489076346159, -0.04434828832745552, 0.001293870504014194, -0.12238839268684387, 0.06418416649103165, 0.02843189612030983, 0.08505241572856903, 0.07062800228595734, 0.00120724702719599, -0.07675088942050934, 0.03785804659128189, 0.04890783503651619, -0.06888376921415329, -0.0839693546295166, -0.17545585334300995, -0.2911079525947571, -0.4238220453262329, -0.262084037065506, -0.3002263605594635, -0.3845032751560211, -0.3906497061252594, -0.6550108790397644, -0.7810799479484558, -0.7503029704093933, -0.7995198965072632, -0.8092347383499146, -0.6196113228797913, -0.6684317588806152, -0.7735874056816101, -0.8324533104896545, -0.9601566791534424, -0.955253541469574, -0.748817503452301, -0.9106167554855347, -0.9707801342010498, -1.053107500076294, -1.0448424816131592, -1.1082794666290283, -1.1296544075012207, -1.071642279624939, -1.1003081798553467, -1.166810154914856, -1.1408926248550415, -1.1330615282058716, -1.1167492866516113, -1.0716774463653564, -1.035891056060791, -1.0092483758926392, -0.9675999879837036, -0.938962996006012, -1.0120564699172974, -0.9777995347976685, -1.029313564300537, -0.9459163546562195, -0.8519706130027771, -0.7751091122627258, -0.7933766841888428, -0.9019735455513, -0.9983296990394592, -1.505873441696167], spec_min: [-6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, 
-6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0], spk_cond_steps: [],
stop_token_weight: 5.0, task_cls: usr.diffsinger_task.DiffSingerTask, test_ids: [], test_input_dir: , test_num: 0,
test_prefixes: ['Alto-2#岁月神偷', 'Alto-2#奇妙能力歌', 'Tenor-1#一千年以后', 'Tenor-1#童话', 'Tenor-2#消愁', 'Tenor-2#一荤一素', 'Soprano-1#念奴娇赤壁怀古', 'Soprano-1#问春'], test_set_name: test, timesteps: 100, train_set_name: train, use_denoise: False,
use_energy_embed: False, use_gt_dur: False, use_gt_f0: False, use_midi: True, use_nsf: True,
use_pitch_embed: True, use_pos_embed: True, use_spk_embed: False, use_spk_id: True, use_split_spk_id: False,
use_uv: True, use_var_enc: False, val_check_interval: 2000, valid_num: 0, valid_set_name: valid,
validate: False, vocoder: vocoders.hifigan.HifiGAN, vocoder_ckpt: checkpoints/m4singer_hifigan, warmup_updates: 2000, wav2spec_eps: 1e-6,
weight_decay: 0, win_size: 512, work_dir: ,
| Binarizer: <class 'data_gen.singing.binarize.M4SingerBinarizer'>
Traceback (most recent call last):
File "data_gen/tts/bin/binarize.py", line 20, in <module>
binarize()
File "data_gen/tts/bin/binarize.py", line 15, in binarize
binarizer_cls().process()
File "/home/yeqiang/Downloads/ai/M4Singer/code/data_gen/singing/binarize.py", line 90, in process
self.load_meta_data()
File "/home/yeqiang/Downloads/ai/M4Singer/code/data_gen/singing/binarize.py", line 304, in load_meta_data
song_items = json.load(open(os.path.join(raw_data_dir, 'meta.json'))) # [list of dict]
FileNotFoundError: [Errno 2] No such file or directory: 'data/raw/m4singer/meta.json'
The dataset has to be downloaded first:
https://drive.google.com/file/d/1xC37E59EWRRFFLdG3aJkVqwtLDgtFNqW/view?usp=share_link
Link source: the M4Singer GitHub page, which says:
a) Download m4singer.zip, then unzip this file into data/raw.
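The traceback above is raised because binarize.py opens data/raw/m4singer/meta.json before the dataset has been unpacked. A minimal pre-flight check can catch this earlier. It assumes only the path shown in the error message; the helper name find_missing_inputs is hypothetical and not part of M4Singer.

```python
from pathlib import Path


def find_missing_inputs(raw_data_dir="data/raw/m4singer", required=("meta.json",)):
    """Return the required files that are absent from raw_data_dir.

    Hypothetical helper, not part of M4Singer. 'meta.json' is the file
    named in the FileNotFoundError above.
    """
    root = Path(raw_data_dir)
    return [str(root / name) for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    missing = find_missing_inputs()
    if missing:
        print("Missing, unzip m4singer.zip into data/raw first:", missing)
    else:
        print("Raw data looks ready for binarize.py")
```

Run it from the same directory as binarize.py, since the default path is relative.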
From the source directory (code/ in this setup), run the command again:
export PYTHONPATH=.
CUDA_VISIBLE_DEVICES=0 python data_gen/tts/bin/binarize.py --config usr/configs/m4singer/base.yaml
history
511 mkdir data/raw -p
512 cd data/raw/
513 unzip ~/Downloads/ai/m4singer.zip
520 source venv3712
521 source venv3712/bin/activate
522 export PYTHONPATH=.
523 CUDA_VISIBLE_DEVICES=0 python data_gen/tts/bin/binarize.py --config usr/configs/m4singer/base.yaml
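History entries 511 to 513 (mkdir -p, cd, unzip) can also be done from Python with the standard zipfile module. This is only a sketch of the same steps; the function name unzip_into is hypothetical, and the archive location is the one from entry 513.

```python
import zipfile
from pathlib import Path


def unzip_into(archive, dest="data/raw"):
    """Create dest if needed (like mkdir -p) and extract archive into it."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest_dir)
        return zf.namelist()


if __name__ == "__main__":
    # Archive location taken from history entry 513.
    archive = Path.home() / "Downloads/ai/m4singer.zip"
    if archive.is_file():
        names = unzip_into(archive)
        print(f"extracted {len(names)} entries into data/raw")
    else:
        print(f"{archive} not found")
```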
(venv3712) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ export PYTHONPATH=.
CUDA_VISIBLE_DEVICES=0 python data_gen/tts/bin/binarize.py --config usr/configs/m4singer/base.yaml
| Hparams chains: ['configs/config_base.yaml', 'configs/tts/base.yaml', 'configs/tts/fs2.yaml', 'configs/tts/base_zh.yaml', 'configs/singing/base.yaml', 'usr/configs/base.yaml', 'usr/configs/popcs_ds_beta6.yaml', 'usr/configs/m4singer/base.yaml']
| Hparams:
K_step: 51, accumulate_grad_batches: 1, audio_num_mel_bins: 80, audio_sample_rate: 24000, base_config: ['usr/configs/popcs_ds_beta6.yaml'],
binarization_args: {'shuffle': False, 'with_txt': True, 'with_wav': False, 'with_align': True, 'with_spk_embed': True, 'with_f0': True, 'with_f0cwt': True}, binarizer_cls: data_gen.singing.binarize.M4SingerBinarizer, binary_data_dir: data/binary/m4singer, check_val_every_n_epoch: 10, clip_grad_norm: 1,
content_cond_steps: [], cwt_add_f0_loss: False, cwt_hidden_size: 128, cwt_layers: 2, cwt_loss: l1,
cwt_std_scale: 0.8, datasets: ['m4singer'], debug: False, dec_ffn_kernel_size: 9, dec_layers: 4,
decay_steps: 50000, decoder_type: fft, dict_dir: , diff_decoder_type: wavenet, diff_loss_type: l1,
dilation_cycle_length: 1, dropout: 0.1, ds_workers: 4, dur_enc_hidden_stride_kernel: ['0,2,3', '0,2,3', '0,1,3'], dur_loss: mse,
dur_predictor_kernel: 3, dur_predictor_layers: 5, enc_ffn_kernel_size: 9, enc_layers: 4, encoder_K: 8,
encoder_type: fft, endless_ds: True, ffn_act: gelu, ffn_padding: SAME, fft_size: 512,
fmax: 12000, fmin: 30, fs2_ckpt: , gen_dir_name: , gen_tgt_spk_id: -1,
hidden_size: 256, hop_size: 128, infer: False, keep_bins: 80, lambda_commit: 0.25,
lambda_energy: 0.0, lambda_f0: 1.0, lambda_ph_dur: 1.0, lambda_sent_dur: 1.0, lambda_uv: 1.0,
lambda_word_dur: 1.0, load_ckpt: , log_interval: 100, loud_norm: False, lr: 0.001,
max_beta: 0.06, max_epochs: 1000, max_eval_sentences: 1, max_eval_tokens: 60000, max_frames: 5000,
max_input_tokens: 1550, max_sentences: 12, max_tokens: 40000, max_updates: 160000, mel_loss: ssim:0.5|l1:0.5,
mel_vmax: 1.5, mel_vmin: -6.0, min_level_db: -120, norm_type: gn, num_ckpt_keep: 3,
num_heads: 2, num_sanity_val_steps: 1, num_spk: 20, num_test_samples: 0, num_valid_plots: 10,
optimizer_adam_beta1: 0.9, optimizer_adam_beta2: 0.98, out_wav_norm: False, pe_ckpt: checkpoints/m4singer_pe, pe_enable: True,
pitch_ar: False, pitch_enc_hidden_stride_kernel: ['0,2,5', '0,2,5', '0,2,5'], pitch_extractor: parselmouth, pitch_loss: l1, pitch_norm: log,
pitch_type: frame, pre_align_args: {'use_tone': False, 'forced_align': 'mfa', 'use_sox': True, 'txt_processor': 'zh_g2pM', 'allow_no_txt': False, 'denoise': False}, pre_align_cls: data_gen.singing.pre_align.SingingPreAlign, predictor_dropout: 0.5, predictor_grad: 0.1,
predictor_hidden: -1, predictor_kernel: 5, predictor_layers: 5, prenet_dropout: 0.5, prenet_hidden_size: 256,
pretrain_fs_ckpt: , processed_data_dir: xxx, profile_infer: False, raw_data_dir: data/raw/m4singer, ref_norm_layer: bn,
rel_pos: True, reset_phone_dict: True, residual_channels: 256, residual_layers: 20, save_best: False,
save_ckpt: True, save_codes: ['configs', 'modules', 'tasks', 'utils', 'usr'], save_f0: True, save_gt: True, schedule_type: linear,
seed: 1234, sort_by_len: True, spec_max: [-0.3894500136375427, -0.3796464204788208, -0.2914905250072479, -0.15550297498703003, -0.08502643555402756, 0.10698417574167252, -0.0739326998591423, -0.0541548952460289, 0.15501998364925385, 0.06483431905508041, 0.03054228238761425, -0.013737732544541359, -0.004876468330621719, 0.04368264228105545, 0.13329921662807465, 0.16471388936042786, 0.04605761915445328, -0.05680707097053528, 0.0542571023106575, -0.0076539707370102406, -0.00953489076346159, -0.04434828832745552, 0.001293870504014194, -0.12238839268684387, 0.06418416649103165, 0.02843189612030983, 0.08505241572856903, 0.07062800228595734, 0.00120724702719599, -0.07675088942050934, 0.03785804659128189, 0.04890783503651619, -0.06888376921415329, -0.0839693546295166, -0.17545585334300995, -0.2911079525947571, -0.4238220453262329, -0.262084037065506, -0.3002263605594635, -0.3845032751560211, -0.3906497061252594, -0.6550108790397644, -0.7810799479484558, -0.7503029704093933, -0.7995198965072632, -0.8092347383499146, -0.6196113228797913, -0.6684317588806152, -0.7735874056816101, -0.8324533104896545, -0.9601566791534424, -0.955253541469574, -0.748817503452301, -0.9106167554855347, -0.9707801342010498, -1.053107500076294, -1.0448424816131592, -1.1082794666290283, -1.1296544075012207, -1.071642279624939, -1.1003081798553467, -1.166810154914856, -1.1408926248550415, -1.1330615282058716, -1.1167492866516113, -1.0716774463653564, -1.035891056060791, -1.0092483758926392, -0.9675999879837036, -0.938962996006012, -1.0120564699172974, -0.9777995347976685, -1.029313564300537, -0.9459163546562195, -0.8519706130027771, -0.7751091122627258, -0.7933766841888428, -0.9019735455513, -0.9983296990394592, -1.505873441696167], spec_min: [-6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, 
-6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0], spk_cond_steps: [],
stop_token_weight: 5.0, task_cls: usr.diffsinger_task.DiffSingerTask, test_ids: [], test_input_dir: , test_num: 0,
test_prefixes: ['Alto-2#岁月神偷', 'Alto-2#奇妙能力歌', 'Tenor-1#一千年以后', 'Tenor-1#童话', 'Tenor-2#消愁', 'Tenor-2#一荤一素', 'Soprano-1#念奴娇赤壁怀古', 'Soprano-1#问春'], test_set_name: test, timesteps: 100, train_set_name: train, use_denoise: False,
use_energy_embed: False, use_gt_dur: False, use_gt_f0: False, use_midi: True, use_nsf: True,
use_pitch_embed: True, use_pos_embed: True, use_spk_embed: False, use_spk_id: True, use_split_spk_id: False,
use_uv: True, use_var_enc: False, val_check_interval: 2000, valid_num: 0, valid_set_name: valid,
validate: False, vocoder: vocoders.hifigan.HifiGAN, vocoder_ckpt: checkpoints/m4singer_hifigan, warmup_updates: 2000, wav2spec_eps: 1e-6,
weight_decay: 0, win_size: 512, work_dir: ,
| Binarizer: <class 'data_gen.singing.binarize.M4SingerBinarizer'>
spkers: {'Alto-7', 'Tenor-1', 'Bass-3', 'Tenor-5', 'Bass-2', 'Alto-5', 'Soprano-1', 'Alto-3', 'Alto-6', 'Tenor-3', 'Tenor-7', 'Tenor-4', 'Tenor-2', 'Soprano-3', 'Alto-1', 'Soprano-2', 'Alto-4', 'Bass-1', 'Tenor-6', 'Alto-2'}
| spk_map: {'Alto-1': 0, 'Alto-2': 1, 'Alto-3': 2, 'Alto-4': 3, 'Alto-5': 4, 'Alto-6': 5, 'Alto-7': 6, 'Bass-1': 7, 'Bass-2': 8, 'Bass-3': 9, 'Soprano-1': 10, 'Soprano-2': 11, 'Soprano-3': 12, 'Tenor-1': 13, 'Tenor-2': 14, 'Tenor-3': 15, 'Tenor-4': 16, 'Tenor-5': 17, 'Tenor-6': 18, 'Tenor-7': 19}
| Build phone set: ['<AP>', '<SP>', 'a', 'ai', 'an', 'ang', 'ao', 'b', 'c', 'ch', 'd', 'e', 'ei', 'en', 'eng', 'er', 'f', 'g', 'h', 'i', 'ia', 'ian', 'iang', 'iao', 'ie', 'in', 'ing', 'iong', 'iou', 'j', 'k', 'l', 'm', 'n', 'o', 'ong', 'ou', 'p', 'q', 'r', 's', 'sh', 't', 'u', 'ua', 'uai', 'uan', 'uang', 'uei', 'uen', 'uo', 'v', 'van', 've', 'vn', 'x', 'z', 'zh']
Loaded the voice encoder model on cuda in 10.29 seconds.
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 217/217 [00:47<00:00, 4.57it/s]
| valid total duration: 1254.837s
Loaded the voice encoder model on cuda in 0.01 seconds.
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 217/217 [00:39<00:00, 5.43it/s]
| test total duration: 1254.837s
Loaded the voice encoder model on cuda in 0.01 seconds.
42%|█████████████████████████████████████████████████████▏ | 8670/20679 [19:57<21:41, 9.23it/s]| Skip item (Empty **gt** f0). item_name: Bass-1#父亲写的散文诗#0013, wav_fn: data/raw/m4singer/Bass-1#父亲写的散文诗/0013.wav
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20679/20679 [47:41<00:00, 7.23it/s]
| train total duration: 105705.472s
GPU utilization stays low, around 3%, with a little over 1 GB of GPU memory in use.
This step is mostly CPU-bound.
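The totals printed by the binarizer are in seconds. Converted, the train split is roughly 29.4 hours of audio and the valid and test splits about 21 minutes each. The sketch below redoes that arithmetic with the numbers copied from the log above. The item counts are the progress-bar totals, which include the one skipped item, so the per-item average is approximate.

```python
def seconds_to_hours(seconds):
    """Convert a duration in seconds to hours."""
    return seconds / 3600.0


# (total duration in seconds, item count) copied from the binarizer log above.
splits = {
    "valid": (1254.837, 217),
    "test": (1254.837, 217),
    "train": (105705.472, 20679),
}

for name, (total_s, n_items) in splits.items():
    print(f"{name}: {seconds_to_hours(total_s):.2f} h, "
          f"{total_s / n_items:.2f} s per item on average")
```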
Extract the pretrained models (not sure this wording is precise)
(venv3712) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ mkdir checkpoints
(venv3712) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ ll ../m4singer_*
-rw-rw-r-- 1 yeqiang yeqiang 361083505 2023-09-26 21:21:01 ../m4singer_diff_e2e.zip
-rw-rw-r-- 1 yeqiang yeqiang 265208925 2023-09-26 20:54:37 ../m4singer_fs2_e2e.zip
-rw-rw-r-- 1 yeqiang yeqiang 943383863 2023-09-26 22:41:27 ../m4singer_hifigan.zip
-rw-rw-r-- 1 yeqiang yeqiang 35405898 2023-09-26 19:20:30 ../m4singer_pe.zip
(venv3712) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ cd checkpoints/
(venv3712) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code/checkpoints$ unzip ../../m4singer_pe.zip
Archive: ../../m4singer_pe.zip
inflating: m4singer_pe/config.yaml
inflating: m4singer_pe/model_ckpt_steps_280000.ckpt
(venv3712) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code/checkpoints$ unzip ../../m4singer_diff_e2e.zip
Archive: ../../m4singer_diff_e2e.zip
inflating: m4singer_diff_e2e/config.yaml
inflating: m4singer_diff_e2e/model_ckpt_steps_900000.ckpt
(venv3712) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code/checkpoints$ unzip ../../m4singer_fs2_e2e.zip
Archive: ../../m4singer_fs2_e2e.zip
inflating: m4singer_fs2_e2e/config.yaml
inflating: m4singer_fs2_e2e/model_ckpt_steps_320000.ckpt
(venv3712) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code/checkpoints$ unzip ../../m4singer_hifigan.zip
Archive: ../../m4singer_hifigan.zip
inflating: m4singer_hifigan/config.yaml
inflating: m4singer_hifigan/model_ckpt_steps_1970000.ckpt
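After unzipping, each checkpoint directory should hold a config.yaml and one model_ckpt_steps_*.ckpt file, as the unzip output above shows. A small sanity check, assuming exactly that layout; the helper check_checkpoints is hypothetical and not part of M4Singer.

```python
from pathlib import Path

# Directory names as produced by the unzip output above.
EXPECTED = ["m4singer_pe", "m4singer_diff_e2e", "m4singer_fs2_e2e", "m4singer_hifigan"]


def check_checkpoints(root="checkpoints", names=EXPECTED):
    """Map each checkpoint dir to the problems found in it (empty list = ok)."""
    report = {}
    for name in names:
        ckpt_dir = Path(root) / name
        problems = []
        if not ckpt_dir.is_dir():
            problems.append("directory missing")
        else:
            if not (ckpt_dir / "config.yaml").is_file():
                problems.append("config.yaml missing")
            if not list(ckpt_dir.glob("model_ckpt_steps_*.ckpt")):
                problems.append("no model_ckpt_steps_*.ckpt file")
        report[name] = problems
    return report


if __name__ == "__main__":
    for name, problems in check_checkpoints().items():
        print(name, "ok" if not problems else problems)
```

Run it from the code directory so that the relative checkpoints path resolves.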
Train the model
(venv3712) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ CUDA_VISIBLE_DEVICES=0 python tasks/run.py --config usr/configs/m4singer/fs2.yaml --exp_name m4singer_fs2_e2e --reset
| Hparams chains: ['configs/config_base.yaml', 'configs/tts/base.yaml', 'configs/tts/fs2.yaml', 'configs/tts/base_zh.yaml', 'configs/singing/base.yaml', 'configs/singing/fs2.yaml', 'usr/configs/base.yaml', 'usr/configs/popcs_ds_beta6.yaml', 'usr/configs/m4singer/base.yaml', 'usr/configs/m4singer/fs2.yaml']
| Hparams:
K_step: 51, accumulate_grad_batches: 1, audio_num_mel_bins: 80, audio_sample_rate: 24000, base_config: ['configs/singing/fs2.yaml', 'usr/configs/m4singer/base.yaml'],
binarization_args: {'shuffle': False, 'with_txt': True, 'with_wav': False, 'with_align': True, 'with_spk_embed': True, 'with_f0': True, 'with_f0cwt': True}, binarizer_cls: data_gen.singing.binarize.M4SingerBinarizer, binary_data_dir: data/binary/m4singer, check_val_every_n_epoch: 10, clip_grad_norm: 1,
content_cond_steps: [], cwt_add_f0_loss: False, cwt_hidden_size: 128, cwt_layers: 2, cwt_loss: l1,
cwt_std_scale: 0.8, datasets: ['m4singer'], debug: False, dec_ffn_kernel_size: 9, dec_layers: 4,
decay_steps: 50000, decoder_type: fft, dict_dir: , diff_decoder_type: wavenet, diff_loss_type: l1,
dilation_cycle_length: 1, dropout: 0.1, ds_workers: 4, dur_enc_hidden_stride_kernel: ['0,2,3', '0,2,3', '0,1,3'], dur_loss: mse,
dur_predictor_kernel: 3, dur_predictor_layers: 5, enc_ffn_kernel_size: 9, enc_layers: 4, encoder_K: 8,
encoder_type: fft, endless_ds: True, ffn_act: gelu, ffn_padding: SAME, fft_size: 512,
fmax: 12000, fmin: 30, fs2_ckpt: , gen_dir_name: , gen_tgt_spk_id: -1,
hidden_size: 256, hop_size: 128, infer: False, keep_bins: 80, lambda_commit: 0.25,
lambda_energy: 0.0, lambda_f0: 1.0, lambda_ph_dur: 1.0, lambda_sent_dur: 1.0, lambda_uv: 1.0,
lambda_word_dur: 1.0, load_ckpt: , log_interval: 100, loud_norm: False, lr: 1,
max_beta: 0.06, max_epochs: 1000, max_eval_sentences: 1, max_eval_tokens: 60000, max_frames: 5000,
max_input_tokens: 1550, max_sentences: 12, max_tokens: 40000, max_updates: 320000, mel_loss: ssim:0.5|l1:0.5,
mel_vmax: 1.5, mel_vmin: -6.0, min_level_db: -120, norm_type: gn, num_ckpt_keep: 3,
num_heads: 2, num_sanity_val_steps: 1, num_spk: 20, num_test_samples: 0, num_valid_plots: 10,
optimizer_adam_beta1: 0.9, optimizer_adam_beta2: 0.98, out_wav_norm: False, pe_ckpt: checkpoints/m4singer_pe, pe_enable: True,
pitch_ar: False, pitch_enc_hidden_stride_kernel: ['0,2,5', '0,2,5', '0,2,5'], pitch_extractor: parselmouth, pitch_loss: l1, pitch_norm: log,
pitch_type: frame, pre_align_args: {'use_tone': False, 'forced_align': 'mfa', 'use_sox': True, 'txt_processor': 'zh_g2pM', 'allow_no_txt': False, 'denoise': False}, pre_align_cls: data_gen.singing.pre_align.SingingPreAlign, predictor_dropout: 0.5, predictor_grad: 0.1,
predictor_hidden: -1, predictor_kernel: 5, predictor_layers: 5, prenet_dropout: 0.5, prenet_hidden_size: 256,
pretrain_fs_ckpt: , processed_data_dir: xxx, profile_infer: False, raw_data_dir: data/raw/m4singer, ref_norm_layer: bn,
rel_pos: True, reset_phone_dict: True, residual_channels: 256, residual_layers: 20, save_best: False,
save_ckpt: True, save_codes: ['configs', 'modules', 'tasks', 'utils', 'usr'], save_f0: True, save_gt: True, schedule_type: linear,
seed: 1234, sort_by_len: True, spec_max: [-0.3894500136375427, -0.3796464204788208, -0.2914905250072479, -0.15550297498703003, -0.08502643555402756, 0.10698417574167252, -0.0739326998591423, -0.0541548952460289, 0.15501998364925385, 0.06483431905508041, 0.03054228238761425, -0.013737732544541359, -0.004876468330621719, 0.04368264228105545, 0.13329921662807465, 0.16471388936042786, 0.04605761915445328, -0.05680707097053528, 0.0542571023106575, -0.0076539707370102406, -0.00953489076346159, -0.04434828832745552, 0.001293870504014194, -0.12238839268684387, 0.06418416649103165, 0.02843189612030983, 0.08505241572856903, 0.07062800228595734, 0.00120724702719599, -0.07675088942050934, 0.03785804659128189, 0.04890783503651619, -0.06888376921415329, -0.0839693546295166, -0.17545585334300995, -0.2911079525947571, -0.4238220453262329, -0.262084037065506, -0.3002263605594635, -0.3845032751560211, -0.3906497061252594, -0.6550108790397644, -0.7810799479484558, -0.7503029704093933, -0.7995198965072632, -0.8092347383499146, -0.6196113228797913, -0.6684317588806152, -0.7735874056816101, -0.8324533104896545, -0.9601566791534424, -0.955253541469574, -0.748817503452301, -0.9106167554855347, -0.9707801342010498, -1.053107500076294, -1.0448424816131592, -1.1082794666290283, -1.1296544075012207, -1.071642279624939, -1.1003081798553467, -1.166810154914856, -1.1408926248550415, -1.1330615282058716, -1.1167492866516113, -1.0716774463653564, -1.035891056060791, -1.0092483758926392, -0.9675999879837036, -0.938962996006012, -1.0120564699172974, -0.9777995347976685, -1.029313564300537, -0.9459163546562195, -0.8519706130027771, -0.7751091122627258, -0.7933766841888428, -0.9019735455513, -0.9983296990394592, -1.505873441696167], spec_min: [-6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, 
-6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0], spk_cond_steps: [],
stop_token_weight: 5.0, task_cls: usr.diffsinger_task.AuxDecoderMIDITask, test_ids: [], test_input_dir: , test_num: 0,
test_prefixes: ['Alto-2#岁月神偷', 'Alto-2#奇妙能力歌', 'Tenor-1#一千年以后', 'Tenor-1#童话', 'Tenor-2#消愁', 'Tenor-2#一荤一素', 'Soprano-1#念奴娇赤壁怀古', 'Soprano-1#问春'], test_set_name: test, timesteps: 100, train_set_name: train, use_denoise: False,
use_energy_embed: False, use_gt_dur: False, use_gt_f0: False, use_midi: True, use_nsf: True,
use_pitch_embed: False, use_pos_embed: True, use_spk_embed: False, use_spk_id: True, use_split_spk_id: False,
use_uv: True, use_var_enc: False, val_check_interval: 2000, valid_num: 0, valid_set_name: valid,
validate: False, vocoder: vocoders.hifigan.HifiGAN, vocoder_ckpt: checkpoints/m4singer_hifigan, warmup_updates: 2000, wav2spec_eps: 1e-6,
weight_decay: 0, win_size: 512, work_dir: checkpoints/m4singer_fs2_e2e,
| Mel losses: {'ssim': 0.5, 'l1': 0.5}
09/27 07:06:49 PM gpu available: True, used: True
| Copied codes to checkpoints/m4singer_fs2_e2e/codes/20230927190649.
| model Arch: FastSpeech2MIDI(
(encoder_embed_tokens): Embedding(61, 256, padding_idx=0)
(decoder): FastspeechDecoder(
(embed_positions): SinusoidalPositionalEmbedding()
(layers): ModuleList(
(0): TransformerEncoderLayer(
(op): EncSALayer(
(layer_norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=256, out_features=256, bias=False)
)
(layer_norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(ffn): TransformerFFNLayer(
(ffn_1): Conv1d(256, 1024, kernel_size=(9,), stride=(1,), padding=(4,))
(ffn_2): Linear(in_features=1024, out_features=256, bias=True)
)
)
)
(1): TransformerEncoderLayer(
(op): EncSALayer(
(layer_norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=256, out_features=256, bias=False)
)
(layer_norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(ffn): TransformerFFNLayer(
(ffn_1): Conv1d(256, 1024, kernel_size=(9,), stride=(1,), padding=(4,))
(ffn_2): Linear(in_features=1024, out_features=256, bias=True)
)
)
)
(2): TransformerEncoderLayer(
(op): EncSALayer(
(layer_norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=256, out_features=256, bias=False)
)
(layer_norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(ffn): TransformerFFNLayer(
(ffn_1): Conv1d(256, 1024, kernel_size=(9,), stride=(1,), padding=(4,))
(ffn_2): Linear(in_features=1024, out_features=256, bias=True)
)
)
)
(3): TransformerEncoderLayer(
(op): EncSALayer(
(layer_norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=256, out_features=256, bias=False)
)
(layer_norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(ffn): TransformerFFNLayer(
(ffn_1): Conv1d(256, 1024, kernel_size=(9,), stride=(1,), padding=(4,))
(ffn_2): Linear(in_features=1024, out_features=256, bias=True)
)
)
)
)
(layer_norm): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
(mel_out): Linear(in_features=256, out_features=80, bias=True)
(spk_embed_proj): Embedding(21, 256)
(dur_predictor): DurationPredictor(
(conv): ModuleList(
(0): Sequential(
(0): ConstantPad1d(padding=(1, 1), value=0)
(1): Conv1d(256, 256, kernel_size=(3,), stride=(1,))
(2): ReLU()
(3): LayerNorm((256,), eps=1e-12, elementwise_affine=True)
(4): Dropout(p=0.5, inplace=False)
)
(1): Sequential(
(0): ConstantPad1d(padding=(1, 1), value=0)
(1): Conv1d(256, 256, kernel_size=(3,), stride=(1,))
(2): ReLU()
(3): LayerNorm((256,), eps=1e-12, elementwise_affine=True)
(4): Dropout(p=0.5, inplace=False)
)
(2): Sequential(
(0): ConstantPad1d(padding=(1, 1), value=0)
(1): Conv1d(256, 256, kernel_size=(3,), stride=(1,))
(2): ReLU()
(3): LayerNorm((256,), eps=1e-12, elementwise_affine=True)
(4): Dropout(p=0.5, inplace=False)
)
(3): Sequential(
(0): ConstantPad1d(padding=(1, 1), value=0)
(1): Conv1d(256, 256, kernel_size=(3,), stride=(1,))
(2): ReLU()
(3): LayerNorm((256,), eps=1e-12, elementwise_affine=True)
(4): Dropout(p=0.5, inplace=False)
)
(4): Sequential(
(0): ConstantPad1d(padding=(1, 1), value=0)
(1): Conv1d(256, 256, kernel_size=(3,), stride=(1,))
(2): ReLU()
(3): LayerNorm((256,), eps=1e-12, elementwise_affine=True)
(4): Dropout(p=0.5, inplace=False)
)
)
(linear): Linear(in_features=256, out_features=1, bias=True)
)
(length_regulator): LengthRegulator()
(encoder): FastspeechMIDIEncoder(
(layers): ModuleList(
(0): TransformerEncoderLayer(
(op): EncSALayer(
(layer_norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=256, out_features=256, bias=False)
)
(layer_norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(ffn): TransformerFFNLayer(
(ffn_1): Conv1d(256, 1024, kernel_size=(9,), stride=(1,), padding=(4,))
(ffn_2): Linear(in_features=1024, out_features=256, bias=True)
)
)
)
(1): TransformerEncoderLayer(
(op): EncSALayer(
(layer_norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=256, out_features=256, bias=False)
)
(layer_norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(ffn): TransformerFFNLayer(
(ffn_1): Conv1d(256, 1024, kernel_size=(9,), stride=(1,), padding=(4,))
(ffn_2): Linear(in_features=1024, out_features=256, bias=True)
)
)
)
(2): TransformerEncoderLayer(
(op): EncSALayer(
(layer_norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=256, out_features=256, bias=False)
)
(layer_norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(ffn): TransformerFFNLayer(
(ffn_1): Conv1d(256, 1024, kernel_size=(9,), stride=(1,), padding=(4,))
(ffn_2): Linear(in_features=1024, out_features=256, bias=True)
)
)
)
(3): TransformerEncoderLayer(
(op): EncSALayer(
(layer_norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=256, out_features=256, bias=False)
)
(layer_norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(ffn): TransformerFFNLayer(
(ffn_1): Conv1d(256, 1024, kernel_size=(9,), stride=(1,), padding=(4,))
(ffn_2): Linear(in_features=1024, out_features=256, bias=True)
)
)
)
)
(layer_norm): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(embed_tokens): Embedding(61, 256, padding_idx=0)
(embed_positions): RelPositionalEncoding(
(dropout): Dropout(p=0.0, inplace=False)
)
)
(midi_embed): Embedding(300, 256, padding_idx=0)
(midi_dur_layer): Linear(in_features=1, out_features=256, bias=True)
(is_slur_embed): Embedding(2, 256)
)
| model Trainable Parameters: 24.195M
09/27 07:06:52 PM model and trainer restored from checkpoint: checkpoints/m4singer_fs2_e2e/model_ckpt_steps_320000.ckpt
Validation sanity check: 0%| | 0/1 [00:00<?, ?batch/s]
==============
valid results: {'total_loss': 0.5226, 'ssim': 0.2665, 'l1': 0.2351, 'pdur': 0.0188, 'wdur': 0.002, 'sdur': 0.0002}
==============
Epoch 1: : 1batch [00:01, 1.06s/batch, batch_size=12, l1=0.105, lr=0.00011, pdur=0.0135, sdur=0.00344, ssim=0.174, step=320000, wdur=0.00704]| Training end..
Epoch 1: : 1batch [00:01, 1.15s/batch, batch_size=12, l1=0.105, lr=0.00011, pdur=0.0135, sdur=0.00344, ssim=0.174, step=320000, wdur=0.00704]
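That run ended after a few seconds: the log shows it restoring the step-320000 checkpoint and then printing "Training end..". Next, train DiffSinger: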
(venv3712) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/M4Singer/code$ CUDA_VISIBLE_DEVICES=0 python tasks/run.py --config usr/configs/m4singer/diff.yaml --exp_name m4singer_diff_e2e --reset
| Hparams chains: ['configs/config_base.yaml', 'configs/tts/base.yaml', 'configs/tts/fs2.yaml', 'configs/tts/base_zh.yaml', 'configs/singing/base.yaml', 'usr/configs/base.yaml', 'usr/configs/popcs_ds_beta6.yaml', 'usr/configs/m4singer/base.yaml', 'usr/configs/m4singer/diff.yaml']
| Hparams:
K_step: 1000, accumulate_grad_batches: 1, audio_num_mel_bins: 80, audio_sample_rate: 24000, base_config: ['usr/configs/m4singer/base.yaml'],
binarization_args: {'shuffle': False, 'with_txt': True, 'with_wav': False, 'with_align': True, 'with_spk_embed': True, 'with_f0': True, 'with_f0cwt': True}, binarizer_cls: data_gen.singing.binarize.M4SingerBinarizer, binary_data_dir: data/binary/m4singer, check_val_every_n_epoch: 10, clip_grad_norm: 1,
content_cond_steps: [], cwt_add_f0_loss: False, cwt_hidden_size: 128, cwt_layers: 2, cwt_loss: l1,
cwt_std_scale: 0.8, datasets: ['m4singer'], debug: False, dec_ffn_kernel_size: 9, dec_layers: 4,
decay_steps: 100000, decoder_type: fft, dict_dir: , diff_decoder_type: wavenet, diff_loss_type: l1,
dilation_cycle_length: 4, dropout: 0.1, ds_workers: 4, dur_enc_hidden_stride_kernel: ['0,2,3', '0,2,3', '0,1,3'], dur_loss: mse,
dur_predictor_kernel: 3, dur_predictor_layers: 5, enc_ffn_kernel_size: 9, enc_layers: 4, encoder_K: 8,
encoder_type: fft, endless_ds: True, ffn_act: gelu, ffn_padding: SAME, fft_size: 512,
fmax: 12000, fmin: 30, fs2_ckpt: checkpoints/m4singer_fs2_e2e, gaussian_start: True, gen_dir_name: ,
gen_tgt_spk_id: -1, hidden_size: 256, hop_size: 128, infer: False, keep_bins: 80,
lambda_commit: 0.25, lambda_energy: 0.0, lambda_f0: 0.0, lambda_ph_dur: 1.0, lambda_sent_dur: 1.0,
lambda_uv: 0.0, lambda_word_dur: 1.0, load_ckpt: , log_interval: 100, loud_norm: False,
lr: 0.001, max_beta: 0.02, max_epochs: 1000, max_eval_sentences: 1, max_eval_tokens: 60000,
max_frames: 5000, max_input_tokens: 1550, max_sentences: 28, max_tokens: 36000, max_updates: 900000,
mel_loss: ssim:0.5|l1:0.5, mel_vmax: 1.5, mel_vmin: -6.0, min_level_db: -120, norm_type: gn,
num_ckpt_keep: 3, num_heads: 2, num_sanity_val_steps: 1, num_spk: 20, num_test_samples: 0,
num_valid_plots: 10, optimizer_adam_beta1: 0.9, optimizer_adam_beta2: 0.98, out_wav_norm: False, pe_ckpt: checkpoints/m4singer_pe,
pe_enable: True, pitch_ar: False, pitch_enc_hidden_stride_kernel: ['0,2,5', '0,2,5', '0,2,5'], pitch_extractor: parselmouth, pitch_loss: l1,
pitch_norm: log, pitch_type: frame, pndm_speedup: 5, pre_align_args: {'use_tone': False, 'forced_align': 'mfa', 'use_sox': True, 'txt_processor': 'zh_g2pM', 'allow_no_txt': False, 'denoise': False}, pre_align_cls: data_gen.singing.pre_align.SingingPreAlign,
predictor_dropout: 0.5, predictor_grad: 0.1, predictor_hidden: -1, predictor_kernel: 5, predictor_layers: 5,
prenet_dropout: 0.5, prenet_hidden_size: 256, pretrain_fs_ckpt: , processed_data_dir: xxx, profile_infer: False,
raw_data_dir: data/raw/m4singer, ref_norm_layer: bn, rel_pos: True, reset_phone_dict: True, residual_channels: 256,
residual_layers: 20, save_best: False, save_ckpt: True, save_codes: ['configs', 'modules', 'tasks', 'utils', 'usr'], save_f0: True,
save_gt: True, schedule_type: linear, seed: 1234, sort_by_len: True, spec_max: [-0.3894500136375427, -0.3796464204788208, -0.2914905250072479, -0.15550297498703003, -0.08502643555402756, 0.10698417574167252, -0.0739326998591423, -0.0541548952460289, 0.15501998364925385, 0.06483431905508041, 0.03054228238761425, -0.013737732544541359, -0.004876468330621719, 0.04368264228105545, 0.13329921662807465, 0.16471388936042786, 0.04605761915445328, -0.05680707097053528, 0.0542571023106575, -0.0076539707370102406, -0.00953489076346159, -0.04434828832745552, 0.001293870504014194, -0.12238839268684387, 0.06418416649103165, 0.02843189612030983, 0.08505241572856903, 0.07062800228595734, 0.00120724702719599, -0.07675088942050934, 0.03785804659128189, 0.04890783503651619, -0.06888376921415329, -0.0839693546295166, -0.17545585334300995, -0.2911079525947571, -0.4238220453262329, -0.262084037065506, -0.3002263605594635, -0.3845032751560211, -0.3906497061252594, -0.6550108790397644, -0.7810799479484558, -0.7503029704093933, -0.7995198965072632, -0.8092347383499146, -0.6196113228797913, -0.6684317588806152, -0.7735874056816101, -0.8324533104896545, -0.9601566791534424, -0.955253541469574, -0.748817503452301, -0.9106167554855347, -0.9707801342010498, -1.053107500076294, -1.0448424816131592, -1.1082794666290283, -1.1296544075012207, -1.071642279624939, -1.1003081798553467, -1.166810154914856, -1.1408926248550415, -1.1330615282058716, -1.1167492866516113, -1.0716774463653564, -1.035891056060791, -1.0092483758926392, -0.9675999879837036, -0.938962996006012, -1.0120564699172974, -0.9777995347976685, -1.029313564300537, -0.9459163546562195, -0.8519706130027771, -0.7751091122627258, -0.7933766841888428, -0.9019735455513, -0.9983296990394592, -1.505873441696167],
spec_min: [-6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0], spk_cond_steps: [], stop_token_weight: 5.0, task_cls: usr.diffsinger_task.DiffSingerMIDITask, test_ids: [],
test_input_dir: , test_num: 0, test_prefixes: ['Alto-2#岁月神偷', 'Alto-2#奇妙能力歌', 'Tenor-1#一千年以后', 'Tenor-1#童话', 'Tenor-2#消愁', 'Tenor-2#一荤一素', 'Soprano-1#念奴娇赤壁怀古', 'Soprano-1#问春'], test_set_name: test, timesteps: 1000,
train_set_name: train, use_denoise: False, use_energy_embed: False, use_gt_dur: False, use_gt_f0: False,
use_midi: True, use_nsf: True, use_pitch_embed: False, use_pos_embed: True, use_spk_embed: False,
use_spk_id: True, use_split_spk_id: False, use_uv: True, use_var_enc: False, val_check_interval: 2000,
valid_num: 0, valid_set_name: valid, validate: False, vocoder: vocoders.hifigan.HifiGAN, vocoder_ckpt: checkpoints/m4singer_hifigan,
warmup_updates: 2000, wav2spec_eps: 1e-6, weight_decay: 0, win_size: 512, work_dir: checkpoints/m4singer_diff_e2e,
| Mel losses: {'ssim': 0.5, 'l1': 0.5}
| load HifiGAN: checkpoints/m4singer_hifigan/model_ckpt_steps_1970000.ckpt
Removing weight norm...
| Loaded model parameters from checkpoints/m4singer_hifigan/model_ckpt_steps_1970000.ckpt.
| HifiGAN device: cuda.
| load HifiGAN: checkpoints/m4singer_hifigan/model_ckpt_steps_1970000.ckpt
Removing weight norm...
| Loaded model parameters from checkpoints/m4singer_hifigan/model_ckpt_steps_1970000.ckpt.
| HifiGAN device: cuda.
| load 'model' from 'checkpoints/m4singer_pe/model_ckpt_steps_280000.ckpt'.
09/27 07:09:00 PM gpu available: True, used: True
| Copied codes to checkpoints/m4singer_diff_e2e/codes/20230927190900.
| load 'model' from 'checkpoints/m4singer_fs2_e2e/model_ckpt_steps_320000.ckpt'.
| model Arch: GaussianDiffusion(
(denoise_fn): DiffNet(
(input_projection): Conv1d(80, 256, kernel_size=(1,), stride=(1,))
(diffusion_embedding): SinusoidalPosEmb()
(mlp): Sequential(
(0): Linear(in_features=256, out_features=1024, bias=True)
(1): Mish()
(2): Linear(in_features=1024, out_features=256, bias=True)
)
(residual_layers): ModuleList(
(0): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(1,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(1): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(2,), dilation=(2,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(2): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(4,), dilation=(4,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(3): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(8,), dilation=(8,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(4): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(1,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(5): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(2,), dilation=(2,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(6): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(4,), dilation=(4,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(7): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(8,), dilation=(8,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(8): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(1,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(9): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(2,), dilation=(2,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(10): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(4,), dilation=(4,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(11): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(8,), dilation=(8,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(12): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(1,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(13): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(2,), dilation=(2,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(14): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(4,), dilation=(4,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(15): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(8,), dilation=(8,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(16): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(1,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(17): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(2,), dilation=(2,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(18): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(4,), dilation=(4,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
(19): ResidualBlock(
(dilated_conv): Conv1d(256, 512, kernel_size=(3,), stride=(1,), padding=(8,), dilation=(8,))
(diffusion_projection): Linear(in_features=256, out_features=256, bias=True)
(conditioner_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
)
)
(skip_projection): Conv1d(256, 256, kernel_size=(1,), stride=(1,))
(output_projection): Conv1d(256, 80, kernel_size=(1,), stride=(1,))
)
(fs2): FastSpeech2MIDI(
(encoder_embed_tokens): Embedding(61, 256, padding_idx=0)
(decoder): FastspeechDecoder(
(embed_positions): SinusoidalPositionalEmbedding()
(layers): ModuleList(
(0): TransformerEncoderLayer(
(op): EncSALayer(
(layer_norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=256, out_features=256, bias=False)
)
(layer_norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(ffn): TransformerFFNLayer(
(ffn_1): Conv1d(256, 1024, kernel_size=(9,), stride=(1,), padding=(4,))
(ffn_2): Linear(in_features=1024, out_features=256, bias=True)
)
)
)
(1): TransformerEncoderLayer(
(op): EncSALayer(
(layer_norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=256, out_features=256, bias=False)
)
(layer_norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(ffn): TransformerFFNLayer(
(ffn_1): Conv1d(256, 1024, kernel_size=(9,), stride=(1,), padding=(4,))
(ffn_2): Linear(in_features=1024, out_features=256, bias=True)
)
)
)
(2): TransformerEncoderLayer(
(op): EncSALayer(
(layer_norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=256, out_features=256, bias=False)
)
(layer_norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(ffn): TransformerFFNLayer(
(ffn_1): Conv1d(256, 1024, kernel_size=(9,), stride=(1,), padding=(4,))
(ffn_2): Linear(in_features=1024, out_features=256, bias=True)
)
)
)
(3): TransformerEncoderLayer(
(op): EncSALayer(
(layer_norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=256, out_features=256, bias=False)
)
(layer_norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(ffn): TransformerFFNLayer(
(ffn_1): Conv1d(256, 1024, kernel_size=(9,), stride=(1,), padding=(4,))
(ffn_2): Linear(in_features=1024, out_features=256, bias=True)
)
)
)
)
(layer_norm): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
(mel_out): Linear(in_features=256, out_features=80, bias=True)
(spk_embed_proj): Embedding(21, 256)
(dur_predictor): DurationPredictor(
(conv): ModuleList(
(0): Sequential(
(0): ConstantPad1d(padding=(1, 1), value=0)
(1): Conv1d(256, 256, kernel_size=(3,), stride=(1,))
(2): ReLU()
(3): LayerNorm((256,), eps=1e-12, elementwise_affine=True)
(4): Dropout(p=0.5, inplace=False)
)
(1): Sequential(
(0): ConstantPad1d(padding=(1, 1), value=0)
(1): Conv1d(256, 256, kernel_size=(3,), stride=(1,))
(2): ReLU()
(3): LayerNorm((256,), eps=1e-12, elementwise_affine=True)
(4): Dropout(p=0.5, inplace=False)
)
(2): Sequential(
(0): ConstantPad1d(padding=(1, 1), value=0)
(1): Conv1d(256, 256, kernel_size=(3,), stride=(1,))
(2): ReLU()
(3): LayerNorm((256,), eps=1e-12, elementwise_affine=True)
(4): Dropout(p=0.5, inplace=False)
)
(3): Sequential(
(0): ConstantPad1d(padding=(1, 1), value=0)
(1): Conv1d(256, 256, kernel_size=(3,), stride=(1,))
(2): ReLU()
(3): LayerNorm((256,), eps=1e-12, elementwise_affine=True)
(4): Dropout(p=0.5, inplace=False)
)
(4): Sequential(
(0): ConstantPad1d(padding=(1, 1), value=0)
(1): Conv1d(256, 256, kernel_size=(3,), stride=(1,))
(2): ReLU()
(3): LayerNorm((256,), eps=1e-12, elementwise_affine=True)
(4): Dropout(p=0.5, inplace=False)
)
)
(linear): Linear(in_features=256, out_features=1, bias=True)
)
(length_regulator): LengthRegulator()
(encoder): FastspeechMIDIEncoder(
(layers): ModuleList(
(0): TransformerEncoderLayer(
(op): EncSALayer(
(layer_norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=256, out_features=256, bias=False)
)
(layer_norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(ffn): TransformerFFNLayer(
(ffn_1): Conv1d(256, 1024, kernel_size=(9,), stride=(1,), padding=(4,))
(ffn_2): Linear(in_features=1024, out_features=256, bias=True)
)
)
)
(1): TransformerEncoderLayer(
(op): EncSALayer(
(layer_norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=256, out_features=256, bias=False)
)
(layer_norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(ffn): TransformerFFNLayer(
(ffn_1): Conv1d(256, 1024, kernel_size=(9,), stride=(1,), padding=(4,))
(ffn_2): Linear(in_features=1024, out_features=256, bias=True)
)
)
)
(2): TransformerEncoderLayer(
(op): EncSALayer(
(layer_norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=256, out_features=256, bias=False)
)
(layer_norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(ffn): TransformerFFNLayer(
(ffn_1): Conv1d(256, 1024, kernel_size=(9,), stride=(1,), padding=(4,))
(ffn_2): Linear(in_features=1024, out_features=256, bias=True)
)
)
)
(3): TransformerEncoderLayer(
(op): EncSALayer(
(layer_norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=256, out_features=256, bias=False)
)
(layer_norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(ffn): TransformerFFNLayer(
(ffn_1): Conv1d(256, 1024, kernel_size=(9,), stride=(1,), padding=(4,))
(ffn_2): Linear(in_features=1024, out_features=256, bias=True)
)
)
)
)
(layer_norm): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(embed_tokens): Embedding(61, 256, padding_idx=0)
(embed_positions): RelPositionalEncoding(
(dropout): Dropout(p=0.0, inplace=False)
)
)
(midi_embed): Embedding(300, 256, padding_idx=0)
(midi_dur_layer): Linear(in_features=1, out_features=256, bias=True)
(is_slur_embed): Embedding(2, 256)
)
)
| model Trainable Parameters: 39.281M
09/27 07:09:01 PM model and trainer restored from checkpoint: checkpoints/m4singer_diff_e2e/model_ckpt_steps_900000.ckpt
Validation sanity check: 0%| | 0/1 [00:00<?, ?batch/s]===> gaussion start.
sample time step: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:01<00:00, 116.90it/s]
sample time step: 96%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ | 191/200 [00:01<00:00, 119.51it/s]
==============
valid results: {'total_loss': 0.065, 'mel': 0.0536, 'pdur': 0.0098, 'wdur': 0.0014, 'sdur': 0.0002}
==============
Epoch 1: : 0batch [00:00, ?batch/s]Traceback (most recent call last):
File "tasks/run.py", line 15, in <module>
run_task()
File "tasks/run.py", line 10, in run_task
task_cls.start()
File "/home/yeqiang/Downloads/ai/M4Singer/code/tasks/base_task.py", line 257, in start
trainer.fit(task)
File "/home/yeqiang/Downloads/ai/M4Singer/code/utils/pl_utils.py", line 489, in fit
self.run_pretrain_routine(model)
File "/home/yeqiang/Downloads/ai/M4Singer/code/utils/pl_utils.py", line 582, in run_pretrain_routine
self.train()
File "/home/yeqiang/Downloads/ai/M4Singer/code/utils/pl_utils.py", line 1358, in train
self.run_training_epoch()
File "/home/yeqiang/Downloads/ai/M4Singer/code/utils/pl_utils.py", line 1392, in run_training_epoch
output = self.run_training_batch(batch, batch_idx)
File "/home/yeqiang/Downloads/ai/M4Singer/code/utils/pl_utils.py", line 1514, in run_training_batch
loss = optimizer_closure()
File "/home/yeqiang/Downloads/ai/M4Singer/code/utils/pl_utils.py", line 1480, in optimizer_closure
split_batch, batch_idx, opt_idx, self.hiddens)
File "/home/yeqiang/Downloads/ai/M4Singer/code/utils/pl_utils.py", line 1588, in training_forward
output = self.model.training_step(*args)
File "/home/yeqiang/Downloads/ai/M4Singer/code/tasks/base_task.py", line 128, in training_step
loss_ret = self._training_step(sample, batch_idx, optimizer_idx)
File "/home/yeqiang/Downloads/ai/M4Singer/code/usr/task.py", line 57, in _training_step
log_outputs = self.run_model(self.model, sample)
File "/home/yeqiang/Downloads/ai/M4Singer/code/usr/diffsinger_task.py", line 301, in run_model
midi_dur=sample.get('midi_dur'), is_slur=sample.get('is_slur'))
File "/home/yeqiang/Downloads/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/yeqiang/Downloads/ai/M4Singer/code/usr/diff/shallow_diffusion_tts.py", line 242, in forward
ret['diff_loss'] = self.p_losses(x, t, cond)
File "/home/yeqiang/Downloads/ai/M4Singer/code/usr/diff/shallow_diffusion_tts.py", line 214, in p_losses
x_recon = self.denoise_fn(x_noisy, t, cond)
File "/home/yeqiang/Downloads/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/yeqiang/Downloads/ai/M4Singer/code/usr/diff/net.py", line 123, in forward
x, skip_connection = layer(x, cond, diffusion_step)
File "/home/yeqiang/Downloads/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/yeqiang/Downloads/ai/M4Singer/code/usr/diff/net.py", line 71, in forward
y = self.dilated_conv(y) + conditioner
File "/home/yeqiang/Downloads/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/yeqiang/Downloads/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 257, in forward
self.padding, self.dilation, self.groups)
RuntimeError: CUDA out of memory. Tried to allocate 70.00 MiB (GPU 0; 5.78 GiB total capacity; 3.06 GiB already allocated; 27.62 MiB free; 3.14 GiB reserved in total by PyTorch)
Exception ignored in: <function tqdm.__del__ at 0x7f1024e51440>
Traceback (most recent call last):
File "/home/yeqiang/Downloads/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/tqdm/std.py", line 1124, in __del__
File "/home/yeqiang/Downloads/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/tqdm/std.py", line 1337, in close
File "/home/yeqiang/Downloads/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/tqdm/std.py", line 1516, in display
File "/home/yeqiang/Downloads/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/tqdm/std.py", line 1127, in __repr__
File "/home/yeqiang/Downloads/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/tqdm/std.py", line 1477, in format_dict
TypeError: cannot unpack non-iterable NoneType object
Uh-oh: the 2060 does not have enough GPU memory!
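To see where the memory went, the numbers in the OOM message can be pulled apart with a few lines of Python. This is only a rough sketch: the message string is copied from the traceback above, while the helper names (`to_mib`, `parse_oom`) are made up for illustration. The "outside" figure is simply total minus reserved minus free, so it lumps together the CUDA context and anything else holding GPU memory at that moment (for example a desktop session or another process).

```python
import re

# Error text copied verbatim from the traceback above.
msg = ("RuntimeError: CUDA out of memory. Tried to allocate 70.00 MiB "
       "(GPU 0; 5.78 GiB total capacity; 3.06 GiB already allocated; "
       "27.62 MiB free; 3.14 GiB reserved in total by PyTorch)")

UNIT_MIB = {"MiB": 1.0, "GiB": 1024.0}


def to_mib(value, unit):
    """Convert a '<number> <MiB|GiB>' pair to MiB."""
    return float(value) * UNIT_MIB[unit]


def parse_oom(message):
    """Return the sizes found in the message, in MiB, keyed by a short label."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for label, pattern in patterns.items():
        match = re.search(pattern, message)
        sizes[label] = to_mib(*match.groups())
    return sizes


stats = parse_oom(msg)
# Memory that is neither free nor held by PyTorch's caching allocator.
outside = stats["total"] - stats["reserved"] - stats["free"]
print(f"reserved by PyTorch: {stats['reserved']:.0f} MiB")
print(f"held outside PyTorch's allocator: {outside:.0f} MiB")
```

By this estimate roughly 2.6 GiB of the 5.78 GiB card was not available to the training process at all, so it may be worth checking what else is using the GPU before changing the model. Lowering `max_sentences` or `max_tokens` (28 and 36000 in the hparams dump above) would presumably shrink each batch as well, but that was not tried here.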
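Separately, the earlier dataset-packing (binarization) step most likely just does not use the GPU; it is not a problem with the torch version.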
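As a sanity check on the two "model Trainable Parameters" lines above (24.195M for FastSpeech2MIDI alone, 39.281M for GaussianDiffusion wrapping it), the size of the DiffNet denoiser can be recomputed from the layer shapes in the printout. The sketch below assumes every Conv1d and Linear in DiffNet carries a bias (PyTorch's printout only mentions the bias of a Conv1d when it is disabled) and that `SinusoidalPosEmb` and `Mish` have no parameters. The two helper functions are illustrative and not part of the M4Singer code.

```python
def conv1d_params(c_in, c_out, kernel):
    # Weight tensor (c_out x c_in x kernel) plus one bias per output channel.
    return c_in * c_out * kernel + c_out


def linear_params(n_in, n_out):
    # Weight matrix (n_out x n_in) plus one bias per output feature.
    return n_in * n_out + n_out


# One ResidualBlock, shapes taken from the printed architecture.
residual_block = (
    conv1d_params(256, 512, 3)    # dilated_conv
    + linear_params(256, 256)     # diffusion_projection
    + conv1d_params(256, 512, 1)  # conditioner_projection
    + conv1d_params(256, 512, 1)  # output_projection
)

# Whole DiffNet: 20 residual blocks (residual_layers: 20 in the hparams).
diffnet = (
    conv1d_params(80, 256, 1)     # input_projection
    + linear_params(256, 1024)    # mlp[0]
    + linear_params(1024, 256)    # mlp[2]
    + 20 * residual_block
    + conv1d_params(256, 256, 1)  # skip_projection
    + conv1d_params(256, 80, 1)   # output_projection
)

print(f"DiffNet parameters: {diffnet / 1e6:.3f}M")
print(f"difference between the two logged totals: {39.281 - 24.195:.3f}M")
```

Both lines come out at 15.086M, so under these assumptions the extra parameters of the diffusion run are exactly the DiffNet denoiser stacked on top of the FastSpeech2MIDI model loaded from `checkpoints/m4singer_fs2_e2e`.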