Update README.md

indextts
2025-09-08 08:43:57 +00:00
parent 44233b5b0c
commit 83a71ad75c

174
README.md

@ -84,180 +84,6 @@ The key contributions of **indextts2** are summarized as follows:
- We will publicly release the code and pre-trained weights to facilitate future research and practical applications.
## Model Download
| **HuggingFace** | **ModelScope** |
|----------------------------------------------------------|----------------------------------------------------------|
| [😁 IndexTTS2](https://huggingface.co/IndexTeam/IndexTTS-2.0) | [IndexTTS-2](https://modelscope.cn/models/IndexTeam/IndexTTS-2.0) |
| [IndexTTS-1.5](https://huggingface.co/IndexTeam/IndexTTS-1.5) | [IndexTTS-1.5](https://modelscope.cn/models/IndexTeam/IndexTTS-1.5) |
| [IndexTTS](https://huggingface.co/IndexTeam/Index-TTS) | [IndexTTS](https://modelscope.cn/models/IndexTeam/Index-TTS) |
## Usage Instructions
### Environment Setup
1. Download this repository:
```bash
git clone https://github.com/index-tts/index-tts.git
```
2. Install dependencies:
```bash
conda create -n indextts2 python=3.10
conda activate indextts2
pip install -r requirements.txt
```
3. Download models:
Download by `huggingface-cli`:
```bash
huggingface-cli download IndexTeam/IndexTTS-1.5 \
config.yaml bigvgan_discriminator.pth bigvgan_generator.pth bpe.model dvae.pth gpt.pth unigram_12000.vocab \
--local-dir checkpoints
```
Recommended for users in China: if downloads are slow, you can use a mirror:
```bash
export HF_ENDPOINT="https://hf-mirror.com"
```
Or by `wget`:
```bash
wget https://huggingface.co/IndexTeam/IndexTTS-1.5/resolve/main/bigvgan_discriminator.pth -P checkpoints
wget https://huggingface.co/IndexTeam/IndexTTS-1.5/resolve/main/bigvgan_generator.pth -P checkpoints
wget https://huggingface.co/IndexTeam/IndexTTS-1.5/resolve/main/bpe.model -P checkpoints
wget https://huggingface.co/IndexTeam/IndexTTS-1.5/resolve/main/dvae.pth -P checkpoints
wget https://huggingface.co/IndexTeam/IndexTTS-1.5/resolve/main/gpt.pth -P checkpoints
wget https://huggingface.co/IndexTeam/IndexTTS-1.5/resolve/main/unigram_12000.vocab -P checkpoints
wget https://huggingface.co/IndexTeam/IndexTTS-1.5/resolve/main/config.yaml -P checkpoints
```
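Before moving on, it can help to confirm that every checkpoint file actually arrived. The following is a minimal sketch, not part of the repository: the file names are copied from the download commands above, while `EXPECTED_FILES` and `missing_checkpoints` are hypothetical names introduced here for illustration.

```python
from pathlib import Path

# File names taken from the download commands above.
EXPECTED_FILES = [
    "config.yaml",
    "bigvgan_discriminator.pth",
    "bigvgan_generator.pth",
    "bpe.model",
    "dvae.pth",
    "gpt.pth",
    "unigram_12000.vocab",
]


def missing_checkpoints(model_dir="checkpoints"):
    """Return the expected files that are absent or empty in model_dir."""
    root = Path(model_dir)
    return [
        name
        for name in EXPECTED_FILES
        if not (root / name).is_file() or (root / name).stat().st_size == 0
    ]


if __name__ == "__main__":
    missing = missing_checkpoints()
    print("missing:", missing if missing else "none")
```

An empty result only means the files exist and are non-empty; it does not verify their contents.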
4. Run test script:
Do a quick test run:
```python
from indextts.infer_indextts2 import IndexTTS2
tts = IndexTTS2(cfg_path="checkpoints/config.yaml", model_dir="checkpoints", is_fp16=False, use_cuda_kernel=False)
text="这是一个有很好情感表现力的自回归语音生成大模型,它还可以控制合成语音的时长,希望能受到大家的喜欢。"
tts.infer(spk_audio_prompt='test_data/input.wav', text=text, output_path="gen.wav", verbose=True)
```
Specify an additional emotional reference audio:
```python
from indextts.infer_indextts2 import IndexTTS2
tts = IndexTTS2(cfg_path="checkpoints/config.yaml", model_dir="checkpoints", is_fp16=False, use_cuda_kernel=False)
text="这是一个有很好情感表现力的自回归语音生成大模型,它还可以控制合成语音的时长,希望能受到大家的喜欢。"
tts.infer(spk_audio_prompt='test_data/input.wav', text=text, output_path="gen.wav", emo_audio_prompt="test_data/low.wav", verbose=True)
```
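When an emotional reference audio is given, you can also set `emo_alpha`, which controls how strongly the emotional reference audio is followed (default 1.0):
```python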
from indextts.infer_indextts2 import IndexTTS2
tts = IndexTTS2(cfg_path="checkpoints/config.yaml", model_dir="checkpoints", is_fp16=False, use_cuda_kernel=False)
text="这是一个有很好情感表现力的自回归语音生成大模型,它还可以控制合成语音的时长,希望能受到大家的喜欢。"
tts.infer(spk_audio_prompt='test_data/input.wav', text=text, output_path="gen.wav", emo_audio_prompt="test_data/low.wav", emo_alpha=0.5, verbose=True)
```
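Alternatively, omit the emotional reference audio and pass the intensity of each basic emotion (happy | angry | sad | fearful | disgusted | melancholic | surprised | calm) as a list of 8 floats:
```python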
from indextts.infer_indextts2 import IndexTTS2
tts = IndexTTS2(cfg_path="checkpoints/config.yaml", model_dir="checkpoints", is_fp16=False, use_cuda_kernel=False)
text="这是一个有很好情感表现力的自回归语音生成大模型,它还可以控制合成语音的时长,希望能受到大家的喜欢。"
tts.infer(spk_audio_prompt='test_data/input.wav', text=text, output_path="gen.wav", emo_vector=[0, 1.0, 0, 0, 0, 0, 0, 0], verbose=True)
```
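Writing the 8-element list by hand makes it easy to put a value in the wrong slot. The sketch below builds the list from named intensities. It is an illustration only: `make_emo_vector` and the English emotion names are hypothetical (the names are translations of the labels listed above), and the order is assumed to match that list, so check it against the model's own definition before relying on it.

```python
# Assumed order, following the emotion list in the text above.
EMOTIONS = [
    "happy", "angry", "sad", "fearful",
    "disgusted", "melancholic", "surprised", "calm",
]


def make_emo_vector(**intensities):
    """Build the 8-float list for emo_vector from named intensities."""
    unknown = set(intensities) - set(EMOTIONS)
    if unknown:
        raise ValueError(f"unknown emotion(s): {sorted(unknown)}")
    # Emotions that are not mentioned default to 0.0.
    return [float(intensities.get(name, 0.0)) for name in EMOTIONS]


emo_vector = make_emo_vector(angry=1.0)
print(emo_vector)  # [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

The result can be passed as `emo_vector=...` to `tts.infer` in the same way as the literal list in the example above.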
A text description of the emotion can also guide synthesis; enable it with the `use_emo_text` parameter:
```python
from indextts.infer_indextts2 import IndexTTS2
tts = IndexTTS2(cfg_path="checkpoints/config.yaml", model_dir="checkpoints", is_fp16=False, use_cuda_kernel=False)
text="这是一个有很好情感表现力的自回归语音生成大模型,它还可以控制合成语音的时长,希望能受到大家的喜欢。"
tts.infer(spk_audio_prompt='test_data/input.wav', text=text, output_path="gen.wav", use_emo_text=True, verbose=True)
```
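When `emo_text` is not specified, the emotion is inferred from the text being synthesized; when it is specified, the emotion is inferred from the given `emo_text`:
```python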
from indextts.infer_indextts2 import IndexTTS2
tts = IndexTTS2(cfg_path="checkpoints/config.yaml", model_dir="checkpoints", is_fp16=False, use_cuda_kernel=False)
text="这是一个有很好情感表现力的自回归语音生成大模型,它还可以控制合成语音的时长,希望能受到大家的喜欢。"
tts.infer(spk_audio_prompt='test_data/input.wav', text=text, output_path="gen.wav", use_emo_text=True, emo_text='有一丢丢伤心', verbose=True)
```
Specify the duration of the synthesized speech:
```python
from indextts.infer_indextts2 import IndexTTS2
tts = IndexTTS2(cfg_path="checkpoints/config.yaml", model_dir="checkpoints", is_fp16=False, use_cuda_kernel=False)
text="这是一个有很好情感表现力的自回归语音生成大模型,它还可以控制合成语音的时长,希望能受到大家的喜欢。"
tts.infer(spk_audio_prompt='test_data/input.wav', text=text, output_path="gen.wav", use_speed=True, target_dur=7.5, verbose=True)
```
5. Use as command line tool:
```bash
# Make sure pytorch has been installed before running this command
pip install -e .
indextts "大家好我现在正在bilibili 体验 ai 科技说实话来之前我绝对想不到AI技术已经发展到这样匪夷所思的地步了" \
--voice reference_voice.wav \
--model_dir checkpoints \
--config checkpoints/config.yaml \
--output output.wav
```
Use `--help` to see more options.
```bash
indextts --help
```
#### Web Demo
```bash
pip install -e ".[webui]"
python webui.py
# use another model version:
python webui.py --model_dir IndexTTS-1.5
```
Open your browser and visit `http://127.0.0.1:7860` to see the demo.
#### Note for Windows Users
On Windows, you may encounter [an error](https://github.com/index-tts/index-tts/issues/61) when installing `pynini`:
`ERROR: Failed building wheel for pynini`
In this case, please install `pynini` via `conda`:
```bash
# after `conda activate indextts2`
conda install -c conda-forge pynini==2.1.5
pip install WeTextProcessing==1.0.3
pip install -e ".[webui]"
```
#### Sample Code
```python
from indextts.infer import IndexTTS

# Load the model from the downloaded checkpoints.
tts = IndexTTS(model_dir="checkpoints", cfg_path="checkpoints/config.yaml")
voice = "reference_voice.wav"  # reference audio of the voice to clone
text = "大家好我现在正在bilibili 体验 ai 科技说实话来之前我绝对想不到AI技术已经发展到这样匪夷所思的地步了比如说现在正在说话的其实是B站为我现场复刻的数字分身简直就是平行宇宙的另一个我了。如果大家也想体验更多深入的AIGC功能可以访问 bilibili studio相信我你们也会吃惊的。"
output_path = "output.wav"  # where the generated audio is written
tts.infer(voice, text, output_path)
```
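To synthesize several sentences with the same reference voice, the call above can be wrapped in a small loop. This is a sketch under assumptions: `synthesize_batch` and the `utt_NNN.wav` naming are hypothetical, and the only thing it relies on is the `tts.infer(voice, text, output_path)` call shown in the sample code.

```python
from pathlib import Path


def synthesize_batch(tts, voice, texts, out_dir="outputs"):
    """Call tts.infer(voice, text, output_path) once per text; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, text in enumerate(texts):
        path = str(out / f"utt_{i:03d}.wav")  # hypothetical naming scheme
        tts.infer(voice, text, path)
        paths.append(path)
    return paths


# Usage, assuming the model has been loaded as in the sample code:
#   from indextts.infer import IndexTTS
#   tts = IndexTTS(model_dir="checkpoints", cfg_path="checkpoints/config.yaml")
#   synthesize_batch(tts, "reference_voice.wav", ["first sentence", "second sentence"])
```

Loading the model once and reusing the `tts` object avoids paying the checkpoint loading cost for every sentence.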
## 👉🏻 IndexTTS 👈🏻
### IndexTTS2: [[Paper]](https://arxiv.org/abs/2506.21619); [[Demo]](https://index-tts.github.io/index-tts2.github.io/); [[ModelScope]](); [[HuggingFace]]()
### IndexTTS1: [[Paper]](https://arxiv.org/abs/2502.05512); [[Demo]](https://index-tts.github.io/); [[ModelScope]](https://huggingface.co/spaces/IndexTeam/IndexTTS); [[HuggingFace]](https://huggingface.co/spaces/IndexTeam/IndexTTS)
## Acknowledge
1. [tortoise-tts](https://github.com/neonbjb/tortoise-tts)
2. [XTTSv2](https://github.com/coqui-ai/TTS)