---
license: other
license_name: flux-1-dev-non-commercial-license
license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
language:
- en
base_model: black-forest-labs/FLUX.1-dev
library_name: diffusers
tags:
- Text-to-Image
- FLUX
- Stable Diffusion
pipeline_tag: text-to-image
---
<div style="display: flex; justify-content: center; align-items: center;">
<img src="./images/images_alibaba.png" alt="alibaba" style="width: 20%; height: auto; margin-right: 5%;">
<img src="./images/images_alimama.png" alt="alimama" style="width: 20%; height: auto;">
</div>
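This repository contains an 8-step distilled version of the [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) model, developed by the Alimama Creative Team.
# Introduction
This model is an 8-step distilled LoRA based on the FLUX.1-dev model. We use a specially designed discriminator to improve distillation quality. The model can be used for T2I, Inpainting ControlNet, and other FLUX-related models. We recommend guidance_scale=3.5 and lora_scale=1. Versions with fewer steps will be released later.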
- Text-to-Image.
![](./images/T2I.png)
- With [alimama-creative/FLUX.1-dev-Controlnet-Inpainting-Beta](https://huggingface.co/alimama-creative/FLUX.1-dev-Controlnet-Inpainting-Beta). Our model adapts well to the Inpainting ControlNet and produces results close to the original output.
![](./images/inpaint.png)
# Usage Guide
## diffusers
This model can be used directly with diffusers:
```python
import torch
from diffusers import FluxPipeline

model_id = "black-forest-labs/FLUX.1-dev"
adapter_id = "alimama-creative/FLUX.1-Turbo-Alpha"

# Load the base FLUX.1-dev pipeline in bfloat16.
pipe = FluxPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Load the 8-step Turbo LoRA and fuse it into the base weights.
pipe.load_lora_weights(adapter_id)
pipe.fuse_lora()

prompt = "A DSLR photo of a shiny VW van that has a cityscape painted on it. A smiling sloth stands on grass in front of the van and is wearing a leather jacket, a cowboy hat, a kilt and a bowtie. The sloth is holding a quarterstaff and a big book."

# Generate with 8 inference steps and the recommended guidance scale.
image = pipe(
    prompt=prompt,
    guidance_scale=3.5,
    height=1024,
    width=1024,
    num_inference_steps=8,
    max_sequence_length=512,
).images[0]
```
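The example above fuses the LoRA at its default strength. As a minimal sketch of how the recommended lora_scale=1 could be made explicit instead, the helper below collects the recommended settings into keyword arguments for the pipeline call. It assumes the LoRA has been loaded with `load_lora_weights` but not fused, and that the LoRA strength is passed through the `scale` key of `joint_attention_kwargs`; the helper name `turbo_call_kwargs` is hypothetical and not part of this repository.

```python
def turbo_call_kwargs(prompt, lora_scale=1.0, guidance_scale=3.5, steps=8):
    """Collect the recommended settings as kwargs for a FluxPipeline call.

    Hypothetical helper for illustration only.
    """
    return {
        "prompt": prompt,
        "guidance_scale": guidance_scale,
        "num_inference_steps": steps,
        "height": 1024,
        "width": 1024,
        "max_sequence_length": 512,
        # Assumption: the LoRA is loaded but NOT fused, so its strength
        # is controlled at call time through the "scale" key.
        "joint_attention_kwargs": {"scale": lora_scale},
    }


kwargs = turbo_call_kwargs("a photo of a cat")

# Usage, assuming `pipe` was built as above without calling pipe.fuse_lora():
# image = pipe(**kwargs).images[0]
```

Keeping the LoRA unfused makes it possible to try other strengths without reloading the pipeline, although the recommended value remains 1.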
## comfyui
- Text-to-image acceleration workflow: [click here](./workflows/t2I_flux_turbo.json)
- Inpainting ControlNet acceleration workflow: [click here](./workflows/alimama_flux_inpainting_turbo_8step.json)
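# Training Details
The model was trained on 1M images from public datasets and internal sources, all with an aesthetic score of 6.3 or higher and a resolution greater than 800. We use adversarial training to improve quality: the original FLUX.1-dev transformer is frozen and serves as the discriminator's feature extractor, and a discriminator head network is added at each transformer layer. During training, we fix the guidance scale at 3.5 and use a time shift of 3.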
- Mixed precision: bf16
- Learning rate: 2e-5
- Batch size: 64
- Training resolution: 1024x1024
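
The training notes mention a time shift of 3 without giving a formula. The following is a minimal sketch, assuming it refers to the static timestep shift used by SD3-style flow-matching schedulers, where a noise level `sigma` in [0, 1] is remapped to `shift * sigma / (1 + (shift - 1) * sigma)`. The function name `shift_sigma` is hypothetical, and the source does not confirm that this exact form was used in training.

```python
def shift_sigma(sigma: float, shift: float = 3.0) -> float:
    """Remap a noise level in [0, 1] with a static time shift.

    Assumed form: shift * sigma / (1 + (shift - 1) * sigma).
    """
    if not 0.0 <= sigma <= 1.0:
        raise ValueError("sigma must lie in [0, 1]")
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


# An evenly spaced 8-step schedule from high noise to low noise.
sigmas = [i / 8 for i in range(8, 0, -1)]

# With shift > 1, every intermediate level moves toward higher noise,
# while the endpoints 0 and 1 stay fixed.
shifted = [shift_sigma(s) for s in sigmas]
```

Under this assumption, a shift of 3 maps the midpoint 0.5 to 0.75, so more of the schedule is spent at high noise levels.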