diff --git a/README.md b/README.md
index 27421db..5cc592c 100644
--- a/README.md
+++ b/README.md
@@ -1,47 +1,104 @@
 ---
-license: Apache License 2.0
+license: other
+license_name: flux-1-dev-non-commercial-license
+license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
-#model-type:
-##e.g. gpt, phi, llama, chatglm, baichuan, etc.
-#- gpt
+language:
+  - en
+library_name: diffusers
+pipeline_tag: text-to-image
-#domain:
-##e.g. nlp, cv, audio, multi-modal
-#- nlp
-
-#language:
-##list of language codes: https://help.aliyun.com/document_detail/215387.html?spm=a2c4g.11186623.0.0.9f8d7467kni6Aa
-#- cn
-
-#metrics:
-##e.g. CIDEr, BLEU, ROUGE
-#- CIDEr
-
-#tags:
-##custom tags, including training methods such as pretrained, fine-tuned, instruction-tuned, RL-tuned, and others
-#- pretrained
-
-#tools:
-##e.g. vllm, fastchat, llamacpp, AdaSeq
-#- vllm
+tags:
+- Text-to-Image
+- ControlNet
+- Diffusers
+- Flux.1-dev
+- image-generation
+- Stable Diffusion
+base_model: black-forest-labs/FLUX.1-dev
 ---
-### The contributors of this model have not provided a more detailed model introduction. Model files and weights are available on the "Model Files" page.
-#### You can download the model via the following git clone command, or via the ModelScope SDK
-SDK download
-```bash
-#Install ModelScope
-pip install modelscope
-```
+# FLUX.1-dev-ControlNet-Union-Pro-2.0
+
+This repository contains a unified ControlNet for the FLUX.1-dev model, released by [Shakker Labs](https://huggingface.co/Shakker-Labs). We provide an [online demo](https://huggingface.co/spaces/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0).
+
+# Keynotes
+In comparison with [Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro):
+- The mode embedding is removed, resulting in a smaller model.
+- Canny and pose are improved, with better control and aesthetics.
+- Support for soft edge is added; support for tile is removed.
+
+# Model Cards
+- This ControlNet consists of 6 double blocks and 0 single blocks; the mode embedding is removed.
+- We train the model from scratch for 300k steps using a dataset of 20M high-quality general and human images. We train at 512x512 resolution in BFloat16 with batch size 128 and learning rate 2e-5; the guidance is uniformly sampled from [1, 7], and the text drop ratio is set to 0.20.
+- This model supports multiple control modes, including canny, soft edge, depth, pose, and gray. You can use it just like a normal ControlNet.
+- This model can be used jointly with other ControlNets.
+
+# Showcases
+
+![](./images/canny.png)
+![](./images/softedge.png)
+![](./images/depth.png)
+![](./images/pose.png)
+![](./images/gray.png)
-
-### If you are a contributor to this model, we invite you to complete the model card content promptly, following the model contribution documentation.
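As the Model Cards section above notes, this checkpoint can be used like a normal ControlNet. A minimal sketch with the diffusers `FluxControlNetPipeline`, using the canny condition image and the canny parameters recommended below; the prompt, step count, and `guidance_scale` are illustrative assumptions, not values from this repository:

```python
# Sketch: load this checkpoint as a standard diffusers ControlNet for FLUX.1-dev.
# The prompt, num_inference_steps, and guidance_scale are illustrative assumptions.
import torch
from diffusers import FluxControlNetModel, FluxControlNetPipeline
from diffusers.utils import load_image

controlnet = FluxControlNetModel.from_pretrained(
    "Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0",
    torch_dtype=torch.bfloat16,
)
pipe = FluxControlNetPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    controlnet=controlnet,
    torch_dtype=torch.bfloat16,
).to("cuda")

control_image = load_image("./conds/canny.png")  # canny condition from this repo
image = pipe(
    prompt="a portrait photo of a woman",  # illustrative prompt
    control_image=control_image,
    controlnet_conditioning_scale=0.7,  # recommended value for canny
    control_guidance_end=0.8,           # recommended value for canny
    width=control_image.size[0],
    height=control_image.size[1],
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("result.png")
```

Passing a list of ControlNets (via `FluxMultiControlNetModel`) and a matching list of control images follows the same pattern when combining conditions.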
\ No newline at end of file
+# Recommended Parameters
+You can adjust `controlnet_conditioning_scale` and `control_guidance_end` for stronger control and better detail preservation. For better stability, we suggest using multiple conditions.
+- Canny: use cv2.Canny, controlnet_conditioning_scale=0.7, control_guidance_end=0.8.
+- Soft Edge: use [AnylineDetector](https://github.com/huggingface/controlnet_aux), controlnet_conditioning_scale=0.7, control_guidance_end=0.8.
+- Depth: use [depth-anything](https://github.com/DepthAnything/Depth-Anything-V2), controlnet_conditioning_scale=0.8, control_guidance_end=0.8.
+- Pose: use [DWPose](https://github.com/IDEA-Research/DWPose/tree/onnx), controlnet_conditioning_scale=0.9, control_guidance_end=0.65.
+- Gray: use cv2.cvtColor, controlnet_conditioning_scale=0.9, control_guidance_end=0.8.
+
+# Resources
+- [InstantX/FLUX.1-dev-IP-Adapter](https://huggingface.co/InstantX/FLUX.1-dev-IP-Adapter)
+- [InstantX/FLUX.1-dev-Controlnet-Canny](https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Canny)
+- [Shakker-Labs/FLUX.1-dev-ControlNet-Depth](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Depth)
+- [Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro)
+
+# Acknowledgements
+This model is developed by [Shakker Labs](https://huggingface.co/Shakker-Labs). The original idea is inspired by [xinsir/controlnet-union-sdxl-1.0](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0). All rights reserved.
diff --git a/conds/canny.png b/conds/canny.png
new file mode 100644
index 0000000..8b81e65
Binary files /dev/null and b/conds/canny.png differ
diff --git a/config.json b/config.json
new file mode 100644
index 0000000..6eaa501
--- /dev/null
+++ b/config.json
@@ -0,0 +1,19 @@
+{
+  "_class_name": "FluxControlNetModel",
+  "_diffusers_version": "0.31.0.dev0",
+  "attention_head_dim": 128,
+  "axes_dims_rope": [
+    16,
+    56,
+    56
+  ],
+  "guidance_embeds": true,
+  "in_channels": 64,
+  "joint_attention_dim": 4096,
+  "num_attention_heads": 24,
+  "num_layers": 6,
+  "num_mode": null,
+  "num_single_layers": 0,
+  "patch_size": 1,
+  "pooled_projection_dim": 768
+}
diff --git a/configuration.json b/configuration.json
new file mode 100644
index 0000000..b5f6ce7
--- /dev/null
+++ b/configuration.json
@@ -0,0 +1 @@
+{"framework": "pytorch", "task": "text-to-image", "allow_remote": true}
\ No newline at end of file
diff --git a/diffusion_pytorch_model.safetensors b/diffusion_pytorch_model.safetensors
new file mode 100644
index 0000000..78feaff
--- /dev/null
+++ b/diffusion_pytorch_model.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:00e83e5328bbdaa4d0525ac549353a5088fe4512fdbec4722226ac5d66817a72
+size 135
diff --git a/images/canny.png b/images/canny.png
new file mode 100644
index 0000000..71226c8
Binary files /dev/null and b/images/canny.png differ
diff --git a/images/depth.png b/images/depth.png
new file mode 100644
index 0000000..b1ac36d
Binary files /dev/null and b/images/depth.png differ
diff --git a/images/gray.png b/images/gray.png
new file mode 100644
index 0000000..4ef2a09
Binary files /dev/null and b/images/gray.png differ
diff --git a/images/pose.png b/images/pose.png
new file mode 100644
index 0000000..d09a42f
Binary files /dev/null and b/images/pose.png differ
diff --git a/images/softedge.png b/images/softedge.png
new file mode 100644
index 0000000..095690a
Binary files /dev/null and b/images/softedge.png differ
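The `config.json` added above can be sanity-checked against the architecture claims in the model card (6 double blocks, 0 single blocks, no mode embedding); a small self-contained check over the relevant fields:

```python
import json

# The architecture-relevant fields of config.json as added in this diff.
config = json.loads("""
{
  "_class_name": "FluxControlNetModel",
  "num_layers": 6,
  "num_single_layers": 0,
  "num_mode": null,
  "num_attention_heads": 24,
  "attention_head_dim": 128
}
""")

# 6 double blocks, 0 single blocks, no mode embedding ("num_mode": null).
assert config["num_layers"] == 6
assert config["num_single_layers"] == 0
assert config["num_mode"] is None

# Inner transformer width implied by the attention config: 24 * 128.
hidden_size = config["num_attention_heads"] * config["attention_head_dim"]
print(hidden_size)  # 3072
```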