diff --git a/README.md b/README.md
index 27421db..5cc592c 100644
--- a/README.md
+++ b/README.md
@@ -1,47 +1,104 @@
 ---
-license: Apache License 2.0
+license: other
+license_name: flux-1-dev-non-commercial-license
+license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
-#model-type:
-##e.g. gpt, phi, llama, chatglm, baichuan
-#- gpt
+language:
+  - en
+library_name: diffusers
+pipeline_tag: text-to-image
-#domain:
-##e.g. nlp, cv, audio, multi-modal
-#- nlp
-
-#language:
-##Language code list: https://help.aliyun.com/document_detail/215387.html?spm=a2c4g.11186623.0.0.9f8d7467kni6Aa
-#- cn
-
-#metrics:
-##e.g. CIDEr, BLEU, ROUGE
-#- CIDEr
-
-#tags:
-##Custom tags, including training methods such as pretrained, fine-tuned, instruction-tuned, RL-tuned, and others
-#- pretrained
-
-#tools:
-##e.g. vllm, fastchat, llamacpp, AdaSeq
-#- vllm
+tags:
+- Text-to-Image
+- ControlNet
+- Diffusers
+- Flux.1-dev
+- image-generation
+- Stable Diffusion
+base_model: black-forest-labs/FLUX.1-dev
 ---
-### The contributor of this model has not provided a more detailed model description. Model files and weights are available on the "Model Files" page.
-#### You can download the model via the git clone command below, or via the ModelScope SDK
-SDK download
-```bash
-#Install ModelScope
-pip install modelscope
-```
+# FLUX.1-dev-ControlNet-Union-Pro-2.0
+
+This repository contains a unified ControlNet for the FLUX.1-dev model, released by [Shakker Labs](https://huggingface.co/Shakker-Labs). We provide an [online demo](https://huggingface.co/spaces/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0).
+
+# Keynotes
+In comparison with [Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro):
+- The mode embedding is removed, giving a smaller model size.
+- Canny and pose are improved, with better control and aesthetics.
+- Support for soft edge is added; support for tile is removed.
+
+# Model Cards
+- This ControlNet consists of 6 double blocks and 0 single blocks. The mode embedding is removed.
+- We train the model from scratch for 300k steps on a dataset of 20M high-quality general and human images.
We train at a resolution of 512x512 in BFloat16 with batch size 128 and learning rate 2e-5; guidance is uniformly sampled from [1, 7], and the text drop ratio is set to 0.20.
+- This model supports multiple control modes, including canny, soft edge, depth, pose, and gray. You can use it just like a normal ControlNet.
+- This model can be used jointly with other ControlNets.
+
+# Showcases
+
+| canny | softedge | pose | depth | gray |
+|---|---|---|---|---|
+| ![canny](./images/canny.png) | ![softedge](./images/softedge.png) | ![pose](./images/pose.png) | ![depth](./images/depth.png) | ![gray](./images/gray.png) |
+
+# Inference
 ```python
-#Download the model via the SDK
-from modelscope import snapshot_download
-model_dir = snapshot_download('LiblibAI/FLUX.1-dev-ControlNet-Union-Pro-2.0')
-```
-Git download
-```
-#Download the model via Git
-git clone https://www.modelscope.cn/LiblibAI/FLUX.1-dev-ControlNet-Union-Pro-2.0.git
+import torch
+from diffusers.utils import load_image
+from diffusers import FluxControlNetPipeline, FluxControlNetModel
+
+base_model = 'black-forest-labs/FLUX.1-dev'
+controlnet_model_union = 'Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0'
+
+controlnet = FluxControlNetModel.from_pretrained(controlnet_model_union, torch_dtype=torch.bfloat16)
+pipe = FluxControlNetPipeline.from_pretrained(base_model, controlnet=controlnet, torch_dtype=torch.bfloat16)
+pipe.to("cuda")
+
+# replace with other condition images
+control_image = load_image("./conds/canny.png")
+width, height = control_image.size
+
+prompt = "A young girl stands gracefully at the edge of a serene beach, her long, flowing hair gently tousled by the sea breeze. She wears a soft, pastel-colored dress that complements the tranquil blues and greens of the coastal scenery. The golden hues of the setting sun cast a warm glow on her face, highlighting her serene expression. The background features a vast, azure ocean with gentle waves lapping at the shore, surrounded by distant cliffs and a clear, cloudless sky. The composition emphasizes the girl's serene presence amidst the natural beauty, with a balanced blend of warm and cool tones."
+
+image = pipe(
+    prompt,
+    control_image=control_image,
+    width=width,
+    height=height,
+    controlnet_conditioning_scale=0.7,
+    control_guidance_end=0.8,
+    num_inference_steps=30,
+    guidance_scale=3.5,
+    generator=torch.Generator(device="cuda").manual_seed(42),
+).images[0]
 ```
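The example above passes the condition image's own size to the pipeline. FLUX-style pipelines in diffusers generally expect `width` and `height` to be divisible by 16 (the 8x VAE downsampling factor times the 2x latent patch size), so an odd-sized condition image is worth snapping to a valid resolution first. A minimal sketch; the `snap_to_multiple` helper is ours, not part of diffusers:

```python
# Hypothetical helper: FLUX-style pipelines typically want dimensions
# divisible by 16 (8x VAE downsampling * 2x latent patching).
def snap_to_multiple(size: int, multiple: int = 16) -> int:
    """Round `size` down to the nearest positive multiple of `multiple`."""
    return max(multiple, (size // multiple) * multiple)

# e.g. a 1023x769 condition image would be resized to 1008x768
width, height = snap_to_multiple(1023), snap_to_multiple(769)
print(width, height)  # 1008 768
```

After snapping, resize the condition image to `(width, height)` before passing it to the pipeline so the control signal stays aligned with the generated latents.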

-If you are a contributor to this model, we invite you to complete the model card content in accordance with the model contribution documentation.

\ No newline at end of file
+# Recommended Parameters
+You can adjust `controlnet_conditioning_scale` and `control_guidance_end` for stronger control and better detail preservation. For better stability, we suggest using multiple conditions.
+- Canny: use cv2.Canny, controlnet_conditioning_scale=0.7, control_guidance_end=0.8.
+- Soft Edge: use [AnylineDetector](https://github.com/huggingface/controlnet_aux), controlnet_conditioning_scale=0.7, control_guidance_end=0.8.
+- Depth: use [depth-anything](https://github.com/DepthAnything/Depth-Anything-V2), controlnet_conditioning_scale=0.8, control_guidance_end=0.8.
+- Pose: use [DWPose](https://github.com/IDEA-Research/DWPose/tree/onnx), controlnet_conditioning_scale=0.9, control_guidance_end=0.65.
+- Gray: use cv2.cvtColor, controlnet_conditioning_scale=0.9, control_guidance_end=0.8.
+
+# Resources
+- [InstantX/FLUX.1-dev-IP-Adapter](https://huggingface.co/InstantX/FLUX.1-dev-IP-Adapter)
+- [InstantX/FLUX.1-dev-Controlnet-Canny](https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Canny)
+- [Shakker-Labs/FLUX.1-dev-ControlNet-Depth](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Depth)
+- [Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro)
+
+# Acknowledgements
+This model is developed by [Shakker Labs](https://huggingface.co/Shakker-Labs). The original idea is inspired by [xinsir/controlnet-union-sdxl-1.0](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0). All rights reserved.
diff --git a/conds/canny.png b/conds/canny.png
new file mode 100644
index 0000000..8b81e65
Binary files /dev/null and b/conds/canny.png differ
diff --git a/config.json b/config.json
new file mode 100644
index 0000000..6eaa501
--- /dev/null
+++ b/config.json
@@ -0,0 +1,19 @@
+{
+  "_class_name": "FluxControlNetModel",
+  "_diffusers_version": "0.31.0.dev0",
+  "attention_head_dim": 128,
+  "axes_dims_rope": [
+    16,
+    56,
+    56
+  ],
+  "guidance_embeds": true,
+  "in_channels": 64,
+  "joint_attention_dim": 4096,
+  "num_attention_heads": 24,
+  "num_layers": 6,
+  "num_mode": null,
+  "num_single_layers": 0,
+  "patch_size": 1,
+  "pooled_projection_dim": 768
+}
diff --git a/configuration.json b/configuration.json
new file mode 100644
index 0000000..b5f6ce7
--- /dev/null
+++ b/configuration.json
@@ -0,0 +1 @@
+{"framework": "pytorch", "task": "text-to-image", "allow_remote": true}
\ No newline at end of file
diff --git a/diffusion_pytorch_model.safetensors b/diffusion_pytorch_model.safetensors
new file mode 100644
index 0000000..78feaff
--- /dev/null
+++ b/diffusion_pytorch_model.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:00e83e5328bbdaa4d0525ac549353a5088fe4512fdbec4722226ac5d66817a72
+size 135
diff --git a/images/canny.png b/images/canny.png
new file mode 100644
index 0000000..71226c8
Binary files /dev/null and b/images/canny.png differ
diff --git a/images/depth.png b/images/depth.png
new file mode 100644
index 0000000..b1ac36d
Binary files /dev/null and b/images/depth.png differ
diff --git a/images/gray.png b/images/gray.png
new file mode 100644
index 0000000..4ef2a09
Binary files /dev/null and b/images/gray.png differ
diff --git a/images/pose.png b/images/pose.png
new file mode 100644
index 0000000..d09a42f
Binary files /dev/null and b/images/pose.png differ
diff --git a/images/softedge.png b/images/softedge.png
new file mode 100644
index 0000000..095690a
Binary files /dev/null and b/images/softedge.png differ
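The `config.json` added in this commit can be checked programmatically against the model card's claims. A small sketch (the config is inlined here, abridged to the fields discussed above, rather than fetched from the Hub), confirming 6 double blocks, 0 single blocks, and no mode embedding:

```python
import json

# Inlined, abridged copy of the config.json added in this commit.
config = json.loads("""
{
  "_class_name": "FluxControlNetModel",
  "num_layers": 6,
  "num_single_layers": 0,
  "num_mode": null
}
""")

assert config["num_layers"] == 6          # 6 double blocks
assert config["num_single_layers"] == 0   # 0 single blocks
assert config["num_mode"] is None         # mode embedding removed
print("config matches the model card")
```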