FLUX Aesthetics Enhancement LoRA

Introduction

This is a LoRA model trained for FLUX.1-dev, which enhances the aesthetic quality of images generated by the model. The improvements include, but are not limited to: rich details, beautiful lighting and shadows, aesthetic composition, and clear visuals. This model does not require any trigger words.

Methodology

[Figure: ArtAug workflow]

The ArtAug project is inspired by reasoning approaches like GPT-o1, which rely on model interaction and self-correction. We developed a framework aimed at enhancing the capabilities of image generation models through interaction with image understanding models. The training process of ArtAug consists of the following steps:

  1. Synthesis-Understanding Interaction: After generating an image using the image generation model, we employ a multimodal large language model (Qwen2-VL-72B) to analyze the image content and provide suggestions for modifications, which then lead to the regeneration of a higher quality image.

  2. Data Generation and Filtering: Interactive generation involves long inference times and sometimes produces poor image content. Therefore, we generate a large batch of image pairs offline, filter them, and use them for subsequent training.

  3. Differential Training: We apply differential training techniques to train a LoRA model, enabling it to learn the differences between images before and after enhancement, rather than directly training on the dataset of enhanced images.

  4. Iterative Enhancement: The trained LoRA model is fused into the base model, and the entire process is repeated multiple times with the fused model until the interaction algorithm no longer provides significant enhancements. The LoRA models produced in each iteration are combined to produce this final model.
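Steps 1 and 2 above can be pictured as a small offline loop. The following is only a sketch under stated assumptions, not the project's actual pipeline: the callables `generate`, `suggest`, and `score` are hypothetical stand-ins for the image generation model, the multimodal model's suggestions, and whatever quality criterion is used for filtering. It also assumes the suggestion takes the form of a revised prompt, which the description above does not specify.

```python
from typing import Callable, List, Tuple


def build_pairs(
    prompts: List[str],
    generate: Callable[[str], str],      # hypothetical: prompt -> image
    suggest: Callable[[str, str], str],  # hypothetical: (prompt, image) -> revised prompt
    score: Callable[[str], float],       # hypothetical: image -> quality score
    min_gain: float = 0.0,
) -> List[Tuple[str, str, str]]:
    """Collect (prompt, base_image, enhanced_image) triples worth training on."""
    pairs = []
    for prompt in prompts:
        base = generate(prompt)            # initial generation
        revised = suggest(prompt, base)    # understanding model proposes changes
        enhanced = generate(revised)       # regenerate with the suggestion applied
        # Filtering: keep the pair only if the enhancement actually helped.
        if score(enhanced) - score(base) > min_gain:
            pairs.append((prompt, base, enhanced))
    return pairs


# Toy stand-ins so the sketch runs without any model; images are plain strings.
def toy_generate(prompt: str) -> str:
    return f"img({prompt})"


def toy_suggest(prompt: str, image: str) -> str:
    # Pretend the understanding model has nothing to add for sunset prompts.
    return prompt if "sunset" in prompt else prompt + ", soft lighting"


def toy_score(image: str) -> float:
    return float(len(image))


pairs = build_pairs(["a cat", "a sunset"], toy_generate, toy_suggest, toy_score)
```

With these toy stand-ins, the "a sunset" pair is dropped because the regenerated image is no better than the original, which is the role the filtering step plays.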

This model integrates the aesthetic understanding of Qwen2-VL-72B into FLUX.1-dev, leading to an improvement in the quality of generated images.
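Step 4 of the training process fuses each LoRA into the base model and later combines the per-iteration LoRAs. The sketch below illustrates one way this can work on a single weight matrix, assuming the standard LoRA parameterization (weight update = B @ A) and assuming the LoRAs are combined by concatenating their factors along the rank dimension. The source does not say how the combination is actually done, so treat the `combine` function as an illustration only. Plain Python lists are used to keep the example dependency-free.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def fuse(weight, lora_b, lora_a, scale=1.0):
    """Return weight + scale * (lora_b @ lora_a), i.e. the LoRA merged into the weight."""
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(rw, rd)] for rw, rd in zip(weight, delta)]


def combine(loras):
    """Merge several (B, A) pairs into one pair by concatenating along the rank axis.

    The combined B @ A equals the sum of the individual B_k @ A_k products.
    """
    n_rows = len(loras[0][0])
    b_cat = [sum((b[i] for b, _ in loras), []) for i in range(n_rows)]
    a_cat = [row for _, a in loras for row in a]
    return b_cat, a_cat


# A 2x2 base weight and two rank-1 LoRAs, as if produced by two iterations.
base = [[1, 0], [0, 1]]
lora_1 = ([[1], [2]], [[1, 0]])  # (B1, A1)
lora_2 = ([[0], [1]], [[0, 3]])  # (B2, A2)

# Iterative route: fuse LoRA 1, then fuse LoRA 2 into the already-fused weight.
iterative = fuse(fuse(base, *lora_1), *lora_2)

# Combined route: build one rank-2 LoRA and fuse it into the original base once.
combined = fuse(base, *combine([lora_1, lora_2]))
```

Both routes yield the same fused weight here, which is why a single combined LoRA can stand in for the chain of per-iteration LoRAs when it is applied to the original base model.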

Usage

Please see ./artaug_flux.py for more details.

Since this model is encapsulated in the universal FLUX LoRA format, it can be loaded by most LoRA loaders, allowing you to integrate this LoRA model into your own workflow.
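As one illustration of loading the LoRA outside the provided script, here is a minimal sketch that assumes Hugging Face `diffusers` as the LoRA loader. This is not necessarily how `./artaug_flux.py` does it. The base model ID `black-forest-labs/FLUX.1-dev`, the directory and file names for the LoRA weights, and the sampling settings are all assumptions rather than values taken from this README.

```python
def generate_with_artaug(
    prompt,
    lora_dir,                                 # hypothetical: folder holding the LoRA file
    weight_name="artaug_lora.safetensors",    # hypothetical file name
    base_model="black-forest-labs/FLUX.1-dev",  # assumed Hugging Face Hub ID
    seed=0,
):
    """Generate one image with the base model plus the aesthetics LoRA."""
    # Imported lazily so the function can be defined without a GPU stack installed.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(base_model, torch_dtype=torch.bfloat16)
    # No trigger words are needed; loading the weights is enough.
    pipe.load_lora_weights(lora_dir, weight_name=weight_name)
    # Offload idle submodules to CPU to reduce GPU memory use (requires accelerate).
    pipe.enable_model_cpu_offload()

    generator = torch.Generator("cpu").manual_seed(seed)
    result = pipe(
        prompt,
        num_inference_steps=30,  # assumed setting
        guidance_scale=3.5,      # assumed setting
        generator=generator,
    )
    return result.images[0]


# Example call (downloads the base model, so it is left commented out):
# image = generate_with_artaug("a lighthouse at dusk", "path/to/lora_dir")
# image.save("lighthouse.png")
```

Running the same prompt and seed with and without the `load_lora_weights` call gives a like-for-like comparison of the base model against the enhanced one.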

Examples

| FLUX.1-dev | FLUX.1-dev + ArtAug LoRA |
| --- | --- |
| image_1_base | image_1_enhance |
| image_2_base | image_2_enhance |
| image_3_base | image_3_enhance |
| image_4_base | image_4_enhance |
| image_5_base | image_5_enhance |
| image_6_base | image_6_enhance |