Using FLUX Models on Invoke

Modified on Wed, 30 Oct at 2:28 PM

Supported Flux.1 base models

  1. Pro - Not supported
    1. Accessed via API only
  2. Dev - Open-weight, guidance-distilled model requiring license for commercial use
    1. Professional Edition: Supported for users with a commercial license
    2. Community Edition: Supported for all users
  3. Schnell - Fastest Flux model designed for local development and personal use
    1. Supported in both Professional and Community Editions


Accessing Flux models

Professional Edition:

  • Flux Dev - We are working to allow users to self-upgrade directly in the Invoke application. For now, users can request commercial access to the Dev model through this form
  • Flux Schnell - Users can use the ‘FLUX Schnell’ model via the ‘Add account models to project’ dropdown on the Model Management tab within Project Settings, or upload their own version of the Schnell model for use
    • Note: Enterprise users may need to have an Account Admin add Flux models to the Enterprise account


Community Edition:

  • Flux Dev - Users can use the ‘FLUX Dev (Quantized)’ model found on the ‘Starter Models’ tab, or upload their own version of the Dev model for use
  • Flux Schnell - Users can use the ‘FLUX Schnell’ or ‘FLUX Schnell (Quantized)’ models found on the ‘Starter Models’ tab within Model Manager, or upload their own version of the Schnell model for use


Troubleshooting errors when uploading Flux models

Right now, there’s a wide range of formats in use by model trainers and fine-tuners across the ecosystem, and unfortunately there isn’t any clear standardization. Trainers often don’t specify the format they use, which can cause issues when uploading models.


We’ve chosen to support the most commonly used format variants, but these are not well labeled on the sites that host Flux LoRAs, so it’s hard to give definitive guidance on which ones work and which don’t yet. In general, you can understand current model support through the following rules:

  • Models with full (non-quantized) model weights (float8, float16, bfloat16, float32) should work
  • bitsandbytes NF4 quantized models should work
  • GGUF quantized models should work
  • Most LoRA models trained with diffusers or kohya should work. Please report variants to [email protected] and we'll work on adding support
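As a rough illustration of how the LoRA format families differ, the sketch below guesses a checkpoint’s format from the prefixes of its tensor key names. The prefix conventions shown (‘lora_unet_’/‘lora_te’ for kohya-style, ‘transformer.’ with ‘.lora_A’/‘.lora_B’ for diffusers-style) are common community conventions, not an official specification, and real checkpoints vary:

```python
def guess_lora_format(keys):
    """Heuristic guess at a FLUX LoRA checkpoint's format from its tensor key
    names. The prefix conventions are assumptions based on common community
    checkpoints, not an official spec."""
    if any(k.startswith(("lora_unet_", "lora_te_", "lora_te1_")) for k in keys):
        return "kohya"
    if any(k.startswith("transformer.") and (".lora_A." in k or ".lora_B." in k)
           for k in keys):
        return "diffusers"
    return "unknown"

# Example key names in each style (illustrative, not taken from a real file):
kohya_keys = ["lora_unet_double_blocks_0_img_attn_proj.lora_down.weight"]
diffusers_keys = ["transformer.single_transformer_blocks.0.attn.to_q.lora_A.weight"]

print(guess_lora_format(kohya_keys))      # kohya
print(guess_lora_format(diffusers_keys))  # diffusers
```

If neither pattern matches, the checkpoint likely uses one of the less common variants mentioned above, and reporting it helps us add support.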


We’re working on driving that standardization through the Open Model Initiative, but for now, we’re focused on optimizing for the most widely adopted formats. If your model isn’t working, it could be due to the format it’s been trained in. Feel free to reach out if you need more help!
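One way to check what you actually downloaded: most Flux checkpoints ship as .safetensors files, whose JSON header declares each tensor’s dtype and can be read without loading any weights. A minimal sketch (not part of Invoke) that lists the dtypes in a file:

```python
import json
import struct

def safetensors_dtypes(path):
    """List the dtypes declared in a .safetensors file's JSON header without
    loading tensor data. The layout (an 8-byte little-endian header length
    followed by a JSON dict) is the published safetensors format."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # Skip the optional "__metadata__" entry; the rest map tensor names
    # to {"dtype": ..., "shape": ..., "data_offsets": ...}.
    return {v["dtype"] for k, v in header.items() if k != "__metadata__"}
```

For example, a result of {'F8_E4M3'} indicates fp8-cast weights, while {'BF16'} indicates full bf16 weights.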


Troubleshooting slow generation speeds for Flux

A common cause of slowness is unnecessary offloading of large models from VRAM / RAM. To avoid unnecessary model offloads, make sure that the ram and vram settings in ${INVOKEAI_ROOT}/invokeai.yaml are properly configured for your system.


Example configuration:

# In ${INVOKEAI_ROOT}/invokeai.yaml

# ...

# ram is the number of GBs of RAM used to keep models warm in memory.
# Set ram to a value slightly below your system RAM capacity. Make sure to
# leave room for other processes and non-model Invoke memory. 24GB could be a
# reasonable starting point on a system with 32GB of RAM.
# If you hit RAM out-of-memory errors, or find that your system RAM is full
# and causing slowness, adjust this value downward.
ram: 24

# vram is the number of GBs of VRAM used to keep models warm on the GPU.
# Set vram to a value slightly below your system VRAM capacity. Leave room for
# non-model VRAM overhead. 20GB is a reasonable starting point on a 24GB GPU.
# If you hit VRAM out-of-memory errors, adjust this value downward.
vram: 20
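The starting values in the comments above amount to subtracting a fixed headroom from total capacity. The helper below is just illustrative arithmetic, not part of Invoke; the default headroom figures are assumptions that mirror the example config (8GB of system RAM and 4GB of VRAM left free):

```python
def suggest_cache_sizes(total_ram_gb, total_vram_gb,
                        ram_headroom_gb=8, vram_headroom_gb=4):
    """Suggest starting values for the ram and vram settings by leaving a
    fixed headroom below total capacity. The headroom defaults are assumptions
    mirroring the example config (32GB RAM -> ram: 24, 24GB VRAM -> vram: 20);
    tune downward if you still hit out-of-memory errors."""
    ram = max(total_ram_gb - ram_headroom_gb, 0)
    vram = max(total_vram_gb - vram_headroom_gb, 0)
    return ram, vram

# A 32GB RAM / 24GB VRAM system reproduces the example config:
print(suggest_cache_sizes(32, 24))  # (24, 20)
```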


Flux model support

Transformer Models

| Model Format | Support Level | Notes |
| --- | --- | --- |
| Official BFL-format weights (fp16, bf16, or fp32) | Supported | |
| BFL-format weights (cast to fp8) | Supported | We cast to fp16, so fp8 offers no memory savings. fp8 formats are not recommended, as they typically perform worse than quantized models of the same size. |
| Quantized bitsandbytes NF4 | Supported | Install the base model via the ‘Starter Models’ list in Invoke |
| Diffusers-format weights (fp16 or fp32) | Not Supported | |
| BFL-format weights with GGUF quantization | Supported | |



T5 Text Encoder Models

| Model Format | Support Level | Notes |
| --- | --- | --- |
| Standard weights (fp16, bf16, or fp32) | Supported | |
| bitsandbytes LLM.int8() | Supported | Install the base model via the ‘Starter Models’ list in Invoke |
| Standard weights (cast to fp8) | Not Supported | |
| Standard weights with GGUF quantization | Supported | |




CLIP Text Encoder Models

| Model Format | Support Level | Notes |
| --- | --- | --- |
| Standard huggingface/transformers-format (fp16, bf16, or fp32) | Supported | |




LoRA Models

| Model Format | Support Level | Notes |
| --- | --- | --- |
| Diffusers LoRA | Supported | |
| Kohya LoRA | Supported | Kohya LoRA transformer models, including text encoder layers, are now fully supported. LyCORIS variants (LoHA, LoKr, etc.) are supported with standard models but remain limited when applied on top of quantized models. Support for text encoder layers in T5 models is not yet available. |
| OneTrainer LoRA | Not Supported | |




