Model Types
Invoke uses two broad types of models, Interpretive and Rendering, each with its own strengths and weaknesses.
| | Interpretive Models | Rendering Models |
|---|---|---|
| How to think about them | Low learning curve, low to medium creative control | High learning curve, high creative control |
| What they do | Parse natural-language prompts and follow reference images. | Focus on pixel-level synthesis with fine-grained control. |
| When to use | You’d rather “say” what you want than draw or composite it. | You need surgical control over style, composition, detail, etc. |
| Typical output | Good directional drafts that match your prompts or reference images. | Highly controllable work tuned via prompt tags, control layers, and custom-trained models. |
| Model families | ChatGPT-4o, FLUX Kontext, Imagen 3, Imagen 4 | Any Stable Diffusion XL (SDXL) model (e.g. JuggernautXL), any Stable Diffusion 1.5 (SD 1.5) model, any FLUX.1 (dev) model |
| Model | Type | Control Level | Learning Curve | Best For | Prompt Style | Key Strengths | Limitations |
|---|---|---|---|---|---|---|---|
| ChatGPT-4o (API) | Interpretive | Low-Medium | Low | Directional drafts, natural-language instructions, text | Instruction + Long-form | Intuitive prompting, follows complex instructions | Limited pixel-level control |
| FLUX Kontext (API) | Interpretive | Low-Medium | Low | Quick iterations, prompt-based transformation | Instruction | Fast, good prompt adherence | Less creative control |
| Imagen 3 (API) | Interpretive | Medium | Low-Medium | High-quality photorealistic outputs | Long-form | Good image quality, natural-language understanding | Limited editing capabilities |
| Imagen 4 (API) | Interpretive | Medium | Low-Medium | Latest-generation photorealism | Long-form | Cutting-edge quality, improved prompt handling | Limited editing capabilities |
| JuggernautXL (SDXL) | Rendering | High | High | Detailed creative work, style control | Prompt tags | Fine-grained control, extensive customization, established ecosystem | Full control involves a learning curve |
| SD 1.5 Models | Rendering | High | High | Extremely efficient styling and rendering work, customization, specialized tasks | Prompt tags | Extensive customization, established ecosystem | Requires technical knowledge |
| FLUX.1 (dev) | Rendering | High | High | Professional quality, prompt adherence, high-quality customization | Long-form | Developer-focused, high precision | Technical complexity |
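The trade-off in the table above can be summed up in a tiny sketch. This is purely illustrative: `choose_model_type` is a hypothetical helper, not part of Invoke, and it only encodes the control-vs-learning-curve distinction described in this article.

```python
# Illustrative only: encodes the Interpretive vs Rendering decision
# from the table above. Not an Invoke API.
def choose_model_type(wants_pixel_control: bool, comfortable_with_tags: bool) -> str:
    """Suggest a model family based on desired control and prompting comfort."""
    if wants_pixel_control and comfortable_with_tags:
        # High control, high learning curve: SDXL, SD 1.5, FLUX.1 (dev)
        return "Rendering (e.g. SDXL, SD 1.5, FLUX.1 dev)"
    # Low learning curve, natural-language prompting
    return "Interpretive (e.g. ChatGPT-4o, FLUX Kontext, Imagen)"
```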
How to write prompts for different models
| Prompt style | What it looks like | Best for |
|---|---|---|
| Instruction | “Generate a neon-lit logo”; “Replace the cobblestone street with flat stones”; “Add a misty fog to the scene” | ChatGPT-4o, FLUX Kontext |
| Long-form | “A cinematic wide-angle shot of a misty rainforest at dawn with soft volumetric light. The rainforest is filled with vibrant, diverse flora and fauna.” | Imagen, FLUX Dev, ChatGPT-4o, FLUX Kontext |
| Prompt tags | “ultra-wide, 32 mm, concept art, vaporwave palette, award-winning” | SD 1.5, SDXL (e.g. JuggernautXL) |
Pro tip: Negative prompts (“deformed, blurry, watermark”) work best on Rendering models.
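A quick sketch of how tag-style prompting fits together. This is a hypothetical helper for illustration, not an Invoke API: the point is that with Rendering models the positive prompt is a comma-separated tag list, and the negative prompt travels alongside it as a separate string.

```python
# Hypothetical helper, not part of Invoke: assembles a tag-style prompt
# of the kind SD 1.5 / SDXL models expect.
def build_tag_prompt(subject: str, tags: list[str]) -> str:
    """Join a subject with comma-separated style tags."""
    return ", ".join([subject, *tags])

positive = build_tag_prompt(
    "misty rainforest at dawn",
    ["ultra-wide", "32 mm", "concept art", "vaporwave palette"],
)
# Negative prompts are supplied as a separate string to Rendering models.
negative = "deformed, blurry, watermark"
```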
Recommendations for getting started
- Brand-new to AI? Start with Interpretive Models like Imagen or ChatGPT-4o before learning Rendering Models like FLUX or SDXL.
- Looking to change something small about an image with text guidance? Choose FLUX Kontext, upload an image as a Global Reference Image, and add a short instruction to your prompt field.
- Looking to master what the pros use? Watch our YouTube series and explore control layers and inpainting techniques with FLUX Dev and SDXL models.
Troubleshooting and tips
| Symptom | Likely Cause | Quick Fix |
|---|---|---|
| I put in a text prompt but the image comes out kind of janky (weird faces, weird hands, etc.). | Rendering models are weaker at generating “error-free” images from text prompts alone. | If you are only using text prompts, try an Interpretive model. |
| I’m giving directions to change one part, but it’s changing the whole image. | Some models cannot combine instruction prompts with targeted guidance. | Either use an Interpretive model that follows instructions (like FLUX Kontext) with a reference image, or use inpainting and control layers with a Rendering model to limit the edit to a specific region. |
| I can’t get it to generate in the exact style that I want. | The models don’t perfectly understand your style. | Try using a reference image with an Interpretive model like FLUX Kontext. If that isn’t sufficient, you can explore training your own LoRA model in our Model Training app. |
More resources