Getting Started with Invoke

Modified on Tue, 13 Jun 2023 at 12:00 PM

First, we're excited to welcome you to Invoke AI!

Our tool has grown and evolved from a simple toolkit to a powerful and flexible application with the ability to help you co-create with AI.

Whether or not you’ve used image generation technology before, getting started with a more advanced tool like Invoke AI requires some learning, but we’re here to help.

Our team is made up of experts who have been working in this space since the early days of Stable Diffusion’s release. We’ve crafted this guide to help you understand the basics, see how pro-grade tools differ from others, and generate your first set of images.


We think it is important to understand the basic mechanics of how this technology works, so that you understand the tools you are using.

To train the technology used by Invoke, machines are shown images paired with descriptions. They learn how those images appear as “noise” is progressively added to them.

The computer is then tasked with recreating the image from memory by reversing the “noise”, guided by the original description. This process is called `denoising`. At first the machine fails, but with enough training it gets very good at creating coherent images based only on text descriptions. It becomes so good that it can even do this without needing an original image at all!

This technique is how the machine learns to take a text description of an image (your “prompt”) and create images from it. With a trained model, you can simply pass in a new prompt, and it will try to generate the described image.
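To make the "progressively added noise" idea above concrete, here is a minimal toy sketch. It blends a tiny list of pixel values with random noise at increasing strengths. The function name `add_noise` and the four-value "image" are illustrative assumptions; real models use a specific noise schedule and work on much larger data, so treat this as an analogy rather than Invoke's actual code.

```python
import math
import random


def add_noise(pixels, noise, t):
    """Blend clean pixels with noise; t=0.0 is clean, t=1.0 is pure noise."""
    keep = math.sqrt(1.0 - t)  # how much of the original image survives
    mix = math.sqrt(t)         # how much noise is mixed in
    return [keep * p + mix * n for p, n in zip(pixels, noise)]


rng = random.Random(0)  # fixed seed so the run is repeatable
image = [0.2, 0.8, 0.5, 0.1]                  # hypothetical 4-pixel "image"
noise = [rng.gauss(0.0, 1.0) for _ in image]  # one noise value per pixel

# The same image at five noise levels, from untouched to fully noised.
stages = [add_noise(image, noise, t) for t in (0.0, 0.25, 0.5, 0.75, 1.0)]
for stage in stages:
    print([round(v, 3) for v in stage])
```

During training, the model sees noised versions like these and learns to predict what has to be removed to get back toward the clean image.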


To create the most accurate images, you need to provide the AI with detailed prompts. Just like an artist, the AI benefits from clear instructions about your desired image. You will typically want to include words that effectively describe the aesthetic you're looking to generate, from mood, lighting, and medium to compositional terms like "close-up" or "portrait".
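One way to keep prompts detailed and consistent is to assemble them from named parts. The sketch below is only an illustration: `build_prompt` is a hypothetical helper, not part of Invoke, and joining descriptors with commas is a common convention that is assumed here rather than required.

```python
def build_prompt(subject, medium=None, lighting=None, mood=None, composition=None):
    """Join the descriptive parts of a prompt, skipping any left empty."""
    parts = [composition, subject, medium, lighting, mood]
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    subject="a calm lazy cat",
    medium="oil painting",
    lighting="soft window light",
    mood="serene",
    composition="close-up portrait",
)
print(prompt)
```

Changing one part at a time (for example, only the lighting) makes it easier to see which words actually affect the result.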

Read more in depth about Tips on Crafting Prompts


Depending on the type of images and content that the machine is trained on, it may have a different understanding of what the world looks like. Different models, just like different artists, have a different understanding of the words you are using. A "portrait of a calm lazy cat" may come out totally different, depending on the model selected!

Invoke's FantasyAndArt
XpucT's Deliberate V2
Seek.Art's MEGA V2

Read more in depth about Models here - What is a model? Which should I use?

Generation Settings

CFG Scale: The _CFG Scale_ controls how hard the AI tries to match the generated image to the input prompt. You can go as high or low as you like, but values greater than 10 generally begin to cause image quality issues, and values lower than 5 tend to produce unexpected images. There are complex interactions between _Steps_, _CFG Scale_ and the _Scheduler_, so experiment to find out what works for you. As a starting point, we recommend a CFG scale of around 7-8.
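For the curious, the usual formulation of classifier-free guidance (the "CFG" in CFG Scale) can be sketched in a few lines. At each step the model makes one prediction without the prompt and one with it, and the scale decides how far to push from the first toward the second. The function name `apply_cfg` and the three-value lists are illustrative assumptions; this shows the general idea and is not Invoke's internal code.

```python
def apply_cfg(uncond, cond, cfg_scale):
    """Push the prediction away from 'no prompt' toward 'with prompt'."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


unconditioned = [0.0, 1.0, -1.0]  # hypothetical prediction without the prompt
conditioned = [1.0, 1.0, 0.0]     # hypothetical prediction with the prompt

for scale in (0.0, 1.0, 7.5):
    print(scale, apply_cfg(unconditioned, conditioned, scale))
```

A scale of 1.0 returns the prompt-guided prediction unchanged, while larger values extrapolate past it. That extrapolation is one plausible reason very high values can start to degrade image quality.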

Scheduler & Steps: Schedulers guide the process of removing noise (denoising) from data. They determine:

  1. The number of steps to take to remove the noise.
  2. Whether the steps are random (stochastic) or predictable (deterministic).
  3. The specific method (algorithm) used for denoising.

Schedulers can be intricate, and there's often a balance to strike between how quickly they can denoise data and how well they can do it. It's typically advised to experiment with different schedulers to see which one gives the best results. Much has been written online about the different schedulers and about the right number of steps for each. You can save generation time by reducing the number of steps, but make sure you are still satisfied with the quality of the images produced!
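The difference between deterministic and stochastic stepping described above can be illustrated with a toy loop. Everything here is a simplifying assumption: `toy_denoise` is a hypothetical function, `clean` stands in for what the model thinks the finished image looks like, and the fixed noise amount of 0.05 is arbitrary. Real schedulers use far more careful math.

```python
import random


def toy_denoise(start, target, steps, stochastic=False, seed=0):
    """Walk from `start` toward `target`, recording the state after each step."""
    rng = random.Random(seed)
    x = list(start)
    trajectory = []
    for i in range(steps):
        remaining = steps - i
        # Close an equal share of the remaining gap on every step.
        x = [xi + (ti - xi) / remaining for xi, ti in zip(x, target)]
        if stochastic and remaining > 1:
            # Stochastic variant: re-inject a little noise between steps.
            x = [xi + rng.gauss(0.0, 0.05) for xi in x]
        trajectory.append(list(x))
    return trajectory


noisy = [0.9, -0.4, 0.3]  # hypothetical noisy starting values
clean = [0.0, 0.5, 1.0]   # hypothetical fully denoised values

path_a = toy_denoise(noisy, clean, steps=4)
path_b = toy_denoise(noisy, clean, steps=4, seed=123)
path_c = toy_denoise(noisy, clean, steps=4, stochastic=True, seed=1)
path_d = toy_denoise(noisy, clean, steps=4, stochastic=True, seed=2)

print(path_a == path_b)  # deterministic: the seed has no effect on the path
print(path_c == path_d)  # stochastic: different seeds take different paths
```

The number of loop iterations is the _Steps_ setting in this analogy: fewer steps means fewer (expensive) model calls, which is why lowering the step count saves time.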

Width & Height: These settings control the width and height of the output image. For the best results, hover over the model you've selected to review its tooltip, and generate images at around the suggested resolution for that model (typically 512x512 to 768x768). Stay close to that resolution until you learn more about how to control, tweak, and edit images at larger resolutions to correct for issues.
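If you are planning sizes outside the app, a small helper can show how far a chosen size strays from a model's suggested resolution. Both functions below are hypothetical and not part of Invoke. Snapping to multiples of 8 reflects how Stable Diffusion-style models typically handle image dimensions, and the default `native=512` is an assumption you should replace with the value from your model's tooltip.

```python
def snap(value, multiple=8):
    """Round a dimension to the nearest multiple (never below one multiple)."""
    return max(multiple, int(round(value / multiple)) * multiple)


def pixel_ratio(width, height, native=512):
    """Compare the requested pixel count with a native square resolution."""
    return (width * height) / (native * native)


width, height = snap(510), snap(770)
print(width, height)                          # snapped dimensions
print(pixel_ratio(512, 512))                  # 1.0 means exactly native size
print(pixel_ratio(768, 768))                  # larger than a 512x512 model expects
print(pixel_ratio(width, height, native=768)) # same size judged against 768x768
```

A ratio far from 1.0 is a hint that the size differs a lot from what the model was trained on, which is when the issues mentioned above are more likely to appear.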

Some suggestions for getting started:

Follow our guide to prompting, or search the internet for "Stable Diffusion" prompts that are aligned with the style you're looking to achieve. Be sure that the prompts you use are formatted for Invoke by following our Prompt Syntax.

