Getting Started with Invoke Training

Modified on Thu, 20 Feb at 11:33 AM

Getting Started with Training on the Invoke Platform

Invoke’s Model Training, available on all of our product plans, lets your team train models on your own intellectual property in a secure, easy-to-use environment. This guide walks you through your first training session, from creating a dataset to starting a training job.

Step 1: Creating a Dataset

Before you can train a model, you need to create and prepare your dataset:

Name Your Dataset: Start by providing a name and a description for your dataset.
Upload Images:
- Click on the 'Upload Items' button.
- You can either drag a folder into the upload area or click to select files from your file explorer.
- Confirm the upload and wait for your files to finish uploading.

Step 2: Determine Your Objective

Invoke currently offers two main types of model training: Concept Models and Embeddings. Understanding the difference between them is crucial, because it determines how you should prepare your dataset.

Concept Models

Concept Models (also referred to as LoRAs) involve intensive training in which the model learns to understand and generate new content based on the specific characteristics of your dataset. This type of training is powerful for tasks that require a deep understanding of complex features or styles, especially ones that do not already exist in the model. To train a Concept Model effectively:

Detailed Captioning Required: Each image must be captioned meticulously, describing everything in it, whether or not it is relevant to the style or feature you are focusing on. Be consistent in your captioning: if there is a recurring theme or pattern, use the same description for it in every image.
High Customization: Allows for significant control over what the model learns, making it ideal for specialized applications.
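
One way to check caption consistency before uploading is to look for captions that do not use the shared description. The sketch below is illustrative only: it assumes your captions are held in a Python dictionary keyed by file name, and the function name `captions_missing_phrase` and the sample captions are hypothetical, not part of Invoke.

```python
def captions_missing_phrase(captions, phrase):
    """Return the file names whose caption does not contain `phrase`.

    The comparison ignores case, so "Ink wash" and "ink wash" match.
    """
    needle = phrase.lower()
    return sorted(
        name for name, text in captions.items()
        if needle not in text.lower()
    )


# Hypothetical captions for a small style dataset.
captions = {
    "cat.png": "ink wash illustration of a cat on a windowsill",
    "boat.png": "Ink wash illustration of a fishing boat at dawn",
    "tree.png": "watercolor sketch of a pine tree",  # inconsistent wording
}

print(captions_missing_phrase(captions, "ink wash illustration"))
# prints ['tree.png']
```

Any file the check reports is a candidate for re-captioning with the shared description.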

Embeddings

Embeddings are simpler to train and use. They create a concise prompt word that is optimized to invoke the concepts provided in the embedding, without the need for extensive retraining of the model. Use them as a more efficient way of prompting for something that a model is already capable of producing.

No Captioning Required: Embeddings do not require detailed captions of the dataset, making them quicker and easier to implement.
Quick Integration: Embeddings let you prompt more efficiently for concepts that already exist in a model. They are ideal when you want to leverage the model’s existing capabilities with minimal effort.

If you prepare a dataset with full captions, you can always use it with either type of training!

Step 3: Setting Up Your Training Job

Once your dataset is ready, set up a new training job:

Click the “Create Job” button at the top right of your Training Dashboard.
Select Your Dataset and Model Details:
- Choose the dataset you prepared earlier.
- Specify the model name and select the architecture suitable for your needs.
Training Options: Select whether you are creating a Concept Model or an Embedding.
Choose a Base Model: This is the model that the Concept Model or Embedding is optimized for, although the result will still work on many other models.
Choose a Training Template: Templates help optimize the training settings for various objectives you might have.
Concept Models - Caption Prefix (Optional):
1. When training, make sure your captions contain a Prompt Trigger, either written into the captions themselves or added during training with the Caption Prefix tool. The Prompt Trigger is a phrase used across the dataset for the primary concept you are trying to train into the model. Use something unique that wouldn’t already be a common phrase (e.g., TKXW).
2. If you have not already added the Prompt Trigger to your dataset, you can use Caption Prefix to prepend it to every caption in the dataset.
3. Do not add a prefix if your concept is already referenced in the captions. In that case, leave this field blank.
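
The effect of a caption prefix can be sketched in a few lines of Python. This is a minimal illustration, not Invoke’s implementation: the function name `apply_caption_prefix` and the sample captions are hypothetical, and skipping captions that already mention the trigger is an assumption that mirrors the advice above on a per-caption basis.

```python
def apply_caption_prefix(captions, prefix):
    """Prepend `prefix` to each caption that does not already mention it."""
    result = {}
    for name, text in captions.items():
        if prefix.lower() in text.lower():
            result[name] = text  # trigger already present; leave untouched
        else:
            result[name] = f"{prefix} {text}"
    return result


# Hypothetical captions; "TKXW" is the example trigger from this guide.
captions = {
    "a.png": "TKXW, a robot in a garden",
    "b.png": "a robot on a beach",
}

for name, text in apply_caption_prefix(captions, "TKXW").items():
    print(f"{name}: {text}")
```

After the call, every caption carries the trigger exactly once, which is the state you want the dataset to be in when training starts.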
Embeddings - Initial Phrase: An Initial Phrase is a short word that roughly approximates the type of content in your dataset and gives the embedding a starting point. For example, if you are training a style embedding, try finding a short word that approximates the style (e.g., “painting”). The maximum length is 8 characters!
Learning Rate: Think of the learning rate as the speed at which a student (in this case, the AI model) learns a new topic. If the student tries to learn too quickly, they might miss important details or misunderstand the information. But if they learn too slowly, it might take a long time to understand the topic fully.
- Speed of Learning: The learning rate adjusts the speed at which the AI learns from mistakes. A high learning rate means the AI quickly changes its approach based on new information, while a low learning rate means it changes slowly and cautiously.
- Finding Balance: The goal is to find a good balance where the AI is learning fast enough to progress but not so fast that it makes too many mistakes or misses out on learning important patterns. We recommend starting with our default value!
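
The trade-off described above can be seen on a toy problem. The sketch below is not Invoke’s trainer and says nothing about its default value; it assumes plain gradient descent on the function f(x) = x², whose best answer is x = 0, and the function name `minimize_square` and the three rates are chosen purely for illustration.

```python
def minimize_square(learning_rate, steps=20, start=1.0):
    """Run plain gradient descent on f(x) = x**2 and return the final x.

    The ideal answer is x = 0; the gradient of x**2 is 2 * x.
    """
    x = start
    for _ in range(steps):
        x -= learning_rate * 2 * x  # step against the gradient
    return x


for rate in (0.01, 0.1, 1.1):
    print(f"learning rate {rate}: x ends at {minimize_square(rate):.4f}")
```

With the lowest rate, x creeps toward 0 and is still far away after 20 steps. The middle rate gets close to 0. The highest rate overshoots on every step and ends up further from 0 than where it started.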
Training Steps: Training steps are like the number of practice problems a student does while studying. Each step gives the AI a new problem to solve, and it learns a little more after solving each one.
- Practice Makes Perfect: Each training step involves the AI looking at a set of examples (typically batches of four images from your dataset), trying to predict the correct answers, seeing where it went wrong, and then improving. The more problems it solves (or training steps it completes), the better it gets. We recommend no fewer than 2,000 steps!
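
To get a feel for what a step count means for your dataset, you can estimate how many times each image is seen. The arithmetic below is a rough sketch: the batch size of four comes from the "typically" figure above and may differ in practice, the dataset size of 40 images is a made-up example, and `dataset_passes` is a hypothetical helper name.

```python
def dataset_passes(training_steps, dataset_size, batch_size=4):
    """Estimate how many times each image is seen during training."""
    images_seen = training_steps * batch_size
    return images_seen / dataset_size


# 2000 steps x 4 images per step = 8000 image views over 40 images.
print(dataset_passes(training_steps=2000, dataset_size=40))  # prints 200.0
```

A smaller dataset is revisited more often for the same number of steps, so the same step count represents more repetition per image.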

Step 4: Training Execution

After setting up your job:

Start the Training: Initiate the training process by clicking on the ‘Begin Training’ button.
Monitor Progress: You can monitor the training progress through the dashboard.

Step 5: Validation and Adjustment

Upon completion of the training:

Review the Results: Check the validation outputs to see how well your model has learned the intended styles or concepts.
Select Models: You can select multiple models from the training results. These will be displayed on the left side of the screen.
Finalize Your Model: Select the best-performing iterations of your model to finalize it. Unused models will be deleted to free up storage on your account, and the selected models will be available to add to your projects from your account settings.

Conclusion

Training your own models on the Invoke platform allows you to leverage your unique intellectual property securely and effectively. If you encounter any difficulties or have questions during the training process, do not hesitate to contact our support team. We’re excited to see what you create with your new training capabilities!

Video

You can see a quick video with an example of the model training process below.