
Basics

Introduction

Diffusitron Studio is a webapp that transforms your ideas into images using Stable Diffusion. Designed for both beginners and advanced users, Diffusitron Studio offers a full range of features including text-to-image models, LoRAs, image-to-image, ControlNet, generative upscaling, background removal, and inpainting. Diffusitron Studio comes to “life” with an integrated virtual assistant (Diffie), who is always willing to help you create unique art by giving you ideas, automating some basic processes, and keeping you company in general. Our user-submitted community feed is a great source of inspiration and templates, and a unique way to learn how AI art is made.

Become a Creator

Creator (PRO) subscribers currently enjoy access to the complete range of features, along with commercial ownership of the images they create. Upgrade to a Creator subscription by navigating to the Profile section located at the top right of your screen and clicking on “Subscription”.

Diffie Time and Top-Up

Diffusitron Studio charges for computation time. The platform approximates the amount of time remaining until you run out of credit. This is displayed next to your profile at the top right of the screen. The platform also provides an estimate of how many diffusions remain and reports the time and the dollar amount that your last diffusion consumed. To check these metrics, hover your mouse over the timer as shown below:

[Image: the Diffie Time indicator and usage details shown on hover]

If you are a Pro user and run out of credit, you can always top up by navigating to your Profile section and clicking “Top Up” in the account tab.

Note: Studio charges based on computation time, so some diffusions may cost more than others depending on the model and parameters that you select.

Canvas

The first thing that will probably capture your attention when you open Studio is the canvas: a virtual sheet of (practically) infinite size where you can place generated images, place uploaded images, move them around, re-arrange them, and so on. You can navigate the canvas much like a map using your mouse. The horizontal panel at the top of the canvas also includes some basic controls (e.g., zoom in/out, clear canvas, upload image, etc.).

Workspace

The workspace is a purple frame that appears on the canvas and can be moved around. It is where the image will appear when you perform a task such as diffusion or upscaling. Its dimensions are shown below it and can be adjusted by dragging its corners, and there are preset dimensions you can choose from (see Workspace Size).

[Image: the workspace frame on the canvas]

Image Control Sidebar

Whenever you click on an image placed on your canvas, a control sidebar pops up on the right side. It offers convenient one-click access to a wide range of Studio features, most of which are covered in this tutorial. The icons in the sidebar serve as shortcuts to the following tools:

[Image: the image control sidebar]

  1. Preview: automatically expands the image, zooms in, and brings it to the front so you can inspect it. In the case of generative upscaling (see SUPIR Upscale) and inpainting (see Inpainter), it also lets you compare how the image has changed (“before” vs. “after”).
  2. Download: saves the image to your device.
  3. 2x Upscale: simple, non-generative upscaling that doubles each dimension of the image.
  4. SUPIR Upscale: generative upscaling (see SUPIR Upscale).
  5. Enhance: a one-click image enhancement tool that can quickly elevate the quality of an image by changing/adding details (see ControlNet-Enhance).
  6. Remove Background: loads the image into the background removal tool (see Background Removal).
  7. Image-to-image: brings up the image-to-image tool and automatically loads the selected image (see Image-to-Image).
  8. ControlNet: brings up ControlNet and automatically loads the image (see ControlNet).
  9. Inpaint: brings up Inpainter with the selected image loaded and ready to be processed (see Inpainter).

Get Started

Get started by clicking on the “Create” icon or the “Diffie AI” icon, both of which are located on the left side of your screen. The content below discusses the features and workflows that are available when clicking “Create”. To learn how to work with Diffie, please go to the Diffie AI section.

Basic Settings

Prompts

Writing an effective prompt is crucial in AI art, as it directly influences the style, content, and quality of the resulting artwork. Begin by entering your prompt in Studio, using the box labeled “Prompt”. Clearly define your objective or the concept you want to convey. Is it a person, thing, theme, mood, or idea? The clearer you are in expressing your intent, the better.

Use concise and focused language, avoiding overly complex instructions. Incorporate descriptions and specific details about style, elements, colors and reference works. You can even specify the intended audience or purpose of the artwork.

Remember to experiment, as AI art is an iterative process, and exploring different prompts will lead to diverse and surprising results. You can even use emojis! Don’t be discouraged - most AI artists don’t get it right the first time. The more you learn about prompt generation and advanced settings, the better you can iterate to get it just right.

PRO TIP: Make sure the most crucial element - the focal point - is written first in your prompt. Provide detailed descriptions, followed by the setting, including location, mood, camera angles, and lighting.

Structuring your prompt with brackets and numbers to assign weights to specific components can guide the AI to emphasize certain aspects of the image. Break the prompt into individual components and wrap the ones you want emphasized in brackets; each additional pair of brackets (double, triple, quadruple) adds further emphasis.

PRO TIP: If you add brackets around everything then nothing is important, as everything is the same and you risk lowering the quality of your work. Use them sparingly.

Incorporating weight numbers within the components of your prompt also communicates their importance. Adding a number between 1.1 and 2.0 at the end of a word or phrase will increase emphasis, while numbers between 0.0 and 0.9 decrease it.
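
For example, a weighted prompt that combines both forms of emphasis might look like the following (an illustrative example only; the exact bracket syntax can vary slightly, so adjust it to what the prompt box accepts):

    portrait of an elderly fisherman, ((weathered face)), (wool sweater:1.3), foggy harbor at dawn, muted colors, (background boats:0.8)

Here the doubled brackets and the 1.3 weight emphasize the face and the sweater, while the 0.8 weight de-emphasizes the boats in the background.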

PRO TIP: If you’re unsure where to start, especially with prompt weighting, simply type your concept in the prompt box and click the “Enhance Prompt” button located just below the box. This feature optimizes your prompt with a single click!

Keep in mind that while adding brackets or numbers can enhance results, it’s not necessary for producing high-quality art. It is simply another advanced feature at your disposal.

[Image: example prompt]

This is an example of a well-crafted prompt.

Negative Prompts

Negative prompts are written in the Studio box titled “Things you don’t want to see” and involve instructing the AI on what to avoid or what content not to include. This includes characteristics, elements, styles, colors, shapes, textures, or subjects. Similar to positive prompts, specificity and concise language are highly important, as vagueness may lead to unintended results.

Negative prompts can also incorporate brackets and weights in the same way as positive prompts.

PRO TIP: Many experienced AI artists have a master set of negative prompts that they regularly use, tailored to their preferred style or subject matter. They then add a few theme-specific phrases at the beginning of each new artwork. Observing how other artists do this through the community Feed or on Discord is a fantastic way to build your own master list of negative prompts.
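
As a purely illustrative starting point (not an official list; adapt it to your own subjects and style), a general-purpose master list for portraits could look something like:

    deformed, bad anatomy, extra fingers, extra limbs, blurry, low resolution, watermark, text, jpeg artifacts, oversaturated

with theme-specific phrases added at the front for each new artwork.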

[Image: example negative prompt]

This is an example negative prompt that will work for most diffusions involving people as the focal point.

Image-to-Image

Image-to-image is a way to manipulate and transform images, creating new variations based on the entire initial image provided. It’s useful when you want to retain most elements of the original image, particularly color and composition. Image-to-image maps the input image to an output image while taking your prompt and negative prompt into account. The strength of this mapping depends on the percentage you set. A higher image-to-image strength percentage results in an image that closely resembles the initial upload, with less emphasis on the other parameters, such as the prompts. At 100% image strength, the image remains identical to the original with no variation. At lower strengths, the result looks much less like the original image, with more emphasis on the positive and negative prompts for guidance.

The keyboard icon at the top right of the image-to-image window allows precise percentage input without using the slider.

Clicking on “Provide Initial Image” opens a menu for selecting your image input. You can import directly from your phone’s camera or gallery, use images from the Diffusitron Feed as templates, or select an image from your own Diffusion History. To replace an initial image with a new one, click the image thumbnail next to the percentage slider to reopen the input menu.
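
Under the hood, image-to-image works by partially noising your initial image and then denoising it under the guidance of your prompts; the strength percentage controls how much of the original survives. Studio handles all of this for you, but purely as an illustration of the concept (this is not Studio’s actual implementation, and the model name, file names, and values are placeholders), here is a minimal sketch using the open-source Hugging Face diffusers library. Note that diffusers uses the opposite convention: its strength is a denoising strength, where 0.0 keeps the original image and 1.0 ignores it.

    # Illustrative only: image-to-image with the open-source diffusers library,
    # not Diffusitron Studio's backend. Model name and values are placeholders.
    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",   # placeholder SD1.5 base model
        torch_dtype=torch.float16,
    ).to("cuda")

    init_image = Image.open("portrait.png").convert("RGB")  # your initial image

    # A high "image strength" in Studio corresponds to a LOW denoising strength
    # here: strength=0.15 keeps most of the original composition and colors.
    result = pipe(
        prompt="oil painting of a knight, dramatic lighting",
        negative_prompt="blurry, low quality",
        image=init_image,
        strength=0.15,
        guidance_scale=7.0,
    ).images[0]
    result.save("knight_img2img.png")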

[Images: original (left) vs. image-to-image result (right)]
Using a high image-to-image percentage retains the image elements, even if you change the model. Here the model was changed but all the other parameters remained the same.

Models

When creating an image, selecting the best model is essential. Models are trained on large datasets of existing artworks or creative content, learning to generate new content that mimics the style, structure or characteristics of the training data. Therefore each model excels at only certain tasks while struggling with others. Choosing the right model is crucial for achieving the desired outputs once you have your prompt, but it’s also an iterative process.

PRO TIP: After inputting the prompt, the best thing to do is to try several models or all available models and narrow down the one(s) that best match your desired effect. Then, iterate on the prompt again and run it through the narrowed-down models.

In some cases, merging models can have the best outcomes. Some users use specific models to generate people with particular characteristics, then run the image again through image-to-image with another model to maintain those characteristics while applying different styles. For example, transforming an anime character into a realistic photographic representation. ControlNet also offers similar possibilities. Given that some models excel at adding detail to an image while others focus more on composition, combining models can be highly effective!

[Images: Real Cartoon Anime (left) vs. EpicRealism (right)]
This is an example of how different models can change a prompt. With all other parameters being the same, the image on the left was made with the Real Cartoon Anime Model, while the image on the right is made with the EpicRealism Model.

Models are constantly evolving, with new ones frequently added. By clicking “Select Model”, users can access a wide range of SD1.5 and SDXL models. The latter produce better-quality images but consume more Diffie time on average.

LoRAs

While selecting a model is required when creating an image, LoRAs are optional. LoRAs, or Low-Rank Adaptations, are small, fine-tuned add-ons that are used in combination with existing models. Simply put, LoRAs extend the capabilities of a model: they can introduce new elements to an image, such as characters, art styles, poses, details, and more. By incorporating a LoRA with your chosen model you can generate greater diversity and variation. LoRAs are much easier to use than image-to-image for adding these elements, integrate seamlessly with the model, and often result in more artistically interesting outputs. LoRAs can be accessed by clicking on “Select LoRA” just below “Select Model”.

Note: Many LoRAs work with SD1.5 models, but significantly fewer are compatible with SDXL. The list of available LoRAs automatically changes based on your current model selection.
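
Conceptually, a LoRA is a small file of low-rank weight adjustments that is layered on top of the base model at generation time. Studio does this for you when you pick one from the gallery, but purely as an illustration (not Studio’s implementation; the model and file names are placeholders), this is how the open-source diffusers library applies a LoRA to a base model:

    # Illustrative only: applying a LoRA on top of a base model with diffusers.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",   # placeholder SD1.5 base model
        torch_dtype=torch.float16,
    ).to("cuda")

    # The LoRA must match the base model family (SD1.5 LoRAs for SD1.5 models,
    # SDXL LoRAs for SDXL models), which is why Studio filters the LoRA list.
    pipe.load_lora_weights("path/to/loras", weight_name="manga_style.safetensors")  # placeholders

    image = pipe(
        "portrait of a samurai, manga style",
        num_inference_steps=30,
        guidance_scale=7.0,
    ).images[0]
    image.save("samurai_manga.png")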

[Images: no LoRA (left) vs. Manga LoRA (right)]
This is an example of how adding a LoRA to a Model can enhance the style. The image on the left was made with the Reliberate Model, while the image on the right was diffused with all the same parameters, including the Reliberate Model, but the Manga LoRA was added. The LoRA retains the texture and style of the original model, but adds the mood, desaturation and lighting of the dataset the LoRA was trained on.

Workspace Size

The workspace size is the ratio of the image’s width to its height, and it is reflected in the frame that appears on the canvas.

[Image: workspace size presets]

The AI will create an image according to the selected aspect ratio UNLESS an initial image is uploaded to image-to-image, inpainting, ControlNet, or SUPIR upscale. Workspace sizes are optimized for the best quality; in some cases, this may involve the AI generating a smaller image, which is then upscaled to the desired dimensions. This approach is necessary to maintain image quality, as generating directly at the larger size could result in a lower-quality image given the AI’s training.

The currently supported aspect ratios for SD1.5 models are as follows:

Ratio Image Size in Pixels
1:1 512 x 512
1:1 1024 x 1024
3:4 1152 x 1536
9:16 1152 x 2048
4:5 1024 x 1280
4:3 1536 x 1152
16:9 2048 x 1152

The currently supported aspect ratios for SDXL models are as follows:

Ratio Image Size in Pixels
1:1 1024 x 1024
3:4 1152 x 1536
9:16 1152 x 2048
4:5 1024 x 1280
4:3 1536 x 1152
16:9 2048 x 1152

Advanced Settings

Upscale

You can achieve larger images (higher dimensions) by using the “Image Upscale” button. The upscaling factors currently supported range from 1 (no upscaling, default) to 4.

Note: Upscaling with a factor above 2 is a PRO feature.

Sampler/Scheduler

Here you can select the algorithm that is applied by the AI to denoise the image. Note that denoising is the core procedure in all text-to-image generative AI models, and it should not be confused with standard image denoising. Denoising is a predictive process applied in multiple steps (see below) and it greatly affects the quality of the generated image. The default option is DPM++ SDE Karras, which works well in the majority of cases. However, there is a long list of options available to choose from. For a detailed description/tutorial on samplers see: Stable Diffusion Samplers: A Comprehensive Guide.
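
In open-source Stable Diffusion tooling the sampler/scheduler is simply a swappable component of the pipeline. Purely as an illustration of the concept (not Studio’s implementation; the model name is a placeholder), the default DPM++ SDE Karras roughly corresponds to the following configuration in the Hugging Face diffusers library:

    # Illustrative only: selecting a sampler/scheduler in diffusers.
    import torch
    from diffusers import StableDiffusionPipeline, DPMSolverSDEScheduler

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # "DPM++ SDE Karras" is approximately the SDE DPM-Solver scheduler with
    # Karras noise sigmas enabled; other samplers are swapped in the same way.
    pipe.scheduler = DPMSolverSDEScheduler.from_config(
        pipe.scheduler.config, use_karras_sigmas=True
    )

    image = pipe("a lighthouse at sunset, watercolor", num_inference_steps=30).images[0]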

Number of Sampling Steps

This parameter refers to the number of steps the AI takes in generating your image. Increasing the step value generally adds more details, but it also requires more computation and prolongs the diffusion time. Currently, the number of steps supported ranges from 1 to 50.

PRO TIP: Although higher step counts are possible, most images do not improve beyond 30 steps, so the extra computation time rarely pays off above that.

CFG Scale

CFG stands for Classifier-Free Guidance, and the CFG scale controls how closely the output follows your prompt. Be cautious with this setting, as it can have profound undesirable effects on your results, especially at higher values.

PRO TIP: Lower CFG values tend to produce more creative, loosely guided images, while higher values are a better bet if you have specific content and styles in mind. CFG affects the content in terms of spatial arrangement and semantic information, as well as texture, color, and style. If you are producing consistently odd, low-quality images, check your CFG setting first!

CFG scale ranges in the app from 1 to 20.

[Images: the same diffusion at CFG 1.0, 10.0, and 20.0]
These images have the same prompt, model and parameters, and the only difference is the CFG scale.

Seed Values

The seed is the initial random value that starts the generation process. Reusing a seed (together with the same settings) reproduces the same output image, while using random seeds lets you explore different outputs; for beginners, leaving the seed random is usually best. If you have a specific AI image you want to recreate, and you have all the other information needed to do so (prompts, negative prompts, CFG, steps, etc.), entering its seed value will precisely guide the generation process.
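
The sampling steps, CFG scale, and seed map directly onto the standard Stable Diffusion generation parameters. Purely as an illustration (not Studio’s implementation; the model name, prompt, and values are placeholders), this is how those three settings appear in the open-source diffusers library, and why a fixed seed makes a diffusion reproducible:

    # Illustrative only: steps, CFG scale, and seed in a diffusers pipeline.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    generator = torch.Generator(device="cuda").manual_seed(1234)  # fixed seed

    image = pipe(
        prompt="a red fox in a snowy forest, golden hour",
        negative_prompt="blurry, deformed",
        num_inference_steps=30,   # number of sampling steps
        guidance_scale=7.0,       # CFG scale
        generator=generator,      # same seed + same settings => same image
    ).images[0]
    image.save("fox_seed_1234.png")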

Images per Diffuse

As the name suggests, this setting determines how many images will be diffused at once with the same parameters/prompts. If the seed value is set to random, the generated images will be different. If the seed is fixed, it is only meaningful to diffuse one image, and the setting defaults to 1.

Note: Generating more than 1 image at a time is a PRO feature.

Note: Generating more than 1 image at a time will default to random seed values.

Inpainter (Pro Feature)

Inpainting allows you to selectively target and modify specific parts of an image while leaving the rest of the image as is. Note that this is a PRO feature.

First, upload the image you wish to inpaint (directly from local storage or from the canvas). A section will appear below: this is the masking screen, where your image is displayed in preview alongside five options along the bottom: undo, redo, set, clear, and set mask.

You can choose from a wide range of brush sizes (“Line Width”), ranging from 1 to 100 pixels, to precisely define the areas of your artwork to be modified. Use the undo and redo buttons as needed. When you are satisfied with your selection, click “Set Mask”. Please note that Studio will not remember your selection if you close the window without clicking “Set Mask”.

PRO TIP: Inpainting requires a defined mask. If inpainting is not working, ensure that a mask has been selected. This is the first step to troubleshooting if you encounter issues, as your mask may not have been saved or may have reset from a previous session. You must select Set Mask before diffusing.

Once you have chosen your masking area, specify what the AI should replace the areas with by typing in the prompt box (titled “Fill painted areas with this”) and negative prompt box (titled “Exclude this”).

[Images: inpainting settings, original image, and inpainted result]
Masking only certain areas will ensure that you can pinpoint exactly what should be different, or fix mistakes when an image is nearly perfect.

Similar to image-to-image, the strength of inpainting depends on the percentage you set. You can enter the percentage precisely using the horizontal bar below the masked image. A high percentage means the masked area will be transformed almost entirely into new content, and at 100% strength it relies solely on the new prompt and negative prompt.

PRO TIP: Be careful when using a high percentage, as it may result in images lacking cohesion with the original artwork, resembling a poorly-cut collage with sharp demarcation edges.

A low percentage strength will minimally impact the original image, resulting in subtle differences. Inpainting strength varies greatly depending on the artist’s intent. Click DIFFUSE at the bottom of the app to generate the new image.
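
Mechanically, inpainting regenerates only the pixels under the mask while the unmasked pixels anchor the result. Studio takes care of this when you click Set Mask and DIFFUSE, but purely as an illustration of the concept (not Studio’s implementation; the model name, file names, and values are placeholders), here is a minimal sketch with the diffusers library:

    # Illustrative only: inpainting with diffusers. The white areas of the mask
    # image are regenerated; the black areas are kept from the original.
    import torch
    from PIL import Image
    from diffusers import StableDiffusionInpaintPipeline

    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "runwayml/stable-diffusion-inpainting",   # placeholder inpainting model
        torch_dtype=torch.float16,
    ).to("cuda")

    original = Image.open("portrait.png").convert("RGB")
    mask = Image.open("mask.png").convert("RGB")   # the mask you painted

    result = pipe(
        prompt="a red silk scarf",            # "Fill painted areas with this"
        negative_prompt="blurry, deformed",   # "Exclude this"
        image=original,
        mask_image=mask,
        num_inference_steps=30,
    ).images[0]
    result.save("portrait_inpainted.png")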

ControlNet (Pro Feature)

ControlNet has been added to provide users with more structural and artistic control over their images. It uses fine-tuned models to condition the diffusion on an input image. For example, some models enable the generation of entirely new images while preserving poses, while others keep the integrity of the detected edges. Note that this is a PRO feature.

PRO TIP: It is always a good idea to test out several models to determine which is best for your specific need. The choice of model depends on factors such as the type of initial image uploaded and the desired output.

Begin by uploading an initial image by clicking “Provide an initial image” in the ControlNet tab. You can load an initial image from your local storage. Alternatively, if an initial image is already on canvas, simply click on the current image to bring up the sidebar and select ControlNet from there.

Select the ControlNet model you intend to use under “Selected ControlNet Model”. Make sure your prompt and negative prompt reflect the image you want the AI to create, as it will change your image based on those prompts and the model strength you set. For ControlNet, you must choose both a ControlNet model and a model from the Model gallery, and you can optionally also choose a LoRA. A low ControlNet strength percentage will minimally impact your initial image.

[Image: ControlNet strength slider]

Basic Settings: ControlNet models have two additional parameters, Eta and Guess Mode.

  - Eta: you can think of it as a dial that controls how much the process focuses on the details and noise of the image. A higher value makes the process focus more on the fine details, while a smaller value emphasizes broader, smoother features.
  - Guess Mode: when selected, the prompt and negative prompt will be ignored and the AI will try to automatically recognize the object(s) in the imported image. Note that you will not be able to see what objects the AI detects or what descriptions it attaches to them. Use with caution.

Advanced Settings: these are similar to those in regular diffusion workflows, as previously discussed.
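
For readers who want a concrete picture of what a ControlNet model does, here is an illustrative sketch with the open-source diffusers library (not Studio’s implementation; the edge-based ControlNet, model names, and values are placeholders). The conditioning image, here an edge map, constrains the structure of the output while the prompts decide its content and style:

    # Illustrative only: conditioning a diffusion on an edge map with ControlNet.
    import torch
    from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
    from diffusers.utils import load_image

    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16  # edge-based ControlNet
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",   # placeholder base model
        controlnet=controlnet,
        torch_dtype=torch.float16,
    ).to("cuda")

    edge_map = load_image("canny_edges.png")  # pre-computed edge detection of the initial image

    image = pipe(
        prompt="cyberpunk dancer, neon lights",
        negative_prompt="blurry, low quality",
        image=edge_map,
        controlnet_conditioning_scale=0.8,   # rough analogue of the ControlNet strength slider
        num_inference_steps=30,
    ).images[0]
    image.save("dancer_controlnet.png")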

[Images: original image, ControlNet settings, ControlNet output]
ControlNet allows you to retain the pose and composition of an image, even when the style and content are very different. Here, the prompt, negative prompt, and even the Model and LoRA have been changed, but the composition and pose are still intact.

Enhance

This is a one-click image enhancement tool, which can quickly elevate the quality of an image by changing/adding details. The tool is “trained” to add details that will maximize quality, and therefore it might intervene and make significant changes, but it will retain the main theme.

[Images: original (left), enhanced (right)]
Typical outcome of enhancing an image with the “enhance” tool found in the image control sidebar. Original on the left, enhanced image on the right.

ControlNet Scribble

This type of ControlNet allows you to manually draw/scribble using a paintbrush and use it as a reference initial image. The steps required are as follows:

  1. Select Scribble
  2. A section appears below including a white-background canvas on which you can draw.
  3. Select the line width for the brush tool.
  4. Draw.
  5. Save Canvas: your scribble now appears in the initial image section above.
  6. Scroll down and click “Get Detection” if you want to see the mask that will be created internally by the tool (OPTIONAL)
  7. Choose parameters in Basic/Advanced Settings and Diffuse!

[Images: scribble drawing (left), generated image (right)]
Example use of Scribble ControlNet, from drawing (left) to generated image (right).

Background Removal (Pro Feature)

Studio supports background or object removal for uploaded or generated images. Simply navigate to the “Background Removal” tab, load an image from local storage or the canvas into “Image to Remove Background”, and click “Diffuse”. This will remove the background and preserve the main object. If, instead, you wish to preserve the background and remove the main object, click “Remove Object” before diffusing. Note that this is a PRO feature.

[Images: original image, background removed, object removed]
Background Removal tool: original (left), background removed (middle), object removed (right).

Note: there is a limit to the size of the initial image when removing backgrounds: 2K x 2K pixels. Larger images will result in an error.