How to create image from sketch
A simple workflow for creating AI images from a simple sketch
In this brief article we will show you how to create images from a sketch using artificial intelligence tools. We will use the neural networks Stable Diffusion 1.5 and Stable Diffusion XL, as they provide the most controllable generation process, but we will also share the results of generating images from a sketch in MidJourney, and how we tried to replicate this process in DALL-E 3.
Together we will try creating an image of space with a red planet with rings and a yellow sun. Let’s start :)

Step 1. Drawing a sketch
In the Phygital+ interface you can draw a sketch without leaving your browser. To do that, click the Brush icon in the upper right corner and switch to Sketching mode.
Note: you can draw a sketch in any other software, but for convenience we recommend using the Phygital+ interface.

So, we drew a red planet and a yellow sun. In our example we want a square image, so we don’t change the aspect ratio, but if you need to, you can switch to 16:9 or 9:16. As soon as we finish, we press the ‘Sketch’ button to add our sketch to the workspace.

Step 2. Creating the first iteration of the image concept with Stable Diffusion 1.5

Now we need to create a Stable Diffusion 1.5 node and connect our sketch as the Start image. Then we write a prompt with a simple structure: <our subject>, <keywords>.
We want to create a concept of the cosmos with a planet and a sun, so we start the prompt with ‘red planet with rings, yellow sun’.
Now we need to add keywords. Some of them we can type right away: we have a fantasy concept in mind, so we add ‘stunning fantasy galaxy art, digital artwork’ to the prompt, and ‘intricate details, milky way’ for better composition.
For Stable Diffusion to work well, it’s better to add more words that guide the generation in the right direction (with a bare-bones prompt you are very likely to get creepy results). If you have trouble writing prompts, you can ask the AI assistant in the chat window for help. Let’s describe what we have in the image: we want more detail in the starry sky, so we type ‘galaxy stars, space’ and get a ready-to-use ending for the prompt.
Let’s copy the words that fit our needs, and we get this prompt: red planet with rings, yellow sun, stunning fantasy galaxy art, digital artwork, intricate details, milky way, awe-inspiring, nebula, astro art, deep space, long exposure, stars, space explorations, high detail, glowing, radiant, magical, holographic colors, by Roger Dean, Steve Gildea, Chris Foss, and Vincent Di Fate, trending on Artstation HQ.
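If you build prompts programmatically, the structure above is just comma-joined pieces. Here is a minimal illustration in Python; the variable names are ours and not part of any tool’s API:

```python
# Prompt structure: "<our subject>, <keywords>", joined with commas.
subject = "red planet with rings, yellow sun"
keywords = [
    "stunning fantasy galaxy art", "digital artwork",
    "intricate details", "milky way", "awe-inspiring", "nebula",
    "astro art", "deep space", "long exposure", "stars",
    "space explorations", "high detail", "glowing", "radiant",
    "magical", "holographic colors",
    "by Roger Dean, Steve Gildea, Chris Foss, and Vincent Di Fate",
    "trending on Artstation HQ",
]
prompt = ", ".join([subject] + keywords)
print(prompt)
```

Keeping the subject first and the style keywords in a list makes it easy to swap styles between iterations without retyping the whole prompt.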
Now we need to set the parameter Start Image Skip to 0.75.
Start Image Skip (denoising strength) determines how close your generation will be to the Start image. The lower the number, the more similar the generated image will be to the reference.
If you set this parameter too high, you can lose the composition in the final result.
Our goal at this step is to transform the sketch into a rough, undetailed concept. 0.75 is an optimal value to get rid of the sketch’s brush strokes while preserving the composition.
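Under the hood, denoising strength typically controls how many denoising steps actually run on the re-noised Start image. This is not Phygital+’s documented internals, just a sketch of the scheme used by open-source img2img implementations such as Hugging Face diffusers:

```python
def effective_denoising_steps(num_inference_steps: int, strength: float) -> int:
    """How many of the scheduler's steps actually run in img2img.

    At strength 1.0 the Start image is fully re-noised and every step
    runs, so the reference is mostly ignored; at low strength only the
    last few steps run, so the output stays close to the reference.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)

# With 50 scheduler steps, the 0.75 used in this article runs 37 of them:
# enough to dissolve brush strokes while the composition survives.
print(effective_denoising_steps(50, 0.75))  # 37
print(effective_denoising_steps(50, 0.2))   # 10
```

This is why a low value like 0.2 only polishes an image, while a high value can wander away from the sketch’s composition.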
At this step you can also choose a model or Style (Styles are custom finetuned checkpoints/models of Stable Diffusion). We use the Reliberate model, but you can try any of the 90 available models, such as DreamShaper, Absolute Reality, RunDiffusion or Noosphere.
Let’s press Generate. This is only our first iteration, so let’s not focus too much on the lack of detail and artistry.
Step 3. Upscaling
We need the Upscale node to increase the resolution of our image. Let’s connect the result we liked the most as the Start image and press Start. Meanwhile we can copy our initial Stable Diffusion 1.5 node with Ctrl+C / Ctrl+V, change Start Image Skip to 0.2, set the Number of images to 1, and connect the resulting image from the Upscale node as the Start image. This gives us our concept in higher resolution.
Step 4. Adding details to our concept
Now we can create a more detailed version using the Stable Diffusion XL (img2img) node. Connect the image from the last SD 1.5 node, copy the prompt, and set Start Image Skip to 0.7. You can change the prompt if you want; in our case we decided to add a few more stars:
red planet with rings and yellow sun, many stars, multiple colorful stars, stunning fantasy galaxy art, digital artwork, intricate details, milky way, awe-inspiring, nebula, astro art, deep space, long exposure, stars, space explorations, high detail, glowing, radiant, magical, holographic colors, by Roger Dean, Steve Gildea, Chris Foss, and Vincent Di Fate, trending on Artstation HQ
Our image from sketch is ready!
Step 5. Creating variations
We can also get variations of this image by creating a new SD XL (img2img) node, copying the prompt, and setting Start Image Skip to 0.4.
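The whole pipeline from Steps 2–5 can be summarized as plain data. The node names and Start Image Skip values below come straight from this article; the `describe` helper is just a hypothetical illustration, not part of Phygital+:

```python
# Each stage: (node, Start Image Skip, purpose). None = no such parameter.
PIPELINE = [
    ("Stable Diffusion 1.5",        0.75, "sketch -> rough concept"),
    ("Upscale",                     None, "increase resolution"),
    ("Stable Diffusion 1.5",        0.2,  "refine upscaled concept"),
    ("Stable Diffusion XL img2img", 0.7,  "add detail"),
    ("Stable Diffusion XL img2img", 0.4,  "variations"),
]

def describe(pipeline):
    """Return one summary line per stage: node, skip value, purpose."""
    lines = []
    for node, strength, purpose in pipeline:
        s = "n/a" if strength is None else f"{strength:.2f}"
        lines.append(f"{node:28s} skip={s:5s} {purpose}")
    return lines

for line in describe(PIPELINE):
    print(line)
```

Note how the skip value drops between the concept stage (0.75) and the refinement stages (0.2–0.7): each iteration trusts its input more than the last.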
You can recreate this pipeline and take a closer look at the settings in our product in the template Image from Sketch.
Is it possible to create images from sketch in MidJourney and DALL-E 3?
At the moment it’s very hard to do because of how these tools handle reference settings.
MidJourney. It allows you to use any image as a reference and set its weight with the --iw parameter. You can use any decimal number from 0 to 2 (2 is the strongest influence of the Start image). We took our sketch from Step 1, uploaded it to Imgur, and put the link to the image at the beginning of the prompt. However, even though we used the same prompt and the same settings, changing only the image weight, we couldn’t get the same composition as in our sketch.
Plus, it leaned too much towards replicating the sketch style.
But MidJourney is amazingly good at lighting!
DALL-E 3. Using a Start image is currently not possible in DALL-E 3, but we tried to involve GPT-4 to make it possible in a roundabout way: we asked it to create a prompt description for DALL-E 3 based on our sketch. These are the results we got. Unfortunately, the sketch composition is mirrored, even though the prompt says ‘in the upper right corner’.
In the last node on the screenshot we tried to tweak the prompt to make it look more like the initial composition of the sketch, but we couldn’t quite get there. These images are stunning nevertheless :)
So, despite the good, high-quality images from DALL-E 3 and MidJourney, Stable Diffusion still performs best with image references, and sketches in particular. By tweaking the Start Image Skip (denoising strength) parameter you can achieve very versatile results. Here are the results from these three tools:
Another quick note: you can experiment and try using the sketch directly in Stable Diffusion XL (img2img) without Stable Diffusion 1.5, but the results may turn out worse. In general, working in iterations is one of the best strategies when working with neural networks, and Stable Diffusion’s flexible, controllable generation process lets you get the best results.
Let us know if you’ve tried doing the same task with AI, which tools you used, and how you liked the results :)
In Phygital+ we have all the best neural networks, from MidJourney, DALL-E 3, and Stable Diffusion to GPT-4, ControlNet, and DreamBooth, in one workspace, and our node interface gives you even more control over the content creation process and lets you combine all these AI tools in one pipeline.
Very soon we will share our new list of the best prompts for DALL-E 3, MidJourney, and Stable Diffusion, along with a comparison of these tools. Stay tuned and subscribe so you don’t miss it!