14 minutes 2 seconds
🇬🇧 English
Speaker 1
00:00
All right, welcome everyone to the first video episode of Why Try AI. Today I'll be showing you how to use free AI art generators to test out ideas and text prompts and improve them, so that you can then use them in the more sophisticated models. Now, there are a ton of amazing AI art generators out there that can make mind-blowing art that looks like a real artist has drawn it. I encourage everyone to try them because it's really, really amazing technology.
Speaker 1
00:32
Now, unfortunately, a lot of them require some form of sign-up steps, a registration process, or even a waiting list. Looking at you, DALL-E 2. You get some free credits to play around with in most of them, to generate some new images and so on, but sooner or later you have to start paying to get more images. Arguably it's not a lot of money for what's essentially wizard powers, but it's still something to consider.
Speaker 1
00:59
And for the casual user, that's maybe where they stop. And I think it's a shame, because it's amazing tech that will only get better. So today I wanted to introduce you to two free tools that allow you to do a lot of testing quickly and without signing up at all. The first is Craiyon, which you see here on the screen.
Speaker 1
01:18
It used to be called DALL-E mini. The second is the DeepAI text-to-image generator. Now, they both do exactly the same thing. They're 100% free. They require no registration.
Speaker 1
01:29
You just type in your text prompt in the box here, you hit submit, and you wait for the AI to come back with an image. So they do away with all of the extra bells and whistles that the other models give you, like defining the number of iterations the AI should use, tweaking the image settings, deciding which algorithm it should use behind the scenes, and all of this nitty-gritty stuff that's maybe overwhelming for the casual user. Instead, both DeepAI and Craiyon force you to just focus on your text prompt, and since that's basically at the core of how you will talk to these AI art generators, it's actually the most important skill to master early on. The good news as well with these two tools is that they return multiple images for each text prompt.
Speaker 1
02:19
So Craiyon will give you 9 and DeepAI will give you 4, which is great if you're trying to see whether your text prompt gives the same result each time, because if most of the 9 images look correct, you're probably at a place where you can take that text prompt and move it on to one of the better models to produce your final result. So before we get started, two caveats. These two models are definitely of poor quality, so what you'll get out here is not nearly as polished as what you can expect from the more sophisticated models. So don't use them as a benchmark for what AI is capable of; you'll be disappointed with these ones.
Speaker 1
02:55
You're not looking for perfection. You are instead looking for consistency. So you want to get as many of the images as possible to return what you want, to make sure that your text prompt is good. And the second caveat is that just because something works in Craiyon and DeepAI, it doesn't 100% guarantee that popping that same text prompt into another tool will give you the same outcome. The models are different, so what happens behind the scenes, the algorithms that run, may spit out different results. But you are very likely to be further ahead in your iterative process, and you'll save yourself many steps before you start spending credits on the paid tools.
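Editor's sketch: the consistency check described above can be written down as a simple ratio. This is only an illustration, not something shown in the video. The function names and the "more than half" threshold are assumptions, loosely based on the remark that a prompt is ready once most of the 9 images look correct. The True/False judgements are made by eye; nothing here inspects the images themselves.

```python
def consistency(matches):
    """Return the share of generated images that matched the intended idea.

    `matches` holds one True/False judgement per image, made by eye.
    """
    if not matches:
        raise ValueError("need at least one image judgement")
    return sum(matches) / len(matches)


def ready_to_move_on(matches, threshold=0.5):
    """Assumed rule of thumb: move on once more than `threshold` of the images match."""
    return consistency(matches) > threshold


# One judgement per image in a 9-image grid (illustrative values only).
judgements = [True, True, False, True, True, False, True, False, True]
print(f"{consistency(judgements):.2f}", ready_to_move_on(judgements))
```

Raising the threshold makes the check stricter, which may be worth it before spending paid credits.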
Speaker 1
03:36
So, normally I'd recommend having the two of them open at the same time, so you can throw the same text prompt into both DeepAI and Craiyon. For the sake of this video, I'll just be using Craiyon to show you how the process works. So without further ado, let's get started. Now, for my test case, I wanted the AI to give me a sort of dark, dystopian movie poster of a giant mouse hovering over a city.
Speaker 1
04:01
Now, you probably already know what that might look like based on the many monster movies you've been exposed to. So it's something dark and sinister and maybe black and white. Now, following my own advice, which you can read about in the blog post that accompanies this video, I'm going to start out with the minimum possible amount of information and iterate from there. So here is what my first prompt will look like.
Speaker 1
04:26
It's "a giant mouse looking down on a city movie poster". Let's see what that returns. Now I'm gonna press go. Usually this model takes around one minute; it says here up to two minutes, but in my experience it's about one minute to return an image. I'm going to pause the video and return to it when it's done.
Speaker 1
04:45
Okay, so we have our first output, and you can clearly see that this is very far from what I had in mind. Yeah, we have a giant mouse, we do have a city, but this is definitely not the sinister dystopian look I was going for. It's very goofy and cartoonish and laughable, right? So clearly we need to do something with our text prompt to get something slightly better.
Speaker 1
05:08
Before I do that, though, one housekeeping note: every time you generate anything, no matter how far it is from what you want, I strongly recommend you take note of which text prompt you used and, somehow, also the images you got. Craiyon is fantastic for this because it has this tiny screenshot button, and pressing it generates a screenshot of all 9 images you see here on the screen, along with the text prompt that generated them. So in the future, you can look through all of your generated images and see how the different tags you've added or removed affected the final outcome, which is super useful because it will help you get better. So now, what do I want to change here?
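Editor's sketch: if you prefer a text log alongside the screenshots, the housekeeping habit above could be kept in a small file. This is a minimal illustration, assuming you save each screenshot yourself. The function names, the JSON Lines layout, and any file names you pass in are hypothetical and not part of Craiyon or DeepAI.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_attempt(log_path, prompt, screenshot, notes=""):
    """Append one prompt attempt to a JSON Lines file and return the entry."""
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "screenshot": screenshot,  # path to the screenshot you saved by hand
        "notes": notes,            # e.g. "too cartoonish"
    }
    with Path(log_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def read_log(log_path):
    """Load every logged attempt, oldest first."""
    with Path(log_path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Calling `log_attempt` once per generation and reading the file back later with `read_log` gives a prompt-by-prompt history to compare against the saved screenshots.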
Speaker 1
05:52
I mean, this is too cartoonish, or maybe too unrealistic, right? I want something more realistic. So I'm going to try to say "photo of a giant mouse looking down on the city", and I'm gonna remove this "movie poster" because that doesn't seem to have any benefit. And I want it to be sort of hovering over the city, so I want to be looking up at it from the ground, so I'm gonna write "view from the ground". And let's see what that gives us. Okay, so arguably this is potentially even worse than the first image. It's not a giant mouse anymore, but it is photorealistic.
Speaker 1
06:27
So, you know, I can't blame the computer for giving me what I want. And it also literally gave me the view of the ground instead of the view from the ground. So I guess these kinds of relational terms are not quite good for feeding the model. So I need something that gives me the angle that I want, that I'm looking up at it, and I need something that gives it that sinister look.
Speaker 1
06:47
So here's what I'm trying next, and that's "silhouette of a giant mouse looming over a city", so it gives it that sinister feel, "seen from below". I'm not mentioning the ground.
Speaker 1
06:57
So hopefully the ground will not be in the picture. Let's see how that goes. Okay, I feel like we're slowly getting there. I mean, the aesthetic is kind of getting through now: it's a bit dark, it's a bit sinister, and we do have that view from below. But I'm not happy with the fact that it's not really a giant mouse and that we get the whole mouse body squeezed into it. So clearly this is not the giant mouse looming over a city and taking up the bigger part of the screen that I want.
Speaker 1
07:25
So what I'm gonna try next is to say "torso of a giant mouse looming over a city seen from below", right? This should give the AI something to go on. Let's find out. So this is nightmare fuel, and definitely not in the way that I wanted.
Speaker 1
07:44
So now the torso is, like, anatomically correct-ish and awful, and it takes over the entire picture, so not what we want. And it looks like dropping the silhouette part was a terrible idea because it took away that sinister aesthetic, so I want to put that back in again. Shall we try to keep the torso and see if the silhouette helps us fix it? Let's see, because the rest I think might work. Let's keep it this way.
Speaker 1
08:18
Okay, so this was sort of a step in the right direction, in that it does look a bit more sinister and dark because of the silhouette term, but clearly the torso part is still too dominant and doesn't give us anything close to what we want. So I'll have to drop the torso part. And I'm thinking of actually trying to go back to my idea of using a photo image in combination with the silhouette, to see if we can... of a silhouette of a... So here's what I'm trying to do here. I'm trying to add the silhouette back in.
Speaker 1
08:53
I'm trying to give it that photo look, that it's a realistic monster, right? I'm adding the word "scary" because, well, it shouldn't be a happy mouse. "Anthropomorphic", so that it is a human-like mouse. And "looming over a city seen from below" seems to work well, so I'm keeping that in. Let's see where that gets us.
Speaker 1
09:10
Yes, okay, now we're really getting somewhere, I feel. We've captured the sort of dystopian look with the black and white silhouette. We have a lot of the angles correct in many of the pictures, I think especially the top right seems good. It's a giant mouse, clearly bigger than the city, looming over a city.
Speaker 1
09:30
Some of the images are a bit queer and goofy and off-angle, but this is the closest we've been to my initial thought. So what I'm gonna try, to see if it improves or worsens things, is to add something else I had in mind: "with laser eyes".
Speaker 1
09:50
Bear with me, bear with me. Maybe it will give it that even more futuristic, sinister look. So let's try to see what that gets us. Yeah, so that was not a good idea and we took a step back because instead of laser eyes we have a mouse having a bad trip at a rave.
Speaker 1
10:10
So I'm gonna remove the laser eyes and take us back. What I'm going to try is to use the adjective "intricate", which, from what I see on some forums, gets the AI to flesh out some of the details. And I'm going to double down on the whole scary, sinister mouse. So I gave it a few more adjectives.
Speaker 1
10:31
Let's see if that moves us in the right direction. I actually think we got it. I actually think this is very close. It returns a lot of the images that hit on the points that I need.
Speaker 1
10:41
I like the one in the top right. I like, potentially, this one and this one, and maybe even this one, and actually that one. So we have 5 or 6 images that are workable. So this tells me we're getting very close to telling the AI what it is we want and getting the images out consistently.
Speaker 1
10:58
I'm just going to try one more idea that I thought of. Normally this is a very good move if you're trying to hit the specific style of an artist that you know, because the AI has been trained on images by different artists and it can usually mimic that aesthetic extremely well. So what I had in mind is to try and give it a modifier, "by Frank Miller". I'm thinking of Sin City fame, right? It fits really well with this dark black-and-white aesthetic, so I'm gonna see if that improves things further.
Speaker 1
11:32
Okay, so I made a bit of a goof when pasting my prompt last time, because I actually dropped the "intricate" and "sinister" modifiers, which I liked from before, so I had to rerun the prompt with those and "by Frank Miller". And now I am of two minds about the result here. I don't feel like adding Frank Miller has improved the aesthetic too much in this case. It's still the same look, but it seems to have thrown a wrench in how the AI places the mouse, because only a few images are really good. Maybe this one is great, although it's Mickey Mouse-ish, and this one, but it's much less consistent.
Speaker 1
12:09
So I'm going to go ahead and remove the "by Frank Miller" modifier, and I generally recommend doing this a lot. If you see an element that takes you a step back, don't be afraid to throw it out. You've seen me do this three times in this video. We've dropped the horrifying "torso" modifier.
Speaker 1
12:26
We've dropped the laser eyes because the AI didn't quite know what to do with them and where to place the lasers. And now I'm going to go ahead and drop the Frank Miller modifier as well. So the last thing I've done now is gone to DeepAI and pasted the same prompt into it. I can actually run it again just to see if the consistency holds.
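Editor's sketch: the add-a-modifier, drop-a-modifier loop from this walkthrough can be pictured as list editing. This is only an illustration. The helper names are hypothetical, joining the parts with commas is an assumption about prompt layout, and the exact final prompt is never read out in the video, so the example strings below only reuse phrases that were spoken.

```python
def build_prompt(subject, modifiers):
    """Join a subject and its modifiers into one comma-separated text prompt."""
    return ", ".join([subject] + [m for m in modifiers if m])


def without(modifiers, unwanted):
    """Return a copy of the modifier list with one modifier dropped."""
    return [m for m in modifiers if m != unwanted]


subject = "silhouette of a giant mouse looming over a city"
modifiers = ["seen from below", "with laser eyes"]

# Try the prompt with every modifier first.
print(build_prompt(subject, modifiers))

# The laser eyes were a step back, so drop that one modifier and rerun.
modifiers = without(modifiers, "with laser eyes")
print(build_prompt(subject, modifiers))
```

Keeping modifiers as separate list items makes it easy to remove exactly one element and rerun, which mirrors the three drops made in the video.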
Speaker 1
12:49
That's the last thing I'm going to do, because right now I'm at a place where I think the aesthetic and the angle and the look are quite close to where I want them, and I feel comfortable taking this text prompt into a more sophisticated program where I have a bit more control over the prompt weights, other settings and so on, where I can maybe get it to the final version. But I have now saved myself quite a bit of time, and potentially money, working with those other programs. So this model doesn't spit out a mouse that looks quite as scary, but I can see the consistency in where the city is and where the mouse is. So I think I'm quite happy.
Speaker 1
13:32
I could probably iterate further, but this gives you a very good idea of how to work with these two tools at no cost and very quickly. I think it took me about 10 minutes, without the video recording part, to run through these 10 or so iterations back and forth and get to a prompt that I'm quite happy with and can work further with. So I'd be very happy to see you try it for yourself and see if this works for you. Share your experiences and let me know how it goes, and if you found it useful, then, you know, let me know so I can make more videos in the future. Bye!
Omnivision Solutions Ltd