Training DALL-E 2: Generating Art Styles, Photorealistic Images, and Detailed Prompts
Increase the quality of DALL-E 2 generations.

In the previous article, we discussed biases, assumptions, and filters. Today I’d like to talk about how to get the results you want without generating too many variations.
I’ve already said how excited I was when I got access, but when I realized that we were all limited to 50 generations per day, I decided to use them mindfully. So, what’s the trick? Well, basically, if you know how to talk to the system, you can save quite a few generations and still get what you want.
1. Use commas to separate the content in your prompt, and to get better results, mimic the title of the original work. Say you want to generate something in the style of Paul Cézanne: no matter how you write your prompt, the system will do a decent job of mimicking the artist’s style, and after a couple of variations you will probably find what you want. But you can help the system by mentioning key information in your original prompt, such as the original title, year, medium, and colors. For example:



It’s pretty obvious that the detailed description produced better results on the first attempt than the very general one did. My suggestion: if you want to generate something specific, be as detailed as possible.
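If you’re scripting your generations rather than clicking around the web UI, the same idea carries over. Here’s a minimal sketch, assuming the OpenAI Python SDK and an OPENAI_API_KEY in the environment; the API call and the specific Cézanne prompt below are illustrative only, not the exact prompt I used above.

```python
# A minimal sketch, assuming the OpenAI Python SDK (openai >= 1.0) and an
# OPENAI_API_KEY set in the environment. The specific prompt wording here
# is illustrative, not the exact prompt from the examples above.
from openai import OpenAI

client = OpenAI()

# A very general prompt vs. one that mimics the original title and adds
# year, medium, and colors, separated by commas.
general_prompt = "a mountain landscape in the style of Paul Cezanne"
detailed_prompt = (
    "Mont Sainte-Victoire, Paul Cezanne, 1904, oil on canvas, "
    "post-impressionism, muted greens, ochres and blues"
)

result = client.images.generate(
    model="dall-e-2",          # DALL-E 2 image model
    prompt=detailed_prompt,    # swap in general_prompt to compare
    n=1,                       # one image at a time saves your daily quota
    size="1024x1024",
)
print(result.data[0].url)      # URL of the generated image
```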
2. When it comes to generating photorealistic images, use lenses instead of adjectives like realistic, photorealistic, life-like, or real-life. Example: you want to generate a close-up photo of a sleeping cat. The general prompt might look like this: “a sleeping chubby British short haired cat, realistic”. So far it looks fine, but once you see the result, you may need to generate a couple more variations, adding extra details, in order to get your photorealistic cat. You can save time by adding lenses to your prompt. I literally had a Eureka moment when I realized that: if the system gets data about mediums, why wouldn’t it also have access to the detailed metadata of a photo? My mind was blown when I checked the results.




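In code form, the lens trick is just a matter of swapping the “realistic” adjective for camera metadata. A minimal sketch; the helper name and the example lens values are just illustrations, not a fixed recipe.

```python
# A minimal sketch of the lens trick: replace adjectives like "realistic"
# with camera/lens details the model has seen in real photo captions.
# The helper name and the example lens values are illustrations only.
def with_lens(base_prompt: str,
              lens: str = "85mm f/1.8 lens",
              extras: str = "shallow depth of field, natural light") -> str:
    """Append lens metadata instead of 'realistic'-style adjectives."""
    return f"{base_prompt}, {lens}, {extras}"

adjective_prompt = "a sleeping chubby British short haired cat, realistic"
lens_prompt = with_lens("close-up photo of a sleeping chubby British short haired cat")
print(lens_prompt)
# close-up photo of a sleeping chubby British short haired cat,
# 85mm f/1.8 lens, shallow depth of field, natural light
```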
I didn’t have to generate any more variations, because I was totally satisfied with the initial results. This little trick can help you generate human features as well. If you remember, in the previous article I said that generating real-life people or features is against the content policy, which is why you will almost never see realistic faces or people. Now check this:


Even though the system still filtered out the “full-body” part of my prompt, the overall facial features looked pretty realistic to me.
That’s not all. Let’s see how this works when you want to generate real-life facial features.
I’ve observed that whenever you ask DALL-E 2 to generate a human plus some other subject, unless you specifically ask it to focus on the human, it tends to avoid rendering full human features. So I wanted to check what I would get if I asked it to generate something like this: “red-head girl is holding a chubby ginger cat, photorealistic”

Well, not what I was expecting, but at least there are two decent variations I could use to generate more. Remember, though, we want to narrow down the options as fast as possible, so instead of using “photorealistic”, I will specify the preferred lenses.


Slick, I know! This is honestly such a game changer; the quality is ridiculously good!
Bonus


Now you can generate cinematic shots and give them a certain vibe.




Even though there’s hardly any obvious hint that I asked the system to generate Tbilisi, the mood, the colors, and some of the details do match the streets of Tbilisi.
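If you want to reuse the recipe, the pieces are easy to combine: a subject, a place, a lens or film stock, and a mood. A tiny sketch with hypothetical values, not my exact prompt:

```python
# A tiny sketch for assembling cinematic prompts from parts; the helper
# name and the example values are hypothetical, not the exact prompt I used.
def cinematic_prompt(subject: str, place: str, lens: str, mood: str) -> str:
    """Combine subject, place, lens/film, and mood into one comma-separated prompt."""
    return f"{subject}, {place}, cinematic still, {lens}, {mood}"

print(cinematic_prompt(
    subject="an empty cobblestone street at dawn",
    place="old Tbilisi",
    lens="35mm film, anamorphic lens",
    mood="soft morning light, melancholic mood",
))
```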


