Better Programming

Advice for programmers.

Training DALL-E 2: Generating Art Styles, Photorealistic Images, and Detailed Prompts

Increase the quality of Dall-E 2 generations.

Elenee Ch
Published in Better Programming
5 min read · Jul 11, 2022

In the previous article, we discussed biases, assumptions, and filters. Today I’d like to talk more about how to get the results you want without generating too many variations.

I’ve already said how excited I was when I got access, but when I realized that we were all limited to 50 generations per day, I decided to use them mindfully. So, where’s the dog buried? It turns out that if you know how to talk to the system, you can save quite a number of generations and still get what you want.

1. Use commas to separate the content in your prompt and, to get better results, mimic the title of the original file. Imagine you want to generate something in Paul Cézanne’s style. No matter how you write your prompt, the system will do a good job of mimicking the artist’s style, and if you’re not satisfied right away, you will most likely find what you want after a couple of variations. But you can help the system by mentioning key information in your original prompt, such as the original title, year, medium, and colors. For example:
Prompt 1: Mother hen and 4 little cute chickens are playing in the yard, Paul Cézanne style
Prompt 2: A vivid oil painting of a Mother hen and 4 little chickens playing in the yard by Paul Cézanne, 1894 59 × 72,4 cm Collection particulière aux États-Unis // Original title: Peinture à l’huile sur toile de Paul Cézanne, 1894, 59 × 72,4 cm, Collection particulière aux États-Unis

The detailed description produced better results on the first attempt than the very general one did. My suggestion: if you want to generate something specific, be as detailed as possible.
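If you write many prompts in this catalog-like style, a small helper can keep the comma-separated structure consistent. The following is only a minimal Python sketch: `ArtworkDetails` and `build_art_prompt` are hypothetical names made up for illustration, they are not part of DALL-E 2 or any library, and the code does nothing more than assemble text that you would then paste into the prompt box.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArtworkDetails:
    """Catalog-style details for a prompt (hypothetical fields, for illustration)."""
    artist: str
    year: Optional[int] = None
    medium: Optional[str] = None       # e.g. "A vivid oil painting"
    dimensions: Optional[str] = None   # e.g. "59 × 72,4 cm"
    collection: Optional[str] = None


def build_art_prompt(subject: str, details: ArtworkDetails) -> str:
    """Join the subject and whichever details are known, separated by commas."""
    # Start with "<medium> of <subject>" when a medium is given, else the bare subject.
    lead = f"{details.medium} of {subject}" if details.medium else subject
    parts = [f"{lead} by {details.artist}"]
    if details.year is not None:
        parts.append(str(details.year))
    if details.dimensions:
        parts.append(details.dimensions)
    if details.collection:
        parts.append(details.collection)
    return ", ".join(parts)


details = ArtworkDetails(
    artist="Paul Cézanne",
    year=1894,
    medium="A vivid oil painting",
    dimensions="59 × 72,4 cm",
    collection="Collection particulière aux États-Unis",
)
print(build_art_prompt("a mother hen and 4 little chickens playing in the yard", details))
```

Optional fields are skipped when they are missing, so the same helper covers both a bare "subject by artist" prompt and a fully detailed one.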

2. When it comes to generating photorealistic images, name a lens instead of using adjectives like realistic, photorealistic, life-like, or real-life. Example: you want to generate a close-up photo of a sleeping cat. A general prompt may look like this: “a sleeping chubby British short haired cat, realistic”. So far it looks normal, but once you see the result, you may need to generate a couple of variations with extra details added in order to get your photorealistic cat. You can save time by adding a lens to your prompt. I literally had a eureka moment when I realized that. I thought: if the system gets data about mediums, why can’t it have access to the detailed info of a photo? I was mind-blown when I checked the results.

Prompt 1: a sleeping chubby British short haired cat, realistic
Prompt 2: a chubby British short haired cat, Samyang/Rokinon Xeen 50mm T1.5

I didn’t have to generate any more variations, because I was totally satisfied with the initial results. This little trick can help you generate human features as well. If you remember, in the previous article I said that generating real-life people or features is against policy, which is why you will almost never see realistic faces or people. Now check this.

Left: “Michelangelo, David, full body, OM system 12–40mm PRO II 40mm, 1/100 sec, f/2.8, ISO 800” // Right: “A very detailed statue of two philosophers judging passerby’s outfits in the middle of a street, Italy, afternoon, Sigma 40mm f/1.4 DG HSM”

Even though the system still filtered out the “full-body” part of my prompt, the overall facial features looked pretty realistic to me.

That’s not all. Let’s see how this works when you want to generate real-life facial features.

I’ve observed that whenever you ask Dall-E 2 to generate a human plus some random subject, it will avoid including human features unless you specifically ask it to focus on the human. So I wanted to check what I would get if I asked for something like this: “red-head girl is holding a chubby ginger cat, photorealistic”

Prompt 1: red-head girl is holding a chubby ginger cat, photorealistic

Well, not what I was expecting, but at least there are two decent results that I could use to generate more variations. Remember, though, that we want to narrow down the options as fast as possible. Instead of using “photorealistic”, I will suggest a preferred lens.

Prompt 2: red-head girl is holding a chubby ginger cat, Samyang/Rokinon Xeen 50mm T1.5

Slick! I know! This is honestly such a game changer; the quality is nonsensically amazing!
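The swap I did by hand above (drop the realism adjective, add a lens) is easy to script if you keep your prompts in a file. Here is a rough Python sketch under a few assumptions: the word list comes from the adjectives mentioned in this article, `swap_adjectives_for_lens` is a hypothetical helper name, and it only removes an adjective when it stands alone between commas. It edits text only and does not call DALL-E 2.

```python
# Realism adjectives named in the article; extend the tuple as needed.
REALISM_WORDS = ("photorealistic", "realistic", "life-like", "real-life")


def swap_adjectives_for_lens(prompt: str, lens: str) -> str:
    """Drop stand-alone realism adjectives and append a lens name instead."""
    # Split on commas, since the prompt uses commas to separate its parts.
    parts = [part.strip() for part in prompt.split(",")]
    # Keep every non-empty part that is not one of the realism adjectives.
    kept = [part for part in parts if part and part.lower() not in REALISM_WORDS]
    return ", ".join(kept + [lens])


print(
    swap_adjectives_for_lens(
        "red-head girl is holding a chubby ginger cat, photorealistic",
        "Samyang/Rokinon Xeen 50mm T1.5",
    )
)
```

An adjective buried inside a longer phrase, such as “a realistic cat”, is left untouched on purpose, so the helper never rewrites the subject itself.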

Bonus

Prompt: Pomegranate, OM system 12–40mm PRO II 40mm, 1/100 sec, f/2.8, ISO 800

Now you can generate cinematic shots and give them a certain vibe.

Left: “Shinkansen in the rainy weather, Sigma 50mm T1.5 FF High-Speed Prime” // Right: “A serene photo of a mountain blue lake and a boat, OM-D E-M5 Mark III | M.Zuiko Digital ED 12–40mm F2.8 PRO | 1/50sec | F9 | ISO64”
Prompt: A rainy day in Tbilisi streets, Sigma 50mm T1.5 FF High-Speed Prime

Even though there’s hardly any hint that I asked the system to generate Tbilisi, the mood, colors, and some of the details do match Tbilisi’s streets.

Prompt: A hot summer day in old streets of Tbilisi, Georgia, Sigma 50mm T1.5 FF High-Speed Prime
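To give a whole series of scenes the same look, it can help to keep the lens and exposure details in one place and reuse them. This is a small Python sketch, not an official format: `Shot` and `with_shot` are hypothetical names, and the comma-separated order (lens, shutter, aperture, ISO) simply follows the prompts above. Whether every field actually affects the result is something you would need to test yourself.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Shot:
    """Lens plus optional exposure details (hypothetical structure)."""
    lens: str
    shutter: str = ""
    aperture: str = ""
    iso: str = ""

    def suffix(self) -> str:
        """Return the non-empty fields joined with commas."""
        fields = [self.lens, self.shutter, self.aperture, self.iso]
        return ", ".join(field for field in fields if field)


def with_shot(scenes: List[str], shot: Shot) -> List[str]:
    """Append the same camera suffix to every scene description."""
    return [f"{scene}, {shot.suffix()}" for scene in scenes]


sigma = Shot(lens="Sigma 50mm T1.5 FF High-Speed Prime")
scenes = [
    "Shinkansen in the rainy weather",
    "A rainy day in Tbilisi streets",
    "A hot summer day in old streets of Tbilisi, Georgia",
]
for prompt in with_shot(scenes, sigma):
    print(prompt)
```

Changing only the `Shot` and keeping the scenes fixed makes it easy to compare how different lenses change the mood of the same subject.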

Written by Elenee Ch

Haya! 👋 Ene here! I spend my free time making illustrations, doing AI research, and reading about UX/UI.
