DALL-E Facts for Kids

Quick facts for kids
DALL·E
Watermark present on DALL·E images
An image generated by DALL·E 2, from the prompt "Teddy bears working on new AI research underwater with 1990s technology"
Developer(s)	OpenAI
Initial release	5 January 2021

Stable release	DALL·E 3 / 10 August 2023

Type	Text-to-image model

DALL·E, DALL·E 2, and DALL·E 3 are special computer programs. They were made by a company called OpenAI. These programs use a type of artificial intelligence called deep learning. Their job is to create digital images from simple descriptions. These descriptions are often called "prompts".

The first DALL·E program came out in January 2021. Its improved version, DALL·E 2, was announced in April 2022. DALL·E 3 became available in October 2023. It was built directly into ChatGPT for ChatGPT Plus and Enterprise users. Microsoft also uses DALL·E in its Bing Image Creator tool.

How DALL·E Started

OpenAI first showed DALL·E on January 5, 2021. It used a special version of an AI model called GPT-3. This model was changed to create pictures instead of just text.

On April 6, 2022, OpenAI announced DALL·E 2. This new version could make pictures look more real. It also created them with better detail. DALL·E 2 could mix different ideas, styles, and features.

Later, in July 2022, DALL·E 2 became available to more people. Users could make a certain number of images for free each month. They could also buy more if they wanted. Before this, only a few researchers could use it. This was because of concerns about how AI might be used.

By September 2022, DALL·E 2 was open for everyone to use. In September 2023, OpenAI announced DALL·E 3. This newest version can understand even more details in your prompts. It can also follow complex instructions better.

OpenAI also made DALL·E 2 available to other companies through an API, a tool that lets one program use another. This means developers can add DALL·E to their own apps. Microsoft added DALL·E 2 to its Designer app and Bing Image Creator.

The name DALL·E is a mix of two names. It comes from WALL-E, a robot character from a Pixar movie. It also comes from Salvador Dalí, a famous artist.

In February 2024, OpenAI started adding hidden watermarks to images made by DALL·E. These watermarks help show that the image was created by AI.

How DALL·E Works

DALL·E uses a type of AI model called a Transformer. These models are very good at finding patterns. OpenAI made its first Transformer-based GPT model, GPT-1, in 2018. It grew into GPT-2 in 2019 and then GPT-3 in 2020.

DALL·E is a special version of GPT-3. It was taught using many pairs of images and text descriptions. It learns how words relate to different parts of pictures. When you give it a prompt, it uses this knowledge. It then creates a new image that matches your words.

DALL·E was released along with another AI model called CLIP. CLIP stands for Contrastive Language-Image Pre-training. CLIP's job is to check DALL·E's work. It scores the images that DALL·E creates and helps pick the ones that best match the prompt. CLIP was trained on about 400 million image and text pairs from the internet. It learns to understand what an image is about.
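Here is a small Python sketch of how picking the best image could work. It is only an illustration, not OpenAI's real code. It assumes the prompt and each picture have already been turned into short lists of numbers. The numbers and the names `prompt_code`, `image_codes`, and `rank_images` are made up for this example.

```python
import math

def cosine_similarity(a, b):
    """Return how closely two lists of numbers point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_images(prompt_code, image_codes):
    """Sort picture names from the best match for the prompt to the worst."""
    return sorted(
        image_codes,
        key=lambda name: cosine_similarity(prompt_code, image_codes[name]),
        reverse=True,
    )

# Made-up numbers: a real model would use hundreds of numbers per picture.
prompt_code = [0.9, 0.1, 0.0]
image_codes = {
    "picture_a": [0.1, 0.9, 0.2],
    "picture_b": [0.8, 0.2, 0.1],
    "picture_c": [0.0, 0.1, 0.9],
}

ranking = rank_images(prompt_code, image_codes)
print(ranking[0])  # picture_b
```

The picture whose numbers point in nearly the same direction as the prompt's numbers gets the highest score, so it is shown first.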

DALL·E 2 uses a different method called a diffusion model. Imagine an image starting as random noise. A diffusion model slowly removes the noise. It adds details until a clear image appears. This process is guided by the text prompt you give it.
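The step-by-step idea can be shown with a toy Python sketch. This is not a real diffusion model. A real one uses a trained neural network to guess what the clean picture looks like. Here we cheat and hand the program the answer (`target`), just to show how noise shrinks a little at every step. The names `denoise_step` and `generate` and the `strength` setting are made up for this example.

```python
import random

def denoise_step(image, guess_of_clean_image, strength=0.2):
    """Move every pixel a small way from its noisy value toward the guess."""
    return [p + strength * (g - p) for p, g in zip(image, guess_of_clean_image)]

def generate(target, steps=30, seed=0):
    """Start from random noise and clean it up one small step at a time."""
    rng = random.Random(seed)
    image = [rng.uniform(0.0, 1.0) for _ in target]  # pure random noise
    for _ in range(steps):
        # Stand-in for the neural network: we already know the clean picture.
        image = denoise_step(image, target)
    return image

# A tiny 4-pixel "picture" standing in for what the prompt describes.
target = [0.0, 1.0, 1.0, 0.0]
result = generate(target)
error = max(abs(r - t) for r, t in zip(result, target))
print(error)  # a very small number: the noise is almost gone
```

Each step removes one fifth of the remaining noise, so after 30 steps almost none is left.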

What is CLIP?

CLIP is a way to train two AI models together. One model takes in text and turns it into a special code. The other model takes in an image and turns it into a special code.

These models are trained using many image-caption pairs. The goal is for the text code and image code to be very similar if they match. If they don't match, their codes should be very different. This helps the AI understand how words and pictures relate.
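The training goal can be sketched in a few lines of Python. This is a simplified illustration, not CLIP's actual code. It scores only one direction (matching each text to its image), while the real method also scores the other direction. The tiny codes, the `temperature` value, and the function names are assumptions made up for this example.

```python
import math

def dot(a, b):
    """Multiply two codes together, number by number, and add up the results."""
    return sum(x * y for x, y in zip(a, b))

def contrastive_loss(text_codes, image_codes, temperature=0.1):
    """Return a score that is small when text i matches image i best."""
    total = 0.0
    for i, text in enumerate(text_codes):
        scores = [dot(text, image) / temperature for image in image_codes]
        biggest = max(scores)  # subtracting this keeps the math stable
        exps = [math.exp(s - biggest) for s in scores]
        prob_correct = exps[i] / sum(exps)  # chance given to the right image
        total += -math.log(prob_correct)
    return total / len(text_codes)

text_codes = [[1.0, 0.0], [0.0, 1.0]]

matching_images = [[1.0, 0.0], [0.0, 1.0]]   # each image code equals its caption's code
mixed_up_images = [[0.0, 1.0], [1.0, 0.0]]   # the images are swapped

good_loss = contrastive_loss(text_codes, matching_images)
bad_loss = contrastive_loss(text_codes, mixed_up_images)
print(good_loss, bad_loss)
```

Training changes the two models so that this score gets smaller, which pushes matching codes together and mismatched codes apart.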

What DALL·E Can Do

DALL·E can create many kinds of images. It can make realistic photos, paintings, or even emoji. It can also move and change objects within its pictures. It can place things correctly without you telling it exactly where. For example, if you ask for a radish blowing its nose, DALL·E will draw the handkerchief in a good spot.

DALL·E can also guess details that you don't mention. If you ask for something related to Christmas, it might add Christmas decorations. It can also add shadows to images even if you don't ask for them. DALL·E understands many different art and design styles.

It can create images for a wide range of descriptions, even unusual ones. It can even combine ideas, which is a key part of human creativity. DALL·E can also solve visual puzzles. These are similar to puzzles used in tests of human intelligence, such as Raven's Matrices.

An image of accurate text generated by DALL·E 3 based on the text prompt "An illustration of an avocado sitting in a therapist's chair, saying 'I just feel so empty inside' with a pit-sized hole in its center. The therapist, a spoon, scribbles notes"

DALL·E 3 is even better at following complex instructions. It can also create text within images more clearly and correctly. DALL·E 3 is built right into ChatGPT Plus.

Changing Images

Two "variations" of Girl With a Pearl Earring generated with DALL·E 2

If you give DALL·E 2 an existing image, it can make "variations" of it. These are new images that are similar to the original. It can also edit the image. This includes adding to it or filling in missing parts.

DALL·E 2 has features called "inpainting" and "outpainting." Inpainting lets you fill in a missing area of an image. Outpainting lets you expand an image beyond its original edges. DALL·E uses the existing parts of the image to make sure new parts fit in. It matches shadows, reflections, and textures.
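The idea of using the existing parts of a picture to fill a gap can be shown with a toy Python sketch. DALL·E 2 uses a trained AI model for this. The sketch below swaps in a much simpler trick, averaging the nearest known pixels, and works on a single row of brightness values. The function name `inpaint_row` and the use of `None` for a missing pixel are made up for this example.

```python
def nearest_known(pixels, start, step):
    """Walk left (step=-1) or right (step=1) to find the nearest known pixel."""
    i = start + step
    while 0 <= i < len(pixels):
        if pixels[i] is not None:
            return pixels[i]
        i += step
    return None

def inpaint_row(pixels):
    """Fill each missing pixel (None) with the average of its nearest known neighbors."""
    filled = list(pixels)
    for i, value in enumerate(pixels):
        if value is None:
            neighbors = [nearest_known(pixels, i, -1), nearest_known(pixels, i, 1)]
            known = [v for v in neighbors if v is not None]
            # If nothing is known at all, fall back to 0 (black).
            filled[i] = sum(known) / len(known) if known else 0
    return filled

# Inpainting: the gap is in the middle of the row.
print(inpaint_row([10, 10, None, None, 30, 30]))  # [10, 10, 20.0, 20.0, 30, 30]

# Outpainting: the gap is past the edge of the original row.
print(inpaint_row([5, 5, None]))  # [5, 5, 5.0]
```

A real model does far more than average, which is how it can continue shadows, reflections, and textures.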

What DALL·E Can't Do Well

DALL·E 2 doesn't always understand language perfectly. It might mix up "A yellow book and a red vase" with "A red book and a yellow vase." It can also struggle when a prompt asks for more than three objects or says what to leave out. Sometimes, features might appear on the wrong object.

DALL·E 2 also has trouble with text. Even if the letters look clear, the words often don't make sense. DALL·E is also not very good at scientific images, like astronomy or medical pictures.

An attempt to generate Japanese text using the prompt "a person pointing at a tanuki, with a speech bubble that says 'これは狸です！'", which results in the text being rendered with nonsensical kanji and kana

Important Things to Think About

DALL·E 2 learns from huge amounts of public data. This can sometimes lead to unfair results. For example, if you ask for "a doctor," it might show more men than women. To fix this, OpenAI sometimes secretly adds words to your prompts. For example, it might add "black man" or "Asian woman" to make results more diverse.

One concern is that DALL·E and similar tools could be used to create fake images. These fake images could spread wrong information. To try and stop this, the software blocks prompts about famous people. It also checks uploaded images for harmful content. However, people can sometimes find ways around these blocks. For example, "blood" might be blocked, but "red liquid" might not be.
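A short Python sketch shows why a simple word block is easy to get around. This is not how OpenAI's filter really works, which has not been published. It assumes the simplest possible filter: a list of banned words. The list `BLOCKED_WORDS` and the function `is_allowed` are made up for this example.

```python
# Hypothetical blocklist, made up for this example.
BLOCKED_WORDS = {"blood"}

def is_allowed(prompt):
    """Return True if none of the prompt's words are on the blocklist."""
    words = [word.strip(".,!?\"'") for word in prompt.lower().split()]
    return not any(word in BLOCKED_WORDS for word in words)

print(is_allowed("a puddle of blood"))       # False
print(is_allowed("a puddle of red liquid"))  # True
```

The filter only looks at the exact words, not at what they mean, so a different phrase for the same thing slips through.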

Another concern is that these AI tools could affect jobs for artists and designers. DALL·E 3 tries to ease one part of this worry. It is designed to block users from creating art in the style of living artists.

Other Similar Programs

OpenAI has not shared the source code for DALL·E. Because of this, other groups have tried to create their own similar programs. Some of these are open-source models, which means anyone can read and use their code.

One example is Craiyon, which used to be called DALL·E Mini. It was released in 2022. Craiyon was trained on data from the internet. It became very popular for creating funny images.
