DALL-E: Text to Image Generation

At a glance

Date: January 2021
Era: The Transformer Era (2018–2021)
Category: Research Breakthroughs
Impact: 4 / 5
Organizations: OpenAI

What Happened

OpenAI unveiled DALL-E, a model that generates images from text descriptions; its rendering of 'an armchair in the shape of an avocado' became iconic. Built as a 12-billion-parameter version of GPT-3 adapted for images, it showed that the approach behind language models could bridge the gap between text and visual creativity.

Why It Mattered

DALL-E launched the text-to-image revolution and showed that AI could produce genuinely creative imagery. It set the stage for DALL-E 2, Midjourney, and Stable Diffusion, the tools that brought AI art to the mainstream.

Featured in these guides

History guide

History of OpenAI

A curated history of OpenAI from its 2015 founding through GPT, ChatGPT, multimodal systems, and reasoning-era products.

History guide

History of Generative AI

The history of generative AI, from GANs in 2014 through DALL-E, Stable Diffusion, ChatGPT, Midjourney, and Sora's text-to-video.

Frequently asked questions

When did DALL-E: Text to Image Generation happen?

OpenAI announced DALL-E in January 2021.

Who was behind DALL-E: Text to Image Generation?

OpenAI developed and announced DALL-E.

Why was DALL-E: Text to Image Generation important?

DALL-E launched the text-to-image revolution and showed that AI could produce genuinely creative imagery. It set the stage for DALL-E 2, Midjourney, and Stable Diffusion, the tools that brought AI art to the mainstream.

Which era of AI history does DALL-E: Text to Image Generation belong to?

DALL-E: Text to Image Generation belongs to the Transformer Era (2018–2021) and is listed as a major breakthrough in the Research Breakthroughs category.

Related Milestones

2019.02 Research

GPT-2: 'Too Dangerous to Release'

OpenAI announced GPT-2 (1.5 billion parameters) but initially refused to release the full model, calling it 'too dangerous' due to its ability to generate convincing fake text. The decision was controversial — some praised the caution, others called it a publicity stunt. The full model was eventually released in November 2019.

Alec Radford, OpenAI

2020.06 Research

GPT-3: The 175 Billion Parameter Leap

OpenAI released GPT-3 with 175 billion parameters, more than 100x larger than GPT-2. Without any fine-tuning, GPT-3 could write essays, code, and poetry, translate languages, and answer questions through 'few-shot learning' (learning from just a few examples in the prompt). The API launched in beta, enabling thousands of applications.

Tom Brown, OpenAI

2023.03 Research

GPT-4: Multimodal Intelligence

OpenAI released GPT-4, a multimodal model that accepts both text and image inputs. OpenAI reported that it passed a simulated bar exam at around the 90th percentile and scored 1410 on the SAT, and it showed more nuanced reasoning than earlier models. It was a large step up from GPT-3.5 in accuracy, safety, and capability.

OpenAI

2018.06 Research

GPT-1: Generative Pre-training

OpenAI released GPT-1, demonstrating that a Transformer trained on vast amounts of text using unsupervised pre-training could then be fine-tuned for specific NLP tasks. With 117 million parameters, it showed the potential of scaling language models.