
GPT-4: Multimodal Intelligence

At a glance

Date: March 2023
Era: Generative AI Revolution (2022–2024)
Category: Research Breakthroughs
Impact: 5 / 5
Organizations: OpenAI

What Happened

OpenAI released GPT-4, a multimodal model that could accept both text and image inputs. It scored around the 90th percentile on a simulated bar exam and 1410 on the SAT, and it showed markedly more nuanced reasoning than its predecessor. It was a large leap from GPT-3.5 in accuracy, safety, and capability.

Why It Mattered

GPT-4 redefined what AI could do. Its performance on professional exams forced every industry to reckon with AI's capabilities. Microsoft reportedly invested $10B+ in OpenAI, and the AI arms race between companies intensified dramatically.

Frequently asked questions

When did GPT-4: Multimodal Intelligence happen?

GPT-4: Multimodal Intelligence took place in March 2023.

Who was behind GPT-4: Multimodal Intelligence?

OpenAI was the organization behind GPT-4: Multimodal Intelligence.

Why was GPT-4: Multimodal Intelligence important?

GPT-4 redefined what AI could do. Its performance on professional exams forced every industry to reckon with AI's capabilities. Microsoft reportedly invested $10B+ in OpenAI, and the AI arms race between companies intensified dramatically.

Which era of AI history does GPT-4: Multimodal Intelligence belong to?

GPT-4: Multimodal Intelligence is part of the Generative AI Revolution era (2022–2024). It is a landmark, field-defining moment in the Research Breakthroughs category.

Related Milestones

Research

Sora: AI Video Generation

OpenAI previewed Sora, a model that could generate photorealistic videos up to a minute long from text descriptions. The quality stunned the world — realistic physics, complex camera movements, and coherent scenes that looked like professional cinematography.

OpenAI
Research

OpenAI o1: Reasoning Models

OpenAI released o1, a model trained to 'think before it speaks' using chain-of-thought reasoning at inference time. It could solve complex math, coding, and science problems by spending more compute thinking through multi-step solutions — trading speed for accuracy on hard problems.

OpenAI
Research

DALL-E: Text to Image Generation

OpenAI unveiled DALL-E, a model that could generate images from text descriptions — 'an armchair in the shape of an avocado' became iconic. Built on GPT-3's architecture adapted for images, it showed that language models could bridge the gap between text and visual creativity.

OpenAI
Research

GPT-1: Generative Pre-training

OpenAI released GPT-1, demonstrating that a Transformer trained on vast amounts of text using unsupervised pre-training could then be fine-tuned for specific NLP tasks. With 117 million parameters, it showed the potential of scaling language models.

Alec Radford, OpenAI
Research

GPT-2: 'Too Dangerous to Release'

OpenAI announced GPT-2 (1.5 billion parameters) but initially refused to release the full model, calling it 'too dangerous' due to its ability to generate convincing fake text. The decision was controversial — some praised the caution, others called it a publicity stunt. The full model was eventually released in November 2019.

Alec Radford, OpenAI
