General Intelligence

Take a Look at How Far Image Generation A.I. Has Come in Just 5 Years

DALL-E can create images based on text descriptions alone

Dave Gershgorn
Published in OneZero
5 min read · Jan 11, 2021

Image source: xia yuan/Getty Images

OneZero’s General Intelligence is a roundup of the most important artificial intelligence and facial recognition news of the week.

OpenAI is earning a reputation for building some of the A.I. industry’s most futuristic prototypes.

The Microsoft-backed research outfit is now led by former Y Combinator president Sam Altman. It’s best known for its powerful text generator, GPT-3, but in the last few years it has also built a robotic hand that taught itself to solve a Rubik’s Cube, a team of superhuman esports algorithms, an algorithm that composes convincingly human music, and algorithms that can play games and use tools to learn complex strategies.

Last week, OpenAI released DALL-E, an A.I. system that can generate images based on written text. For example, in response to the prompt “a leather purse in the shape of an avocado. a leather purse imitating an avocado,” the system can generate dozens of iterations on the idea of a leather avocado purse.

Source: OpenAI

The company hasn’t made DALL-E, whose name blends Salvador Dalí and WALL-E, available to the public or even to the select group of developers it typically invites to trial new software. But examples on its website suggest the system can create extremely realistic and detailed images.

DALL-E is proficient across art styles, including illustration and landscapes. It can also generate text to make signs on buildings, and it can distinguish between sketches and full-color images of the same scene. A.I. researchers refer to this kind of far-reaching capability as generalization, meaning the algorithm isn’t specifically geared to one kind of task or art style.

OpenAI credits the proficiency of the algorithm to two main factors. First, the algorithm is enormous. It uses a mind-boggling 12 billion parameters, which can be thought of as knobs turned by the algorithm to tune how it understands ideas…
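To make the "knobs" idea concrete, here is a toy sketch in Python. It is not OpenAI's code and has nothing to do with how DALL-E is actually built. It assumes a made-up model with just two parameters, `w` and `b`, and a hypothetical four-point dataset, and it uses plain gradient descent to turn those two knobs until the model's guesses match the data.

```python
# Toy illustration only: a model with two parameters ("knobs"), w and b.
# The model, dataset, and training settings are assumptions for this sketch.

def predict(w, b, x):
    """The model's guess for input x, given the current knob settings."""
    return w * x + b

def train(data, steps=1000, lr=0.05):
    """Nudge w and b a little at a time to shrink the mean squared error."""
    w, b = 0.0, 0.0  # both knobs start at zero
    n = len(data)
    for _ in range(steps):
        # How much the error changes if each knob is turned slightly.
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in data) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in data) / n
        # Turn each knob a small step in the direction that reduces the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical data that follows the rule y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train(data)
print(f"learned w={w:.3f}, b={b:.3f}")
```

After training, the two knobs settle close to the values that generated the data. A system on DALL-E's scale has billions of such numbers rather than two, but the underlying intuition of adjusting parameters to reduce error is the same.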

Dave Gershgorn
OneZero

Senior Writer at OneZero covering surveillance, facial recognition, DIY tech, and artificial intelligence. Previously: Qz, PopSci, and NYTimes.