General Intelligence

A.I.’s Most Important Dataset Gets a Privacy Overhaul, a Decade Too Late

The authors of the image dataset that changed the world have made one long-overdue tweak

Dave Gershgorn
Published in OneZero
3 min read · Mar 19, 2021


Photo illustration source: John M. Lund Photography Inc.

ImageNet is arguably the most important dataset in recent A.I. history. It’s a collection of millions of images that were compiled in 2009 to test a simple idea: If a computer vision algorithm had more examples to learn from, would it be more accurate? Were the underperforming algorithms of the day simply starved for data?

To encourage others to test the same hypothesis, the authors of ImageNet started a competition to see who could train the most accurate algorithm using the dataset. By 2012, results from the academic competition had attracted the full attention of tech industry giants, who began to compete and hire winners. It is no exaggeration to say that the results from the ImageNet competition gave rise to the A.I. boom we’re in today.

Now, more than a decade after its debut, ImageNet’s authors have made a tweak to the dataset that changed the world: They’ve blurred all the faces.

“The dataset was created to benchmark object recognition — at a time when it barely worked,” the researchers wrote in a blog post announcing the change…


Dave Gershgorn is a senior writer at OneZero covering surveillance, facial recognition, DIY tech, and artificial intelligence. Previously: Qz, PopSci, and NYTimes.