The Era of DNA Database Hacks Is Here

A major data breach shows genetic information is vulnerable to attack

Emily Mullin


On the morning of July 19, hackers accessed the online DNA database GEDmatch and temporarily allowed police to search the profiles of more than 1 million users that were previously not accessible to law enforcement. GEDmatch is a genealogy tool that allows users to upload their DNA profiles generated from genetic testing services like 23andMe, Ancestry, and MyHeritage and search for relatives.

It took three hours for GEDmatch to become aware of the breach and pull the site offline completely. Users have to give permission for their profiles to be included in police searches, but the breach overrode privacy settings and made user profiles on the site visible to all other users, including law enforcement officials who use the site.

The breach is likely to erode users’ trust in the database, which has become a valuable law enforcement tool for solving cold cases, like the high-profile Golden State Killer case. It may also be a sign of what’s to come: more attempted hacks of DNA databases, which contain a wealth of extremely personal information, like family relationships, ancestry, and potential health risks.

“Our banks are always being probed, our DNA will probably be probed, too,” says Leah Larkin, PhD, a genealogist and genetic privacy advocate who runs The DNA Geek, a company that helps people locate relatives.

In a July 20 statement on its website, Verogen, the San Diego forensic genetics firm that acquired GEDmatch in December, said there is no evidence that any user data was compromised or downloaded during the breach. But on July 21, genetic testing company MyHeritage reported in a blog post that the hackers appeared to have retrieved user emails from GEDmatch to orchestrate another attack. MyHeritage said the perpetrators set up a fake MyHeritage website and sent users a phishing email urging them to log in to it, presumably so the hackers could steal their passwords.

At least 16 people fell victim to the fake website. MyHeritage says it has attempted to contact the more than 100 users who received the phishing email. It’s not known whether the hackers used those passwords to access users’ MyHeritage accounts.

The motivation for the attacks is unclear. The hackers could have been after passwords, emails, or credit card information, or they could have wanted access to genealogical data or genetic information.

“Any kind of consumer profile with a company could certainly be rich with personal information,” says Rachele Hendricks-Sturrup, health policy counsel at the Washington, D.C.-based Future of Privacy Forum. One reason hackers might target genetic data is to make fake patient profiles and file fraudulent insurance claims, she says.

Sensitive information gleaned from genetic data could also be used for blackmail or corporate espionage if it gets into the wrong hands. A criminal enterprise might want access to find out whether its members have relatives in the database. Even having a second or third cousin in the database makes it possible for cops to link offenders to crimes.

A Verogen spokesperson told OneZero that GEDmatch encodes DNA data uploaded by users, then deletes the raw DNA files. But there’s still plenty of information on GEDmatch that could be of interest to hackers. For instance, you could see people’s family trees and the amount of DNA that users share.

The ability to view familial connections has made GEDmatch a powerful forensic tool. To find the Golden State Killer, police uploaded crime scene DNA from the suspect, which matched several of his distant relatives on the site. From there, investigators worked with genealogist Barbara Rae-Venter to piece together his family tree using publicly available records. Eventually, they were able to identify Joseph James DeAngelo as a suspect. Police made the arrest in April 2018, and last month, DeAngelo pleaded guilty to 13 murders he committed across California in the 1970s and 1980s.

After GEDmatch’s role in the investigation was revealed, the site’s founders changed its terms of service to warn users that law enforcement could use the database to investigate homicides and rapes. Initially, all of its users were automatically opted in to law enforcement searches. But when GEDmatch made an exception for a violent assault case, it came under fire for expanding the database’s use to investigate a wider range of crimes.

As a result, GEDmatch changed its policy so that users would have to proactively opt in to make their profiles available in police searches. Of its nearly 1.5 million current users, about 280,000 have opted in to law enforcement matching, Verogen told OneZero. On its website, Verogen says the database has been used to solve more than 70 cold cases.

But the recent breach could jeopardize the usefulness of the site as a forensic tool if users decide to leave. The more people in the database who opt in to law enforcement matching, the greater the likelihood that police will get a relative match and identify a suspect. Colleen Fitzpatrick, PhD, founder of Identifinders International, a genealogical service that works with law enforcement, says the breach will almost certainly lead some users to remove their DNA profiles from the site in the short term. But she thinks the database will eventually recover.

“The online world we live in has its risks,” she says. “Now that this has happened, the security is going to increase.” If Verogen can assure users that it’s taking steps to address these security issues, Fitzpatrick says, people will continue to use GEDmatch.

For its part, Verogen contacted the FBI and notified European regulators after learning about the breach. The company told OneZero it is contracting with a cybersecurity firm that will provide ongoing analysis and monitoring, recommend security protocols, and alert the company to any potential threats. GEDmatch came back online on July 25.

It’s not the first time that flaws in GEDmatch have been exposed. Last year, geneticists Graham Coop, PhD, and Michael Edge, PhD, warned GEDmatch and consumer genetic testing companies about privacy and security issues that can arise on sites where uploads of DNA data are allowed. They found it may be possible to piece together most of someone’s genome or pick out people with genetic variants associated with specific traits, such as Alzheimer’s disease.

After notifying the companies, they posted their findings online as a preprint, then went on to get the paper peer-reviewed and published in January.

Another group at the University of Washington also warned GEDmatch about a related set of privacy issues in a paper they posted online. The researchers warned that it was possible to determine fine-grained genetic details about other users, as well as to upload fake DNA profiles made to look like real people. Fake profiles could be used to impersonate a relative, defraud someone, or obscure a suspect’s unidentified DNA to make it harder for police to identify that person.

Edge tells OneZero that most of the companies took the privacy concerns seriously. “Although they didn’t necessarily completely eliminate the kinds of privacy risks we pointed to, they did think about them and describe measures they had in place that would mitigate them given the amount of information they wanted users to have access to.”

Notably, GEDmatch didn’t respond to the privacy concerns outlined in the UC Davis paper he wrote with Coop, Edge says. “After the preprints came out, they made some positive changes, but probably changes that would not be sufficient to protect privacy.”

When Verogen acquired GEDmatch in December, the company said it increased security protocols to eliminate the possibility of attacks based on “known vulnerabilities.”

“When we took the helm at GEDmatch late last year, our biggest concern was security threats carried out through the use of uploading fake kits to the site,” a Verogen spokesperson told OneZero. “This type of attack had left GEDmatch exposed in the past, and we worked diligently to eliminate threats of this nature.”

Edge says he doesn’t see any indication that the recent breach had anything to do with the methods outlined in the papers. “Hacks can happen to anybody,” he says.

Fitzpatrick isn’t convinced that the hackers were after genetic data. She thinks they could have just done it for “the joy of creating a scare.”

But Larkin, the genetic privacy advocate, isn’t so sure. She says there are plenty of other websites to hack if that were the case, and she hopes Verogen will ramp up security and roll out an informed consent document that clearly lays out the risks of using GEDmatch.

“I’m sure there will continue to be hacking attempts and it will come down to how secure these databases are,” she says. “The data is valuable if you know how to use it.”


