By Chelsea Whyte
Policing power may be about to get much stronger, thanks to another advance in genetic analysis. A new technique can link the patchy, limited DNA information held in forensic databases to the rich DNA libraries held by family tree-building websites, raising further questions about genetic privacy.
Earlier this year, an ancestry database used by people looking to trace their family history was used to identify the suspected Golden State Killer, a serial killer active in California decades ago. Since his arrest in April, genealogy databases – which allow consumers to upload their DNA sequences – have been used to crack several other cold cases.
These stores of DNA data meant for consumers were needed because forensic databases hold only limited information. Now a new technique could link the two, further expanding police use of DNA data.
Bridging the gap
The US national DNA database used by police and the FBI – called CODIS – doesn’t store whole DNA sequence data. Instead, it focusses on up to 20 specific stretches of repetitive DNA code. These regions vary between individuals, so can help identify people. But consumer genetic databases store different data instead – single-letter variations in DNA across hundreds of thousands of sites in the human genome. With more data points, you can more accurately pin down a person’s relationship to others.
“When police have DNA evidence, usually it’s very minute quantities. Currently, they have this dilemma: should we run a CODIS set on our DNA or use the more sophisticated techniques?” says Yaniv Erlich of genetic ancestry company MyHeritage.
But a new computational model can link people in the CODIS database to records in genealogy databases, says Noah Rosenberg at Stanford University, who led the team that built it. The model relies on the fact that the two types of genetic marker sit in roughly the same places on the genome – the longer stretches in CODIS are surrounded by the single-letter variations used by ancestry databases. When tested, it could identify up to 32 per cent of parent-offspring pairs, and up to 36 per cent of sibling pairs, in a test sample of 872 people.
“This could expand the number of cold cases that are solvable,” says Natalie Ram at the University of Baltimore, Maryland. She says it also brings up questions of how private our genetic information is.
In some states, Ram says, the law allows police to take DNA samples from non-criminals – people who have been detained or arrested but not charged with a crime. So, for example, a protester who is taken into police custody could have their DNA stored and later used to track down their third cousin who is suspected of a crime.
The forensic system was designed to be as minimally informative as possible, so that it can’t reveal information beyond identity, says Rosenberg. But the DNA data in genealogy websites can reveal physical or medical characteristics, so the ability to link between systems means that CODIS can now be used to deduce more detailed information about a person’s genetic makeup.
The number of matches between CODIS and consumer sites is, for now, limited by the fact that they tend to cover quite different populations. There are more samples from minorities in the forensic database, while genealogy sites are mostly used by white people of European descent.
The new technique currently only works with close relatives, but with sufficient DNA evidence, Erlich and his team calculate that more than half the adults in the US can be identified in this way.