Precision medicine promises to tailor the diagnosis and treatment of disease to your unique genetic makeup. A physician may use the presence of certain genetic markers to diagnose a disease, or to choose one drug over another for treatment.
The studies linking genetic markers to disease, however, focus largely on white European populations and neglect other races and ethnic groups, according to an analysis published in the journal Cell on Thursday. The researchers argue that the lack of diversity in genomic studies harms our scientific understanding of the genetic basis of disease across all populations and worsens health inequalities.
The analysis reports that 78 percent of all individuals enrolled in genomic studies by 2018 were of European descent, 10 percent Asian, 2 percent African, 1 percent Hispanic, and less than 1 percent for all other groups.
"This is just incredible," says Sarah Tishkoff, an evolutionary geneticist at the University of Pennsylvania's Perelman School of Medicine and an author of the analysis. "It really limits our understanding."
Ignoring genomic diversity can mean missing information that could benefit everyone. For example, the authors of the study point to PCSK9, a gene important for the regulation of cholesterol. Studying mutations found in West African populations provided additional insight into the underlying biology and led to a new class of drugs that benefits people of all races.
"We have just decided to learn all sorts of things about the genome and what it does," says Alice Popejoy, a postdoctoral researcher at Stanford University who was not involved in the analysis.
The genetics of disease ranges from relatively simple to mysteriously complex. At one extreme are Mendelian diseases, where a single gene variant essentially guarantees that you will have the disease, regardless of your genetic background. Think of Huntington's disease or muscular dystrophy.
At the other extreme are polygenic diseases, in which many different genes seem to be involved, along with environmental factors. Think of hypertension or coronary artery disease. The lack of diversity in data sets can be particularly problematic for researchers studying polygenic diseases.
Polygenic diseases far outnumber Mendelian diseases and are therefore a major research focus. But trying to identify the genes involved in a polygenic disease is like looking for an unknown number of needles in a huge haystack.
Imagine our genome as a long line of about 3 billion base pairs, the letters that make up our genetic code. A researcher can use genetic markers that are present in most people to orient themselves. These markers appear at regular intervals along the entire line of letters.
Researchers can then conduct a so-called genome-wide association study, or GWAS, in which they sequence these genetic markers in thousands of people, some of whom have a specific disease. To home in on disease-causing genes, they look for markers that appear again and again in people with the disease. If a marker is strongly associated with the presence of the disease, the researchers conclude that a disease gene must be nearby.
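To make the marker comparison concrete, here is a minimal Python sketch of the kind of test a GWAS repeats for every marker: count how often the marker shows up in people with and without the disease, then ask whether the difference is larger than chance would explain. The marker names and counts are invented for illustration, and the simple chi-square test is an assumption standing in for the more elaborate statistical models real studies use.

```python
import math


def chi_square_2x2(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Pearson chi-square statistic for a 2x2 table of marker status by disease status."""
    a, b, c, d = case_carriers, case_noncarriers, control_carriers, control_noncarriers
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return 0.0  # an empty row or column carries no information
    return n * (a * d - b * c) ** 2 / margins


def p_value_1df(chi2):
    """Upper-tail p-value for a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def rank_markers(counts_by_marker):
    """Test every marker and return (marker, chi2, p) sorted by strongest association."""
    results = []
    for marker, counts in counts_by_marker.items():
        chi2 = chi_square_2x2(*counts)
        results.append((marker, chi2, p_value_1df(chi2)))
    return sorted(results, key=lambda r: r[2])


# Hypothetical counts: (cases with marker, cases without, controls with, controls without)
toy_counts = {
    "marker_A": (300, 700, 200, 800),  # more common in people with the disease
    "marker_B": (250, 750, 250, 750),  # equally common in both groups
}

ranking = rank_markers(toy_counts)
for marker, chi2, p in ranking:
    print(f"{marker}: chi2={chi2:.2f}, p={p:.2e}")
```

In this toy data, only the first marker stands out. Because a real study tests a very large number of markers at once, it also has to apply a far stricter cutoff than a single test would need.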
This inference is possible because closely spaced letters tend to be linked together and passed down as a block through the generations. The blocks may vary in size, but in general, geneticists assume that when a marker is associated with a disease, the disease-causing gene lies in the same block.
However, the authors of the analysis argue that this inference can break down, for several reasons, when markers are compared across ethnic groups. First, the genes themselves may have changed in different populations, either by selection or by chance.
For example, Tishkoff cites a gene that is strongly associated with non-diabetic kidney disease. This condition is rare in Europeans but more common in West Africans. Researchers identified two mutations in the gene that appear to be associated with the disease, and further research has suggested that these mutations are more prevalent in West African populations because they offer some protection against sleeping sickness. Tishkoff says that if we had looked only at European variation, we would have missed this example of how disease-causing genes can also be beneficial in some environments.
Beyond changes in the genes themselves, the genetic markers that act as guides can also be shuffled and rearranged in different populations, according to the authors. In fact, basic evolutionary theory predicts as much.
Homo sapiens originated in Africa about 200,000 to 300,000 years ago and left the continent much later in small bursts of migration. Our genomes reflect this history, with Africans harboring far more genetic diversity than any other human population.
According to Tishkoff, populations with greater genetic diversity tend to have smaller blocks of the genome that are linked together. However, this block pattern can change during a migration event.
Think of Africa's gene pool as a real pool filled with marbles of every color. "You pick up a handful of marbles and you get a very small selection of this variation," says Tishkoff. Each time a small group of people left Africa, they carried only a small fraction of that diversity, and the populations that emerged from these migration events tend to have larger linked chunks of the genome.
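The marble picture can be played out in a few lines of Python. This is only a toy sketch: the pool size, the number of colors, the size of each handful and the number of migrations are all made-up values, and real populations also gain new variation over time, which the sketch ignores.

```python
import random


def count_colors(marbles):
    """Number of distinct colors (variants) present in a population."""
    return len(set(marbles))


def take_handful(pool, handful_size, rng):
    """A small founding group: a random handful drawn from the pool."""
    return rng.sample(pool, handful_size)


def regrow(founders, copies_each):
    """The founders' descendants: many marbles, but only the founders' colors."""
    return [marble for marble in founders for _ in range(copies_each)]


rng = random.Random(42)  # fixed seed so the toy run is repeatable

# Hypothetical source pool: 50 colors, 20 marbles of each.
source_pool = [color for color in range(50) for _ in range(20)]

diversity = [count_colors(source_pool)]
population = source_pool
for _ in range(3):  # three successive migrations, each from the previous group
    founders = take_handful(population, 10, rng)
    population = regrow(founders, 100)
    diversity.append(count_colors(population))

print(diversity)  # colors present in the source pool, then after each migration
```

However the draws fall, a handful of 10 marbles cannot carry more than 10 of the 50 colors, and each later group can only keep or lose colors, never regain them.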
These different linkage patterns can cause problems when comparing populations, because the markers associated with a disease-causing gene in European populations can occur in a completely different part of the genome in African or Hispanic populations, Tishkoff says. A marker that accurately flags a gene conferring increased risk for heart disease in Europeans could be genomically far removed from the same gene in other populations, rendering the marker meaningless.
Tishkoff emphasizes that ignoring genomic diversity means that genetically based health care is in some cases worse for populations of non-European descent. Polygenic risk scores for diseases, which are calibrated using GWAS and can be used to inform treatment, may be less accurate when applied to other populations, leading to false positives or underestimates of the risk of certain diseases.
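A polygenic risk estimate of this kind is, at its simplest, a weighted sum: each marker gets a weight from a GWAS, and a person's score adds up the weights of the risk markers they carry. The Python sketch below illustrates that arithmetic only. The marker names and weights are hypothetical, and treating a missing marker as zero is a simplifying assumption.

```python
def polygenic_score(effect_sizes, risk_allele_counts):
    """Weighted sum of risk-allele counts (0, 1 or 2 copies per marker).

    effect_sizes: per-marker weights, assumed to come from a GWAS.
    risk_allele_counts: one person's copies of each risk allele.
    Markers missing from the person's data are counted as 0 (a simplification).
    """
    score = 0.0
    for marker, weight in effect_sizes.items():
        score += weight * risk_allele_counts.get(marker, 0)
    return score


# Hypothetical weights, as if estimated in one study population.
toy_weights = {"marker_1": 0.2, "marker_2": 0.5, "marker_3": -0.1}

# One hypothetical person's genotype at those markers.
person = {"marker_1": 2, "marker_2": 1, "marker_3": 0}

score = polygenic_score(toy_weights, person)
print(f"polygenic score: {score:.2f}")
```

The weights are only as good as the link between each marker and the gene it stands in for. If that link differs in another population, the same arithmetic can produce a misleading score.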
"There are many reasons for health inequalities, obviously the biggest actor is probably just the unequal access to health care," says Tishkoff. "But if we want all people to benefit from genome research in humanity to the maximum, we need to include them in the studies."
Popejoy agrees, though she emphasizes that genetics is only a small part of the problem of health inequalities. People "should not get the impression that health differences are due to differences in genetic structure between ethnic groups," she says. "Environmental issues and widespread systemic and structural racism that increase environmental impact are more important."
However, both Popejoy and Tishkoff say that much more could be done to increase diversity in genome studies. "We need changes from top to bottom as well as bottom to top," says Popejoy.
"Funding agencies must fund the study of ethnically diverse populations," says Tishkoff. "We already see the needle moving, with initiatives like NIH's All of Us." This research initiative aims to collect genomic data from diverse populations while striving to return results to its participants.
Given the history of unethical medical research in minority communities, Popejoy says that researchers need to engage closely with the people affected by a research agenda. "Researchers need to recognize the value, scientifically and ethically, of studying different populations, but they must also demonstrate that value to the people they study," says Popejoy.
Jonathan Lambert is an intern at NPR's science desk. You can follow him on Twitter: @evolambert