Genomics Has Revealed An Age Undreamed Of

Tolunay Karavar/The Great Genghis Khan Statue

On June 26, 2000, President Bill Clinton announced the completion of the draft of the human genome at a press conference with the two project leads, Francis Collins and J. Craig Venter. A genome is all the genetic information of an organism. Scientists had conceived of the Human Genome Project in the 1980s, and, in the first half of the 1990s, expected it to be an endeavor that would go on for decades. But an unexpected technological revolution of faster computers and better chemistry accelerated the ten-year effort toward the finish line, just as the 20th century came to a close.

The American-led international effort cost more than $3 billion and involved thousands of people. In the 23 years since, the landscape of genomics has undergone a sea change, from blue-sky basic science to mass-market consumer products. Companies like Nebula now provide entire medical-grade genome sequences for $200, down from a price point of $20,000 just 13 years ago. We’ve gone from a single mapped genome—that of humanity—to more than a million genomes. This is a case where quantity has a quality all its own; the commoditization of genomic sequencing has radically transformed how we do genetics.

Yet at the dawn of this brave new genomic era, it is not health and well-being outcomes that have been revolutionized. Rather, genomics as a window into the human past has vindicated Alfred Tennyson’s poetic assertion that nature is “red in tooth and claw.” Where a few decades ago archaeologists and historians had to cobble together inferences from pottery shards, slotting their data into theories that owed more to political fashions of the present than scientific facts of the past, today they can chart the rise and fall of peoples from the clear evidence of the genes.

Collins and Venter promised a shiny future of good health and a more enlightened understanding of humanity’s place in the world, but their invention has, instead, unleashed knowledge of a bygone age of brutality reminiscent of Conan the Barbarian’s Hyborian Age. Historians can list Genghis Khan’s concubines, but it is genetics that tells us that 10% of Central Asian men are his direct paternal descendants, bringing home the magnitude of his conquests. But obviously, we aren’t fated to relive the brutality of the past; just as technology can open a window back in time, it can unlock the door to a brighter future. The question is what advances we as a species wish to make.

The Book of Nature Has a Billion Pages

A single human genome has two copies of each gene, of which there are some 19,000. These 19,000 genes are distributed across three billion base pairs of adenine, cytosine, guanine, and thymine, or ACGT for short. Notably, the human gene count was pinned down only within the last twenty years, even though genetics as a scientific field is over 150 years old. The reason for this recent explosion in our knowledge is that, before the 1990s, genetics probed a digital process—the recombination of discrete units of heredity from the same and different individuals—with analog means. The correlation of characteristics between parents and offspring is intuitively obvious, but the mechanisms by which inheritance occurs are not self-evident.

Our naïve assumption is that characteristics blend together, resulting in a child who is a synthesis of the traits of the parents, e.g., a short parent and a tall parent will produce medium-height offspring. But the implication of this model is that, over the generations, all human variation should be blended away as each generation becomes the average of the previous one. That simply does not occur. Humans remain as variable as they have been in the past. The insight of Mendelian genetics is that inheritance does not proceed through blending, but through the rearrangement of discrete units of variation.
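To see why blending cannot sustain variation while discrete inheritance can, consider a toy simulation (an illustration of the logic only, not any historical experiment; the population size and generation count are arbitrary):

```python
# Toy contrast of the two inheritance models. Under blending, a child's trait
# is its parents' average; under Mendelian inheritance, the child draws one
# discrete allele from each parent. Blending collapses variation; Mendelian
# sampling preserves it.
import random
from statistics import pvariance

N, GENERATIONS = 1000, 50

# Blending model: a trait is a continuous value, averaged each generation.
blended = [random.gauss(0, 1) for _ in range(N)]
for _ in range(GENERATIONS):
    blended = [(random.choice(blended) + random.choice(blended)) / 2
               for _ in range(N)]

# Mendelian model: a trait is the sum of two discrete alleles (0 or 1);
# each child inherits one randomly chosen allele from each of two parents.
mendelian = [(random.randint(0, 1), random.randint(0, 1)) for _ in range(N)]
for _ in range(GENERATIONS):
    mendelian = [(random.choice(random.choice(mendelian)),
                  random.choice(random.choice(mendelian)))
                 for _ in range(N)]

print("blending variance:  %.4f" % pvariance(blended))  # collapses toward 0
print("mendelian variance: %.4f" %
      pvariance([a + b for a, b in mendelian]))          # stays near 0.5
```

After fifty generations the blended population is essentially uniform, while the Mendelian one retains nearly all of its original variation.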

At about the same time that Charles Darwin was revolutionizing our understanding of the tree of life with his theory of evolutionary change through natural selection, an Austrian monk named Gregor Mendel stumbled upon the framework that would later be called genetics. Between 1856 and 1863, he realized that inheritance seemed to be mediated by particular units he called “factors,” which would later be called genes. Mendel hypothesized that complex organisms carry two copies of each factor, discrete bundles of information that are rearranged every generation through the law of segregation—that you inherit one copy of a gene from each parent—and the law of independent assortment, that you inherit factors independently of each other.

Mendel came to these insights through a famous set of experiments in which he crossed lines of peas with distinct characteristics and noted that some traits bred true and others did not. Two short pea plants always produced short pea plants. But two tall pea plants sometimes also produced short pea plants. A model of blending inheritance cannot explain recessive traits, but a Mendelian framework can. Whereas intuitive blending models of inheritance take visible traits as the only variables of interest in understanding intergenerational change, Mendelian genetics implies that phenotypes emerge from the interactions of underlying genotypes.
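A minimal Monte Carlo sketch of the cross itself (sample size and genotype labels are my own; tall “T” is dominant over short “t”):

```python
# Mendel's pea crosses as a simulation: a plant is short only when it
# carries two recessive "t" alleles.
import random

def fraction_tall(parent1, parent2, n=10_000):
    """Fraction of tall offspring when each parent passes one random allele."""
    tall = 0
    for _ in range(n):
        child = random.choice(parent1) + random.choice(parent2)
        if "T" in child:  # one dominant allele suffices for the tall phenotype
            tall += 1
    return tall / n

print(fraction_tall("tt", "tt"))  # short x short: 0.0 -- breeds true
print(fraction_tall("Tt", "Tt"))  # hybrid tall x hybrid tall: ~0.75 tall, ~0.25 short
```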

These genotypes are the true factors through which variation is preserved from generation to generation; an organism’s visible characteristics are only pointers to the true underlying heritable variation present in the genes. Darwin’s Origin of Species was published in 1859 to great fanfare, but Darwin famously lacked a plausible mechanism of inheritance that could maintain the variation that was necessary for natural selection. Mendel provided the answer, but the Austrian monk’s single 1866 paper, “Experiments on Plant Hybridization,” was ignored by the scientific community of the time, only to be rediscovered around 1900, when the modern field of genetics was born.

But twentieth-century genetics very much worked within Gregor Mendel’s methodological framework. Genes were analytical units, abstractions necessary to explain the patterns of inheritance visible in breeding experiments, but not understood in physical terms. Genetics proceeded through analyses of patterns of inheritance in pedigrees and populations, a laborious matter of inspection and inference. The journey to where we are today, when we can read out the sequence of any organism that we choose, began in the 1940s when biologists realized nucleic acids were the medium through which genetic information was transmitted.

After James Watson and Francis Crick’s elucidation of the structure of DNA in 1953, the molecular biological revolution it ushered in allowed geneticists to conceive of mapping genes in a direct physical manner, rather than inferring them through the transmission of phenotypes within pedigrees. But even as late as 1975, only one hundred genetic positions had been mapped in the human genome across all populations. The first complete biophysical genetic map of an organism, the bacterium Haemophilus influenzae, was published in 1995, with a sequence 1.83 million base pairs in length. By 2020 there were tens of thousands of different species sequenced. The story of the mutation of genetics from a data-poor to a data-rich science is one of exponential technological change; it is very much a synergy between rapid advances in computing and novel innovations in chemistry.

But more interesting than the exponential growth in data are the surprising things we have inferred from the data. In the heady early days of the publication of the draft of the human genome over twenty years ago, co-author Francis Collins asserted that the combination of molecular biology and genomics would “make a significant impact” on our attempt to understand and cure cancer. Despite some early instances where genomic sequencing was performed on cancer patients, like Steve Jobs in 2009, the overall impact of the new science on healthcare has been modest at best. Instead, paleoanthropology, prehistory, and history were transformed as genetics surveyed the pedigrees of the human past with a power and precision that would have been unimaginable a generation ago.

Even though the Swedish geneticist Svante Pääbo published a paper in 1984 on the DNA of mummies, pioneering the field that would become paleogenomics, it is clear that much of his work in the 1980s and early 1990s was simply reporting sample contamination; the DNA detected was that of lab workers or people who had handled the artifacts and specimens. But in 2022, Pääbo was awarded the Nobel Prize in Physiology or Medicine for the transformative work that began in the 2000s. He and his colleagues had learned from the earlier errors and taken to the new genomic technology with gusto.

Demin Barnaul/Denisova cave in Siberia, which was inhabited by Neanderthals, Denisovans and modern humans, and their mixed progeny.

The first modern human genetic map was published one hundred years after the founding of the field, but the first prehistoric human genetic map came just ten years after that, when Pääbo’s group released the draft of the Neanderthal genome in May 2010. The team then unveiled the genome of a new human species, the Denisovans. Named after the Denisova cave in Siberia, where a broken finger bone and a single molar yielded their genome, they are a whole additional branch of humanity, distinct from both but closer to Neanderthals than to modern humans. While Neanderthals are well-known from paleontology and archaeology, Denisovans were novel because they have been identified only from their distinct genetic markers. Genomics was resurrecting the DNA of vanished human species that had been totally unknown to science.

Humanity Was Once Not One Species, But Many

DNA is a robust macromolecule, a feature that is useful in maintaining high information fidelity across generations. The rate at which it degrades can vary significantly depending on several factors, including the surrounding environment, temperature, and the presence of water or enzymes. But DNA is a stable enough molecule that it can easily persist for thousands of years under favorable conditions and, in some rare cases, over one million years.

Svante Pääbo’s scientific standing owes much to the punctilious protocols he pioneered for the analysis of ancient DNA preserved in the teeth and bones of subfossils. The task is not trivial due to contamination, but paleogeneticists have now perfected the techniques well enough that, where a generation ago sequencing a single human genome was a massive achievement, today we have more than 10,000 ancient human genomes.

More interesting to the general public than the biophysics of DNA preservation is the information that we can obtain from DNA. In the 20th century, geneticists used the variation of modern people to make inferences about the past, most famously with the “Out of Africa” theory of the origin of our species. In the 21st century, paleogeneticists don’t just reconstruct the tree of life, they actually expose its nodes, branches, and roots by obtaining ancient DNA. This fine-grained, detailed map of the human past allows for extremely precise conclusions to be drawn about general human demographics and social structure.

Even a single high-quality genome is informative: its three billion base pairs and millions of variants record not just individual identity but the sum totality of a population’s history, since every segment of the genome traces its own genealogy backward in time. The variation within the entire genome sequence of a single human makes it clear that our lineage went through a bottleneck in the last 100,000 years, and that we’ve undergone massive population growth in the last ten thousand years. Similarly, the genome of a single Neanderthal tells us that this lineage, unlike our own, seems to have existed at very low population numbers throughout its history.
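How can one genome speak for a whole population? A toy coalescent sketch makes the logic concrete (my illustration; the mutation rate is an approximate textbook value and the population histories are invented): the two copies of each locus in a diploid genome coalesce at some time in the past, and a population crash forces faster coalescence, which shows up today as reduced heterozygosity.

```python
# Heterozygosity of ONE diploid genome reflects how long ago its two copies
# of each locus shared an ancestor, which depends on past population size.
import random

MU = 1.25e-8  # approximate human per-base, per-generation mutation rate

def pairwise_tmrca(epochs):
    """Generations back until the two copies of a locus coalesce, under a
    piecewise-constant history. epochs: list of (duration, diploid size N),
    with the final duration set to infinity."""
    t = 0.0
    for duration, n in epochs:
        wait = random.expovariate(1.0 / (2 * n))  # coalescence waiting time
        if wait < duration:
            return t + wait
        t += duration
    return t

def heterozygosity(epochs, loci=20_000):
    # Expected per-base differences between the genome's two copies: 2*mu*T.
    return sum(2 * MU * pairwise_tmrca(epochs) for _ in range(loci)) / loci

constant   = [(float("inf"), 10_000)]                              # always N=10k
bottleneck = [(2_000, 10_000), (2_000, 1_000), (float("inf"), 10_000)]  # a crash

print("constant N=10k:  %.2e" % heterozygosity(constant))    # ~5e-4
print("with bottleneck: %.2e" % heterozygosity(bottleneck))  # noticeably lower
```

Methods like PSMC invert this logic, reading population size history back out of the spacing of heterozygous sites along a single genome.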

Today, we know that Neanderthals varied in hair color, just like modern humans, and that their genome was 99% identical to that of our own lineage. They were physically more robust, which is clear from their fossils, and their genes for bone and muscle development differ from ours. And, to the surprise of geneticists thirteen years ago, it also became clear that most modern humans have some Neanderthal ancestry. Outside of Africa, the figure is between 2% and 2.5%, and it is present from Australia to Ireland. The implication is that, early in the modern human expansion out of Africa, our ancestors mixed with a group of Middle Eastern Neanderthals.
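The admixture signal itself was detected with a simple site-pattern count, the D-statistic or “ABBA-BABA” test. Here is a toy version with a made-up ten-site alignment (real analyses use millions of sites; alleles are coded 0 for ancestral, as defined by a chimpanzee outgroup, and 1 for derived):

```python
# Without gene flow, ABBA and BABA site patterns should be equally common.
# An excess of ABBA means the non-African shares more derived alleles with
# the Neanderthal than the African does -- evidence of admixture.
def d_statistic(p1, p2, p3):
    abba = baba = 0
    for a, b, c in zip(p1, p2, p3):
        if (a, b, c) == (0, 1, 1):    # ABBA: non-African matches Neanderthal
            abba += 1
        elif (a, b, c) == (1, 0, 1):  # BABA: African matches Neanderthal
            baba += 1
    return (abba - baba) / (abba + baba)

# Hypothetical mini-alignment, for illustration only.
african     = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]
non_african = [0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
neanderthal = [0, 0, 1, 1, 1, 1, 0, 0, 1, 0]

print("D = %.2f" % d_statistic(african, non_african, neanderthal))
# -> 0.50 in this toy; genome-wide values are small but significantly positive
```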

This shocking result came out only through ancient DNA. Not only do modern humans have Neanderthal ancestry, but some of us have Denisovan ancestry as well: it comprises 5% of the heritage of Papuans, and lower fractions are found throughout Asia. There is an open question in anthropology as to whether humans are naturally promiscuous. The data from DNA shows that our forebears were sexually open to liaisons with populations and people quite different from them, and it tilts the debate decisively in one direction.

A genomic clock dates the separation of Neanderthals and modern humans to 600,000 years ago. The deepest split within modern populations, between the South African Khoisan and all other humans, clocks in at 200,000 years. Our ancestors’ sexual preferences were evidently very broad. In a cave in Russia, researchers have even discovered a young girl whose mother was a Neanderthal and whose father was a Denisovan. Statistically, the probability of catching a first-generation hybrid is low; the fact that one was found at all suggests such mixing was common.
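The clock arithmetic is simple in outline (a back-of-the-envelope sketch; the mutation rate is an approximate assumption and the divergence inputs are illustrative, not measured values):

```python
# Mutations accrue along both branches after a split, so split time is
# roughly per-base divergence divided by twice the per-year mutation rate.
MU_PER_YEAR = 0.5e-9  # ~1.25e-8 per generation over ~25-year generations

def split_time_years(divergence_per_base):
    return divergence_per_base / (2 * MU_PER_YEAR)

print(split_time_years(6.0e-4))  # ~600,000 years: Neanderthal-scale divergence
print(split_time_years(2.0e-4))  # ~200,000 years: Khoisan vs. other humans
```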

But did the Denisovans and Neanderthals disappear into the loving arms of humans migrating out of Africa, or was there capture and coercion involved? The data to answer this question does not exist for these early encounters, but is more copious for later prehistoric epochs. Premodern humans clearly made love, but they also made war. Earlier generations of paleoanthropologists believed that modern populations descend from groups that diverged and settled down 50,000 years ago and turned into what we now know as regional populations: Europeans, Africans, East Asians, and South Asians. The ancestors of modern Europeans were thought to be the Cro-Magnons that arrived 45,000 years ago and replaced the Neanderthals.

Ancient DNA and genomics make it clear that this idea is wrong. The Cro-Magnons left no descendants; they’re totally unrelated to other modern populations. Rather, about 38,000 years ago, they were replaced by another group of humans from the southeast who eventually gave rise to the Gravettian culture, notable for producing small ivory sculptures. Then, 20,000 years ago, another culture arrived that gave rise to the cave-painting Magdalenians, replacing the earlier populations. Then, 15,000 years ago, more waves of foragers migrated out from the Middle East, again mostly replacing the earlier populations.

Farmers arrived 8000 years ago with the Neolithic Revolution, only to be replaced 4500 years ago by Indo-Europeans from the steppe. The story of Europe is the story of the world; humans are always on the move, and one group replaces another, over and over. Unlike many other organisms, humans are not sessile and geographically concentrated. Our demographic equilibrium is inevitably shocked by radical transformations. For modern humans, all is change.

The Genghis Khan Effect and Population Replacement

How did humans replace other humans? The evidence from ancient DNA in Europe sheds light on the structure of these early societies. The great longhouses of Neolithic Germany were ruled by groups of men who were related, sharing the same Y-chromosomal lineage. The women, in contrast, were strangers, judging by their diverse maternal lineages. When Indo-Europeans arrived 4500 years ago, the ancient DNA record shows a total turnover of Y chromosomes, with the older lineages replaced by new ones. In contrast, the maternal lineages show continuity. In genetic parlance, this is “male-mediated migration,” but it was probably quite like the Spanish conquest of the New World, with young males taking new lands and women by force.

The Roman recollection of the rape of the Sabine women likely reflects cultural memory of events in prehistory, where victorious males obtained mates from the lands they conquered after killing the fathers and brothers of the women they would make wives. Prehistoric human males behaved like lions taking over a pride, killing everyone among the conquered except for nubile females. Genetics shows that since the end of the last Ice Age, paternal lineages are characterized by periodic explosions, where one clan seems to have replaced all the others through a process of competition and polygyny.

Call it the “Genghis Khan effect,” but the Mongolian world emperor was simply the last in a long line of “super-males” who have defined much of the last 12,000 years. They say you should believe people when they tell you who they are, and the legends of the Indo-Europeans reflect a patriarchal and warlike culture, destroyers of cities like the god Indra and near-immortal warriors like Achilles; this is exactly what genetics tells us about them. In prehistoric Sweden, the Neolithic megalith builders who dominated the region for more than 1000 years seem to have been totally exterminated by the invading “Battle-Axe” culture. Agriculture was a new technology that allowed for the expansion of human societies and the emergence of social stratification, but genetics makes it clear that, combined with our innate instincts, it fed a drive to extermination that manifested itself in most places and most times.

We cannot avoid what human nature was for tens of thousands of years in the past. It was bloody, it was brutal, and it was typified by genocide: not a war of all against all, but a war where winners took all. This is the legacy we inherit, but it is not the legacy we need to replicate. Average life expectancy in the past was also much shorter than in the present, yet the application of technology and social institutions has ameliorated the toll that disease takes on the human body. Human societies are also organisms, and their rise and fall are measured in waves of change in the genes of our own species; to the victors went the spoils and the seeds of the future. But institutions like monogamy and a modicum of wealth redistribution can be thought of as social technologies that dampen the volatility inherent in human relationships, a volatility that can manifest in chaos and warfare.

The Hero Prometheus Unchains the God Asclepius

If the application of genomics to understanding our past is the present, then the application of genomics to medicine will be our future. The original value proposition of the Human Genome Project was better health, and overall genomics has not lived up to the early hype. The reason turns out partly to be that the genetic architecture of disease was not what we had assumed twenty years ago. Instead of a few mutations with an outsized impact, conditions like type 2 diabetes, schizophrenia, and cancer are shaped by variants in numerous genes, each contributing a small amount of risk.

A single entire genome—or a few hundred—was only the start. These were necessary preconditions for understanding disease, but not sufficient. Today, we know that most diseases are complex, or polygenic, reflecting the action of hundreds or thousands of genes, and many of the causal mutations are quite rare. Human variation is substantial enough that generalizations across individuals have limited utility until one reaches the very large sample sizes that are only feasible today. The value of genomics is thus directly proportional to the number of genomes sequenced. This means that the true promise of genomic health will only come to the fore in the next decade, as we scale sample sizes to new heights.
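In practice, polygenic risk is summarized as a weighted sum over risk alleles, which is why sample size matters: the per-variant weights are tiny and can only be estimated from very large cohorts. A minimal sketch, with invented variant IDs and effect sizes:

```python
# A polygenic score is just a dot product of genotypes and effect sizes.
effect_sizes = {       # log-odds per copy of the risk allele (made up)
    "rs0001": 0.04,
    "rs0002": -0.02,
    "rs0003": 0.07,
    "rs0004": 0.01,
}

def polygenic_score(genotypes):
    """genotypes: variant id -> risk-allele copies carried (0, 1, or 2)."""
    return sum(effect_sizes[v] * copies
               for v, copies in genotypes.items() if v in effect_sizes)

person = {"rs0001": 2, "rs0002": 1, "rs0003": 0, "rs0004": 2}
print("polygenic score: %.3f" % polygenic_score(person))  # 0.08 - 0.02 + 0.02
```

Real scores sum over hundreds of thousands of variants, and their predictive power grows with the size of the cohorts used to estimate the weights.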

The lowest-hanging fruit for genetic health is single-gene Mendelian disorders. In the 1990s, detecting the mutations for diseases like cystic fibrosis was worthy of coverage in the pages of The New York Times. Today, most of these diseases have been characterized in terms of their mutations, and diagnostic services are available in developed countries. The power of genomics to find unknown unknowns is already illustrated most powerfully in the pediatric domain. Newborns who fail to flourish are now routinely sequenced, and a disorder can currently be identified immediately in 40% of cases.

When technology becomes ubiquitous, it becomes not worth mentioning; already millions of women take for granted noninvasive prenatal tests that detect whether a fetus has Down’s syndrome or other chromosomal abnormalities. These tests have allowed for the phasing out of amniocentesis after 2015, and in the future their power will increase. They operate by extracting the genetic signal of the fetus from a blood sample drawn from the mother, using complex computational algorithms to distinguish the faint signal of fetal DNA present in the pregnant woman’s blood.
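Schematically, the trisomy 21 test reduces to comparing the fraction of sequencing reads mapping to chromosome 21 against unaffected pregnancies. A simplified sketch with fabricated reference values (real pipelines additionally correct for GC content, fetal fraction, and much else):

```python
# A fetal trisomy 21 slightly inflates the share of cell-free DNA reads
# from chromosome 21; a z-score against unaffected samples flags it.
from statistics import mean, stdev

# chr21 read fractions in unaffected reference samples (made-up numbers).
reference = [0.0131, 0.0129, 0.0132, 0.0130, 0.0128, 0.0131, 0.0130]

def chr21_zscore(sample_fraction):
    mu, sigma = mean(reference), stdev(reference)
    return (sample_fraction - mu) / sigma

print(chr21_zscore(0.0130))  # near 0: consistent with two copies
print(chr21_zscore(0.0138))  # z > 3: flag for confirmatory testing
```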

Next up after the detection of chromosomal abnormalities will be Mendelian diseases like Tay-Sachs disease, cystic fibrosis, and sickle cell disease. But in the next twenty years, these prenatal methods will expand to highly heritable complex diseases: autism is 85% heritable, Alzheimer’s risk is 70% heritable, schizophrenia is over 75% heritable, and bipolar disorder is 70% heritable. The implication here is clear: we can eliminate many of these diseases through genetic screening of embryos or selective termination. The problem may be that some of these characteristics, like schizophrenia, are also associated with creativity. If we improve our gene pool, are we throwing the baby out with the bathwater?

The age of ubiquitous reading of the genome is around the corner. The technology is orders of magnitude cheaper and more efficient than even a decade ago. By the 2030s, we’ll have near-zero-cost sequencing, and it won’t require a massive machine. Instead, imagine portable sequencing kits you can carry with you, paired with heavy-duty computation on your desktop or in your pocket. The spread of computation into everyday life will include genomics. One of the most essential monitors of your own health will likely be the toilet, which will sample your cells and your microbiome several times a day.

Rather than unnatural and intrusive medical tests, monitoring will embed itself in our lives, and genomics will be a large part of it, because your biology is not static: somatic mutations accumulate and gene expression shifts over a lifetime. One reason pancreatic cancer is so fatal is that it is usually detected only at a late stage. Imagine a scenario where your toilet is constantly sampling the cells that you shed and is able to flag mutations and abnormalities as soon as they rise above the threshold of detection.
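To make the speculation concrete, a hypothetical monitor of this kind might compute something like the following, flagging a known driver mutation once its allele fraction clears the sequencing noise floor (all names and numbers here are assumptions for illustration):

```python
# Track the variant allele fraction (VAF) of a known cancer-driver mutation
# across daily samples and alert when it rises above background error.
NOISE_FLOOR = 0.002  # assumed per-base sequencing-error rate

def flag_variant(mutant_reads, total_reads, threshold=5 * NOISE_FLOOR):
    """True if the variant allele fraction exceeds the alert threshold."""
    return total_reads > 0 and mutant_reads / total_reads > threshold

daily_samples = [(2, 5000), (3, 4800), (9, 5100), (61, 5200)]  # hypothetical
for day, (mut, total) in enumerate(daily_samples, 1):
    print(f"day {day}: VAF={mut/total:.4f} alert={flag_variant(mut, total)}")
```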

But the age of reading will soon be joined by the age of editing. Genetics is a science of retrospective analysis, but it is also a science of application, colloquially known as “genetic engineering.” The tools and techniques of genetic engineering have been available since the 1970s, but in 2012 CRISPR technology allowed for a quantum leap in ease of use, as well as accuracy and precision. If previous recombinant DNA technologies were a buzzsaw, CRISPR editing is like a paring knife. A decade on, CRISPR has been applied to only a few conditions, like curing sickle cell disease.

As with any technology, it is still imperfect and not suitable for the treatment of most diseases due to the risk of the introduction of novel mutations that might cause diseases like cancer. But the rate of change is such that it is clear that genetic engineering will be safe and affordable within a generation, and that is when the true implications of the genetic revolution will confront us. Instead of simply reading the book of life, we will have the ability to write it. The question will not be about our ability to do something, but our willingness. What had previously been technical questions will become normative ones.

We Will Either Evolve or Go Extinct

The twentieth century saw humanity’s grasping of Promethean powers in the realm of physics. Nuclear weapons have the potential either to power or to destroy our civilization. But unlike the rise of agriculture, this new technology did not in fact unleash a bloodbath—so far at least. The emergence of international institutions and the aversion to using nuclear weapons by international elites ushered in an era of relative peace, at least compared to the wars of the first half of the twentieth century. Genetics has made it clear that the rise of civilization was soaked in blood, but nuclear technology is a case in point that awesome powers do not always entail doom.

The 21st century will see developments in biology as momentous as nuclear weapons were in physics. Since the emergence of this scientific field out of anatomy and natural history in the 19th century, the focus has been on understanding and reflection rather than direct application. There are exceptions, like the multi-decade gain in lifespan in the mid-20th century for children with type 1 diabetes, as well as the spread of vaccination and antibiotics. Vaccination and antibiotics co-opt nature, but the production of insulin was one of the first successes of recombinant DNA technology.

Insulin is produced by bacteria that have been genetically engineered with human biochemical pathways. Within one generation, it seems likely that instead of using genetic engineering to create biological products for humans, the tools will be powerful enough that we can recreate and re-fashion ourselves at the level of our genetic blueprint. The question is not whether we can do it, the question is whether we will or even should. Will humanity remake itself in whatever image it chooses? Or will it bow down before the mandate of nature’s law?

The human propensity for competition and domination was a toxic cocktail when mixed with the new technologies of agriculture; foraging tribes scaled up simply expanded the horizon of brutality. It is almost inevitable that the age of eugenics will arrive in the near future. In the science fiction treatment of Star Trek, genetic modifications unleashed a sect of supermen upon the world, triggering the “Eugenics Wars.” This is one vision of the future, where genetic enhancement unlocks our capacity for brutality, taking it to the next level, just as agriculture seems to have.

But there is another possibility: that we can and should modify our personalities, our natures, to make us somewhat less the “Pleistocene mind” trapped in a technological world. Genomic technologies have made it crystal clear how our forager mindsets can take society down dark turns in the face of new technological opportunities; we now have the technology to remake humanity by tinkering with our biomolecular machinery, making ourselves better fitted to the glittering world we inhabit.

Razib Khan is a co-founder of GenRAIT, a deep tech company that deploys a platform for the life sciences. He also writes at Razib Khan’s Unsupervised Learning. You can follow him at @razibkhan.