“We’re only just starting to know the complete majesty of life on Earth,” wrote the founding members of the Earth BioGenome Project in 2018. The ambitious project raised eyebrows when it was first announced. It seeks to genetically profile over 1,000,000 plants, animals, and fungi. Documenting these genomes is the first step toward building an atlas of complex life on Earth.
Many living species remain mysterious to science. A database resulting from the project would be a valuable resource for monitoring biodiversity. It could also illuminate the genetic “dark matter” of complex life to inspire new biomaterials and medicines, or spark ideas for synthetic biology. Other insights could tailor agricultural practices to ramp up food production and feed a growing global population.
In other words, digging into living creatures’ genetic data is set to unveil “unimaginable biological secrets,” wrote the team.
The issue? A hefty price tag. With an estimated cost of $4.7 billion, even the founders of the project called it a moonshot. Nevertheless, against all odds, the project has made progress, with 3,000 genomes already sequenced and 10,000 more species expected by 2026.
Although it is behind schedule on its original goal of sequencing roughly 1.7 million genomes in a decade, the project still hopes to hit that target by 2032. That is later than the original goalpost, but with a much lower price tag thanks to more efficient DNA sequencing technologies.
Meanwhile, the international team has also built infrastructure to share gene sequencing data, and machine learning methods are helping the consortium analyze hundreds of datasets, characterize new species, and monitor DNA data for endangered ones.
Expanding the Scope
Genetic material is everywhere. It’s an abundant resource for making sense of life on Earth. As genetic sequencing becomes faster, cheaper, and more reliable, recent studies have begun digging into the information carried by DNA from species across the globe.
One method, dubbed metagenomics, captures and analyzes microbial DNA gathered in a variety of environments, from city sewers to boiling hot springs. The method captures and analyzes all DNA from a specific source to paint a broad genetic picture of the bacteria in a given environment. Rather than bacteria, the Earth BioGenome Project, or EBP, is aiming to sequence the genomes of individual eukaryotic creatures: basically, those that keep most of their DNA in a nut-like structure, or nucleus, inside each cell.
Humans, plants, fungi, and other animals all fall into this group. By one estimate, there are roughly 10 to 15 million eukaryotic species on our planet. But only a little over two million have been documented.
Sequencing DNA from eukaryotic cells could vastly expand our knowledge of Earth’s genetic diversity. Such a database could be a treasure trove for synthetic biology. Scientists have already tinkered with the genetic blueprints of life in bacteria and yeast cells. Deciphering, and then reprogramming, their genes has led to advances such as coaxing bacterial cells to pump out biofuels, degradable materials, and medicines such as insulin.
Charting eukaryotes’ genomes could further inspire new materials or medicines. For instance, cytarabine, a chemotherapy drug, was initially isolated from a sponge-like sea creature and approved by the FDA to treat blood cancers that spread to the brain. Other plant-derived medications are already being used to tackle viral infections or to manage pain. From nearly 400,000 different plant species, hundreds of medicines have already been approved and are on the market. Similarly, deciphering plant genetics has galvanized ideas for new biodegradable materials and biofuels.
Genetic sequences from complex organisms can “provide the raw materials for genome engineering and synthetic biology to produce valuable bioproducts at industrial scale,” wrote the team.
Medical and industrial uses aside, the effort also documents biodiversity. Creating a digital DNA library of all known eukaryotic life can pinpoint which species are most at risk, including species not yet fully characterized, and provide data for earlier intervention.
“For the first time in history, it is possible to efficiently sequence the genomes of all known species and to use genomics to help discover the remaining 80 to 90 percent of species that are currently hidden from science,” wrote the team.
Soldiering On
The project has three phases.
Phase one lays the groundwork. It establishes the species to be sequenced, builds digital infrastructure for data sharing, and develops an analysis toolkit. The most important goal is to construct a reference DNA sequence for species similar in genetic makeup, that is, those in a “family.”
Reference genomes are incredibly important for genetic studies. True to their name, they serve as a baseline when scientists compare genetic variants, for instance, to track down genes related to inherited diseases in humans or to sugar content in different varieties of crops.
Phase two of the project will begin analyzing the sequencing data and form strategies to maintain biodiversity. The last phase integrates all previous work to potentially revise how different species fit into our evolutionary tree. Scientists will also integrate climate data in this phase to tease out the impacts of climate change on biodiversity.
The international project began in 2018 and included the US, UK, Denmark, and China, with most DNA specimens sequenced at facilities in China and the UK. Today, 28 countries spanning six continents have signed on. Most DNA material isolated from individual species is sequenced directly on site, reducing the cost of transportation while increasing fidelity.
Not all participants have easy access to DNA sequencing facilities. One institution, Wellcome Sanger, developed a portable DNA sequencing lab that could help scientists working in rural areas capture the genetic blueprints of exotic plants and animals. The device sequenced the DNA of a type of sunflower with potential medicinal properties in Africa, among other specimens from exotic locations.
EBP follows in the footsteps of other global projects aiming to sequence the Earth’s microbes, such as the National Microbiome Initiative or the Earth Microbiome Project. Once also considered moonshots, these have secured funding from government agencies and private investments.
Despite the enthusiasm of its participants, EBP is still billions of dollars short of what it needs to reach full completion. But the project’s price tag, originally estimated in the billions of dollars, may turn out to be far lower.
Thanks to more efficient and cheaper genetic sequencing methods, the current cost of phase one is expected to be half the original estimate, at around $265 million.
It’s still a hefty sum, but for participants, the resulting database and methods are worth it. “We now have a common forum to learn together about how to produce genomes with the very best possible quality,” Alexandre Aleixo at the Vale Institute of Technology, who participated in the project, told Science.
Given the impact bacterial genetics has already had on biomedicine and biofuels, it’s likely that deciphering eukaryote DNA could spur further inspiration. In the end, the project relies on a worldwide collaboration to benefit humanity.
“The far-reaching potential benefits of creating an open digital repository of genomic information for life on Earth can be realized only by a coordinated international effort,” wrote the team.