Google DeepMind and Google.org, Google’s charitable arm, have announced a five‑year partnership with the UK’s Wellcome Sanger Institute to create high‑quality genomic datasets that will serve as training data for new artificial‑intelligence models. The consortium, announced at the AI x BIO conference in Cambridge, will receive $5 million per year from Google.org and DeepMind, for a total of $25 million over its five‑year lifespan.

The goal of the partnership is to produce curated, open‑access genomic resources that can be used by the broader scientific community. According to a statement from Julia Wilson, chief innovation and impact officer at the Sanger Institute, the consortium aims “to create resources that will be shared widely with the community to enable transformative scientific discoveries and deliver broad impact across the life sciences.”

DeepMind’s recent work in genomics provides a foundation for the collaboration. In January, the company released AlphaGenome, a publicly available model that predicts the function of DNA sequences. Žiga Avsec, a DeepMind researcher and lead author of the AlphaGenome paper, told C&EN that the model “can go beyond expression and predict more‑detailed aspects like DNA accessibility and transcription‑factor binding.”

AlphaGenome is part of DeepMind’s broader effort to make its research tools available to scientists. The model is available through a preview API for non‑commercial research and is built on open‑access datasets. In May, DeepMind introduced Co‑Scientist, a multi‑agent AI platform that scans existing literature and generates hypotheses. Both tools illustrate DeepMind’s commitment to developing AI systems that can accelerate biological discovery.

The Wellcome Sanger Institute has a long history of data sharing and large‑scale sequencing. Founded in 1992 and located on the Wellcome Genome Campus near Cambridge, the institute contributed significantly to the Human Genome Project and now focuses on genetics in health and disease. With roughly 900 staff, its research spans somatic genomics, cellular genomics, human genetics, parasites and microbes, generative genomics, and the tree of life.

DeepMind’s partnership with the Sanger Institute is intended to address a gap in the life‑sciences ecosystem: many areas lack suitably indexed and curated resources for training AI algorithms. By combining the Sanger Institute’s data‑generation capabilities with DeepMind’s AI expertise, the consortium seeks to produce datasets that are both high‑quality and widely available.

The funding arrangement reflects Google.org’s broader AI‑for‑Science initiative. The organization has committed roughly $100 million annually to nonprofit projects that use technology to address global challenges. The AI‑for‑Science program, which launched in 2024, supports research in health, agriculture, biodiversity, and climate resilience.

DeepMind’s history of breakthrough AI research underpins the consortium’s credibility. The company’s AlphaFold protein‑folding model, released in 2020, achieved state‑of‑the‑art accuracy and was subsequently made publicly available. AlphaFold’s success demonstrated the power of large, well‑curated datasets combined with advanced deep‑learning techniques.

The consortium’s five‑year timeline will allow for iterative development of genomic resources and the training of new AI models. While the partnership is currently focused on data creation, the long‑term vision includes enabling AI systems that can interpret complex genetic information, accelerate drug discovery, and improve understanding of disease mechanisms.

At present, the consortium has not announced specific milestones beyond the initial funding and partnership agreement. However, the release of AlphaGenome and the launch of Co‑Scientist suggest that DeepMind is actively applying its AI capabilities to biological research, and the Sanger Institute’s commitment to open data suggests that the consortium’s outputs will be broadly accessible.

In summary, the Google DeepMind–Google.org–Wellcome Sanger Institute partnership represents a coordinated effort to supply the life‑sciences community with high‑quality genomic datasets. By combining DeepMind’s AI expertise with the Sanger Institute’s data‑generation infrastructure and Google.org’s philanthropic support, the consortium aims to accelerate scientific discovery and deliver wide‑ranging benefits across biology and medicine.

The partnership remains in its early stages. Future announcements are expected to outline specific datasets, AI model releases, and potential applications in biomedical research.