UK Biobank, a biomedical database that contains extensive information from half a million volunteers, has announced the release of new data from whole genome sequencing of its participants. This is the world’s largest single set of sequencing data, and it will enable researchers to study the genetic determinants of disease and accelerate the development of new treatments and cures.
UK Biobank is a large-scale biomedical project that started in 2006, with the aim of collecting and storing health and lifestyle data from 500,000 people aged 40-69 in the UK. The participants have agreed to provide blood, urine and saliva samples, undergo physical and cognitive tests, and have their medical records and health outcomes tracked over time. They have also consented to have their data and samples used for approved health research by scientists from around the world.
UK Biobank is a unique and valuable resource for health research, as it allows researchers to investigate the complex interactions between genes, environment and lifestyle that influence the risk and progression of various diseases. It has already enabled many discoveries about the causes and consequences of common conditions such as cancer, diabetes, heart disease, dementia, and mental health disorders.
What is whole genome sequencing and how was it done?
Whole genome sequencing (WGS) is a technique that reads an organism’s entire DNA code, which in humans consists of about 3 billion letters. WGS can reveal variations and mutations in the DNA that may affect how genes and proteins function, and that may influence susceptibility to diseases and response to treatments.
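To make the idea of a "variation" concrete, the toy sketch below compares a short sample sequence with a reference of the same length and reports the positions where the letters differ. The two sequences and the function name `find_snvs` are invented for illustration. Real analyses work from millions of aligned sequencing reads and are far more involved than this.

```python
def find_snvs(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference letter, sample letter) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("this toy comparison assumes equal-length sequences")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


# Hypothetical 12-letter stretch of DNA; a human genome has about 3 billion letters.
reference = "ACGTTGCAAGCT"
sample = "ACGTTGTAAGCA"

for pos, ref_base, alt_base in find_snvs(reference, sample):
    print(f"position {pos}: {ref_base} -> {alt_base}")
```

In this made-up example the sample differs from the reference at positions 7 and 12. Each such single-letter difference is the kind of variation that WGS can detect across the whole genome.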
UK Biobank has undertaken the most ambitious project of its kind to date by sequencing the whole genomes of all 500,000 participants. This was made possible by a public-private partnership involving UK Research and Innovation (UKRI), Wellcome, and four pharmaceutical companies: Amgen, AstraZeneca, GlaxoSmithKline (GSK) and Johnson & Johnson. The sequencing was carried out by deCODE Genetics and the Wellcome Sanger Institute, and took more than five years and over 350,000 hours of genome sequencing to complete.
The first batch of WGS data, covering 200,000 participants, was released in November 2021. The remaining 300,000 genomes were released today, and the complete set is the world’s largest single set of sequencing data available for research.
What are the benefits and challenges of using WGS data for health research?
The WGS data from UK Biobank will provide unprecedented opportunities for researchers to explore the genetic basis of health and disease, and to identify new targets and biomarkers for drug discovery and development. The data will also enable researchers to study rare and complex variants that other methods, such as genotyping or exome sequencing, do not capture. Exome sequencing, for example, reads only the protein-coding regions, which make up roughly 1 to 2% of the genome. Variants outside those regions may have significant effects on gene expression and regulation, and may explain some of the “missing heritability” of common diseases, that is, the inherited risk that known variants do not yet account for.
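One common way to make "rare" precise is the allele frequency: the share of all chromosome copies in a group of people that carry a given variant. The sketch below computes it from per-person genotype counts (0, 1 or 2 copies of the variant) and applies a 1% cut-off. The cut-off, the function names and the counts are assumptions chosen for illustration, not values taken from UK Biobank.

```python
def allele_frequency(genotypes: list[int]) -> float:
    """Fraction of chromosome copies that carry the variant.

    Each genotype is the number of variant copies one person carries (0, 1 or 2).
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    if any(g not in (0, 1, 2) for g in genotypes):
        raise ValueError("genotypes must be 0, 1 or 2")
    # Every person has two copies of each chromosome, hence the factor of 2.
    return sum(genotypes) / (2 * len(genotypes))


def is_rare(genotypes: list[int], threshold: float = 0.01) -> bool:
    """True when the allele frequency is below the assumed threshold."""
    return allele_frequency(genotypes) < threshold


# Hypothetical group of 1,000 people: 3 carry one copy, nobody carries two.
cohort = [1] * 3 + [0] * 997
print(allele_frequency(cohort))
print(is_rare(cohort))
```

With only a handful of carriers per thousand people, a variant like this is hard to study in a small group. That is part of why a set of 500,000 whole genomes is useful for research into rare variants.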
However, the WGS data also pose some challenges and limitations for health research. For example, the interpretation and analysis of WGS data require advanced computational and statistical methods, as well as ethical and legal frameworks to ensure the privacy and security of the participants’ data. Moreover, the WGS data alone are not sufficient to understand the causal mechanisms and pathways of diseases, and need to be integrated with other types of data, such as environmental, lifestyle, biochemical, imaging, and clinical data. UK Biobank has collected and stored such data for its participants, and has made them accessible to researchers through a secure online platform.
How can researchers access and use the WGS data from UK Biobank?
Researchers who wish to access and use the WGS data from UK Biobank need to apply for approval from the UK Biobank Access Committee, and agree to the terms and conditions of the UK Biobank Resource Access Agreement. The WGS data are available through the UK Biobank Research Analysis Platform, which is a cloud-based environment that allows researchers to securely access, analyse and share the data. The platform also provides various tools and resources to facilitate the use of the WGS data, such as quality control metrics, variant annotation, reference panels, and pipelines.
The WGS data from UK Biobank are expected to generate a wealth of new knowledge and discoveries that will advance the field of genomics and precision medicine, and ultimately improve the health and well-being of millions of people.