A Research Blog

Last time, we looked at Zika and microcephaly. As we continue our series on the top research stories of 2016, we asked Prof Stuart Cook, Director of the Cardiovascular and Metabolic Disorders (CVMD) Programme at Duke-NUS, what he thought was the biggest research story of 2016 to impact CVMD research. His pick: the Exome Aggregation Consortium (ExAC). In today’s post, we find out more about ExAC and why it is such a big deal.

What ExACtly is the exome?

Our genome stores all the information necessary for life; it is like the body’s instruction manual. Each cell refers to this manual to determine which genes to express as proteins, thereby dictating the cell’s behaviour within a tissue, organ and system. The portions of the genome that directly code for these proteins make up the exome.

Variation within the exome arises from the accumulation of mutations in the genome. Some of these variants have no effect on the proteins they code for, while others render the protein useless and contribute to the development of disease. The question now is: which variants contribute to disease, and which are noise?

What ExACtly is ExAC?

Enter ExAC, the largest known repository of human protein-coding regions. Made up of the exomes of over 60,000 individuals, ExAC provides unprecedented, open access to a volume of genetic data that is a data miner’s dream. The sheer size of the database means it captures most existing variants across global populations, even those occurring at extremely low frequencies.

The public is free to access this data at the ExAC Browser and search for a variant of interest simply by entering a gene name, a specific variant ID or a gene region. The browser returns a report that includes the sequencing coverage and the list of variants, along with their functional annotations and their frequencies in the database.

ExAC Report

Why ExACtly is ExAC such a big deal?

ExAC is a fantastic example of the sharing, aggregation and harmonization of data in research. Hopefully, it sets a trend for other research areas to follow: pooling data to adequately power analyses and answer questions that could not be answered previously.

Never before have researchers had unfettered access to such amounts of genetic data, allowing them to rigorously investigate the associations between gene variants and phenotypes. ExAC is large enough to capture most variants and provides confidence in interpreting the significance of specific variants in individuals. In addition, the size of ExAC provides scientists with the ability to analyse low-frequency protein-coding variants at a resolution not seen before.
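To make the idea of a "low-frequency" variant concrete, here is a minimal Python sketch of how a variant's frequency can be derived from two counts: how many times the variant allele was observed, and how many chromosomes were genotyped at that site. With exomes from about 60,000 people, each carrying two copies of most genes, there can be up to roughly 120,000 chromosomes per site. The variant names, the counts and the 0.1% rarity cut-off below are made up for illustration; they are not taken from ExAC, and this is not how the ExAC Browser itself is implemented.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    variant_id: str     # hypothetical label, not a real ExAC variant ID
    allele_count: int   # times the variant allele was observed
    allele_number: int  # chromosomes successfully genotyped at this site

    @property
    def allele_frequency(self) -> float:
        # Frequency = observations of the variant / chromosomes examined.
        if self.allele_number == 0:
            return 0.0
        return self.allele_count / self.allele_number


def rare_variants(variants, max_frequency=0.001):
    """Keep variants at or below an (assumed) rarity cut-off."""
    return [v for v in variants if v.allele_frequency <= max_frequency]


# Illustrative, invented counts for two hypothetical variants.
variants = [
    Variant("example-variant-A", allele_count=3, allele_number=120_000),
    Variant("example-variant-B", allele_count=6_000, allele_number=120_000),
]

for v in rare_variants(variants):
    print(f"{v.variant_id}: frequency {v.allele_frequency:.6f}")
```

The design choice here is deliberately simple: a variant seen in thousands of people in the general population is an unlikely culprit for a rare disease, so filtering by frequency is one common first step before looking at functional annotation. The right cut-off depends on the disease in question.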

In many CVMDs, such as diabetes and some cardiomyopathies, there is a significant genetic component associated with the manifestation of disease. ExAC now enables CVMD researchers to home in on genetic variants specifically associated with CVMDs, with more precision than ever before. Better understanding of a patient’s genetic composition may allow physicians to take a more personalised approach in managing CVMDs.

What ExACtly is next?

In the next year, as researchers continue to mine the available data and identify variants that play significant roles in disease, ExAC plans to expand its dataset to include over 120,000 exomes and 20,000 whole genome sequences. We hope that, as ExAC continues to grow, our understanding of how genes contribute to disease will grow with it. Knowing the specific genetic makeup of a patient may one day be all that is needed to personalise the treatment and management of disease, maximising efficacy while minimising risks and side effects.

ExAC is one of the best products of big data meeting biology!

