Using high-throughput experiments and machine learning to understand the role of non-coding mutations in cancer

Cancer is caused by mutations in DNA that make a patient’s cells grow out of control. Some of these cancer-causing mutations change how genes are regulated; that is, which genes are turned on or off in the cell. Essentially all cancers have activated the TERT gene because TERT is essential for cancer growth. We understand TERT regulation better than that of most genes, but even here we cannot predict how mutations alter TERT expression. Overall, we do not understand which genes or mutations can promote cancer via altered gene regulation. Our work aims to learn the code that cancer cells use to interpret regulatory mutations. We will make many artificial mutations at large scale and measure how much each mutation affects the amount of gene product made. We will use a computer to model how the cells interpret these mutations, and apply the model to find new cancer mutations. We will use these computer models to discover how often mutations alter gene regulation in cancer, and to highlight genes whose regulation is important in particular cancers. In the long term, our work will allow us to better diagnose and treat cancer by showing how the mutations in a particular patient’s tumor alter gene regulation and cancer growth.