Extraction and evaluation of transcription factor gene-disease association

Genes are the basic blueprints used by the cells in our body. When a gene is altered, those cells can be affected; in the worst case, the result is disease. Researchers often face several candidate genes to study in relation to a particular disease, and choosing the genes with the greatest potential for discovery is important for making the best use of research resources. Around the world, researchers are studying thousands of different genes to understand their roles in health and disease. Their findings, in the form of abstracts and annotations, are captured in a variety of databases. However, this vast body of biomedical literature is under-utilized, and powerful computational biology methods are required to allow researchers to mine it.

Warren Cheung is developing an automated system that can examine the available biomedical literature and quantitatively determine which genes are most likely to be involved in a particular disease. The system will not only identify relevant previous findings; by integrating data and annotations from many studies, it is also expected to identify previously unknown associations between genes and diseases. Cheung’s research will initially focus on the involvement of transcription factor genes in brain diseases and cancer. However, the techniques developed and tested will be easily adaptable to all types of genes and diseases. Cheung’s award is jointly funded by MSFHR and the Down Syndrome Research Foundation. With the ability to automatically examine all the papers that have been published on genes and their functions, this system will make unbiased predictions and reveal previously unknown linkages. This promises to be a powerful tool for understanding genes and disease.