Title: Applications of machine learning methods in genomic data analysis
Speaker: Yanni Sun (Professor, City University of Hong Kong)
Date: Friday, May 24, 2019
Time: 3:00 - 3:30 PM
Venue: Data Science and [...], A201
Host: Professor Weigang Wu (吴维刚)
Abstract: Known as the blueprint of life, the genomic sequence contains instructions for controlling a species’ growth, development, survival, and reproduction. Next-generation sequencing (NGS) technologies, which produce vast amounts of sequencing data for various life forms, have provided tremendous information for tackling grand challenges, from finding more effective treatments for human diseases to improving biofuel energy production.
Many fundamental research problems in analyzing NGS data can be formulated as classification and clustering problems in machine learning. Both conventional models such as hidden Markov models and newer models such as deep learning models can be applied to tackle these problems. In this talk, I will introduce several applications of machine learning models/methods in genomic data analysis and highlight the challenges.
I will then focus on a particular research topic: using these methods to characterize viral populations from clinical samples. Many clinically important RNA viruses, such as HIV, HCV, SARS-CoV, and influenza, have a high mutation rate during replication and thus form a population of related but different viral strains, referred to as a quasispecies. Reconstructing each strain sequence is highly important for the development of clinical prevention and treatment. I will present our work on effective reconstruction of all haplotypes in a quasispecies using NGS data. It has many applications in precision medicine, ecology, and vaccine design.
Speaker bio: Yanni Sun is an Associate Professor in Electronic Engineering at City University of Hong Kong. Before relocating to Hong Kong, she was an Associate Professor in the Department of Computer Science and Engineering at Michigan State University, USA. She received her B.S. and M.S. degrees, both in Computer Science, from Xi'an Jiaotong University (China), and her Ph.D. degree in Computer Science from Washington University in St. Louis, USA, in 2008. She works in bioinformatics and computational biology. Her recent research interests include sequence analysis, the application of machine learning and data mining models to next-generation sequencing data, metagenomics, protein domain annotation, and noncoding RNA annotation. She received an NSF CAREER Award in 2010.