Research Statistical Analyst (Bioinformatics)

University of Texas MD Anderson Cancer Center
Epigenetics and Molecular Carcinogenesis
Houston, Texas, United States


Research Statistical Analyst (Bioinformatics) – Epigenetics and Molecular Carcinogenesis
This position is in the Department of Epigenetics and Molecular Carcinogenesis and will be located at the Houston MD Anderson campus in the Texas Medical Center.

The mission of The University of Texas MD Anderson Cancer Center is to eliminate cancer in Texas, the nation, and the world through outstanding programs that integrate patient care, research and prevention, and through education for undergraduate and graduate students, trainees, professionals, employees and the public.

The primary purpose of the Statistical Analyst position is to carry out data preparation responsibilities, develop statistical programs and perform statistical analysis. The position impacts the data analysis and data management aspects of research in the Department of Epigenetics and Molecular Carcinogenesis.


Bioinformatics Support
Under supervision and in collaboration with researchers and faculty, compile, analyze and report statistical data for various projects including, but not limited to, manual and computer-aided data abstraction and evaluation, computerized imaging and bioinformatics. Utilize various database management systems. Interact regularly with the next-generation sequencing (NGS) core to troubleshoot and resolve technical problems impacting the quantity and quality of samples and data. Perform analysis on data produced from NGS systems (DNA-seq, RNA-seq, ChIP-seq, bisulfite sequencing, etc.). Maintain, benchmark and develop NGS-based applications and technologies to assist department faculty with research data analysis. Design and apply bioinformatics algorithms to elucidate global regulatory mechanisms by integrating genomic data. Leverage a wide range of bioinformatics and statistical skills with a focus on contributing to the NGS core and epigenetic research projects.

Expertise and Specialization
Draw on knowledge of and experience with both Windows and Linux operating systems, with emphasis on Linux and on shell scripting in a Bash environment or similar general-purpose programming languages. Develop high-quality scripts (R, Perl or Python) and programs (C/C++, and optionally Java) for genomic applications. Evaluate research project results and develop appropriate responses to resolve problems. Review scientific findings and articles. Evaluate and build upon third-party open-source bioinformatics and other software applications, and use public genome databases (TCGA, GEO, NCBI, EBI, UCSC, etc.). Apply bioinformatics algorithms and perform large-scale data analysis.

Research Support
Assist faculty in the development of new statistical methodology for measurement and analysis of data. Apply standard statistical methods, simulation models and statistical programming as needed. Collaborate with the supervisor and principal investigators (PIs) in biology labs to develop appropriate hypotheses and data analysis pipelines and to deliver robust, integrated analyses of NGS data. Assist in planning, developing, and executing computational tools and algorithms necessary to understand cellular and molecular mechanisms of carcinogenesis, cancer epigenetics, and other biological processes. Assess relevant literature as well as existing data, evaluate the quality of data used in reports, and assist with preparation and distribution of data for committee and scientific meetings. Represent the bioinformatics program at conferences and meetings. Compile, write, and submit or present progress reports to the supervisor and PIs.
Other duties as assigned.


Education: Master’s degree in Biostatistics, Statistics, Bioinformatics, Mathematics or related field.

Preferred Education: PhD in Biostatistics, Statistics, Bioinformatics, Mathematics or related field.

Certification: None

Preferred Certification: None

Preferred Experience: Two years of experience with bioinformatics analysis of NGS genomic data. Experience performing analysis on data produced from NGS systems (DNA-seq, RNA-seq, ChIP-seq, bisulfite sequencing, etc.) and maintaining, benchmarking, and developing NGS-based applications and technologies for research data analysis.

Start date

As soon as possible

How to Apply


Bin Liu