Serimmune, an early-stage start-up backed by Illumina Ventures and Merck, is applying a unique immune repertoire characterization platform to map human immunity. We are seeking a Bioinformatics Data Scientist versed in the application of machine learning and data mining approaches to large datasets in the life sciences. This individual will play a leading role in the company’s efforts to apply machine learning, data mining, and high-performance computing to reveal unique insights from the company’s proprietary database of thousands of immune profiles. This individual will have the opportunity to work on projects that make a meaningful impact on human health, including the identification of biomarkers of disease (cancer, autoimmunity, infection), prediction of response to therapy, and characterization of human immunity.
Tasks and Responsibilities:
• Analyze assay data to develop robust methods for quality control and data analysis.
• Design and apply computational methods, software, and statistics to analyze and mine data to generate actionable insights.
• Develop and code algorithms and applications to interrogate large peptide sequence datasets.
• Design schema for large sequence repositories and work closely with scientists to design experiments and computational/data mining methods.
• Interpret and communicate insights and findings to research and development colleagues and strategic partners.
Serimmune, Inc. is an equal opportunity employer.
DISCLAIMER: The information in this description is intended to indicate the general nature and level of work. It is not intended to be a comprehensive inventory of all duties and responsibilities of an employee in this job.
Minimum Qualifications (Must have):
• Ph.D. or M.S. degree in computer science, bioinformatics, computational biology, molecular biology, genomics, or a related scientific field (or equivalent experience and depth).
• Minimum 1 year post-graduate or industry experience in technology / life science environments.
• Experience with next-generation sequencing (NGS), proteomic, genomic, or transcriptomic data analysis.
• Solid knowledge of languages such as Python, Java, and/or C++, and high programming proficiency in a Unix/Linux environment.
• Experience applying statistical approaches to the analysis and interpretation of NGS, proteomic, or genomic data.
• Ability to understand subtleties of different analytical methods and impacts on data generation and interpretation.
• Knowledge of external sequence databases and resources.
• Excellent oral and written communication skills.
• Experience in the development of clinical (LDT/IVD) assays.
• Experience with high-performance computing and cloud computing.
• Experience with structure-based computational tools and methods.
• Office environment (Santa Barbara, CA or Bay Area)
• Prolonged periods of sitting or standing
• Must be able to lift and transport at least 25 pounds
• Ability to travel