We are launching a campus-wide initiative to build foundation models that simulate the evolution of tumor ecosystems. You will be the lead engineer contributing to large-scale generative modelling on single-cell, spatial-omics, and clinical data.
Core responsibilities:
Design, train, and deploy multi-modal foundation models for single-cell and spatial cancer data
Build scalable training pipelines in PyTorch/JAX on GPU clusters and cloud HPC/ADK
Implement data-efficient fine-tuning, adaptive learning workflows, and agentic frameworks for reasoning
Collaborate with machine learning experts and computational biologists to build tools for AI agents (e.g., libraries, MCPs, and APIs)
Required qualifications:
B.S./B.E. (minimum) in Computer Science, Biomedical/Electrical Engineering, Statistics, Bioinformatics, Applied Math, or related field
6+ years of experience in software engineering
3+ years of hands-on experience training generative AI or large language models at scale
Substantial expertise in training deep learning models and tuning large foundation models
Expertise in developing efficient data loaders for large datasets and optimizing training workflows
Deep knowledge of probabilistic modelling, self-supervised and representation learning, and generative architectures (diffusion, VAE, flow matching, transformers)
Strong Python, PyTorch/JAX, containerization & MLOps skills; familiarity with distributed training and modern experiment-tracking stacks
Experience with AI coding tools (e.g., Copilot, Cursor)
Preferred extras:
M.S. or graduate-level degree in relevant field
Experience with single-cell and spatial genomic or imaging data, and multimodal integration
Expertise in statistical causal discovery and inference
Publications or open-source contributions in generative models
Strong interest in applications and driving impact in cancer biology and immunology
Applications are accepted only through direct submission at: apply.interfolio.com/176464
Applications will be reviewed on a rolling basis until the roles are filled.