Bioinformatics for cancer vaccine and immunotherapy discovery

University of Edinburgh/University of Gdansk
Department of Chemistry
Gdansk, Poland


The combination of immunotherapy and cancer vaccines promises to both relieve immuno-suppressive features critical to cancer and promote tumour death through

A current paradigm in cancer therapeutics involves stimulating the patient’s own immune system to eradicate the tumour. It is well established that immunosurveillance sculpts the immunogenic phenotype of cancer. The discovery of co-stimulatory and co-inhibitory pathways controlling the immune response has led to the development of immune checkpoint inhibitors, such as anti-PD-1 antibodies, which can counter the ‘escape’ of cancers from immunity. Their clinical efficacy is generally associated with a high mutational load, which may in turn reflect the presence of tumour-associated antigen (TAA)-specific T cells. However, only a minority of patients with cancer respond to this treatment modality, which might reflect a low prevalence of such T cells.

Expansion of TAA-specific T cells via vaccination could potentially increase the rate of clinical responses to immune checkpoint inhibitors. Cancer cells exhibit a unique landscape of mutant antigens that could function as immune stimulatory vaccines. However, choosing which set of these ‘neo-antigens’ to pursue for vaccine development remains a challenge. Specifically, an understanding of which of the mutations identified by genomics will reliably be presented as mutated antigens on MHC class I is urgently needed.

Project 1: Comprehensive characterization of antigen presentation in cancer.

Introduction. The classic concept of self and non-self antigen discrimination by the immune system is focused on the recognition of neoantigen peptides presented by MHC molecules. Significant focus has been placed on studies of MHC class I/II antigen presentation, yet there are several other interesting strategies for antigen presentation available for study. MHC-class-I-like CD1 antigen-presentation molecules (CD1a, CD1b, CD1c, CD1d and CD1e) allow the immune system to recognize abundant and diverse lipid-containing antigens. These CD1 molecules bind and present amphipathic lipid antigens for recognition by T-cell receptors [1-3]. The MR1 protein binds to molecules derived from bacterial riboflavin biosynthesis and presents these to mucosal-associated invariant T (MAIT) cells for activation [4]. Yet, even though this function seems unrelated to cancer, the TCGA Pan-cancer studies indicate that MR1 is amplified in ~8% of cholangiocarcinomas, breast invasive carcinomas and liver cancers (cbioportal; [5]). Further, preliminary studies have implicated the endogenous presentation of MR1 ligands in tumour cells [6-8].

Project outline. We are searching for a computational biology student to address questions in systems immunology and cancer. In Stage 1, the student will continue the computational analysis of MHC class I/class II antigen presentation, but also develop computational pipelines in lipidomics and metabolomics for the mass-spectrometry-based study of CD1 and MR1 presented antigens. In Stage 2, the student will develop a database cataloguing the existing literature and datasets around lipid and metabolite antigen presentation. In Stage 3, we will work with Dr. Irena Dapic and a team of biophysicists on new mass-spectrometry technologies and support applications towards the characterization of MHC class I/II, CD1 and MR1 presented antigens. In parallel, ICCVS (Dapic) is developing a strategy for the immunoprecipitation of CD1 and MR1 antigen presenting cells.

Project 2: Comparative genomics of MHC class I antigen presentation and processing.

(PhD supervisory team: Fahraeus, Alfaro, Hupp, Dapic,

A. Introduction. Evolution of antigen processing and presentation (APP) reveals non-canonical links to protein translation. How the immune system distinguishes self from non-self antigens is largely a question of how evolution has solved the problem of finding a small number of non-self peptides within an ocean of self peptides. This characteristic relies on the ability of the immune system to detect non-self peptides on major histocompatibility (MHC) class I molecules, which forms an essential part of fighting infections and removing transformed cells. This is a mature research field, but some fundamental aspects regarding the production, processing and delivery of antigenic peptides are still unanswered. More recent results show potential sources of peptides derived from alternative mRNA translation events [1]. These have an important impact on antigen presentation and raise some interesting cell biological questions for which current models are insufficient to offer explanations. In particular, it appears that the antigen presentation machinery is closely linked to the translational machinery through some dedicated pathway. If this is the case, a careful evolutionary study of the antigen presentation and processing machinery alongside the translational pathway should provide support for co-evolution and help identify new targets for therapeutic interference with the production of neoantigen peptide substrates.

B. Project outline. We are applying for a computational biology doctoral student to address questions regarding the basic biology of antigen processing and presentation. The student will undertake a comparative genomics analysis across diverse species in order to identify a core set of genes linking the translational machinery to the antigen presentation and processing (APP) pathway. We will identify these links using two strategies: the first aimed at identifying co-evolving proteins and the specific sequence regions that co-evolve, and the second aimed at characterizing and expanding evolutionarily conserved modules involving proteins from both the translational machinery and the APP pathway.
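As a rough illustration of the first strategy, the sketch below uses phylogenetic profiling, one common way to screen for candidate co-evolving gene pairs. The project text does not name a method, so this choice is an assumption, and the gene names and presence/absence values are hypothetical placeholders rather than real data. Each gene is represented by a vector recording whether an ortholog is detected in each species, and cross-pathway pairs with highly correlated profiles are reported.

```python
from math import sqrt

# Hypothetical presence/absence profiles (1 = ortholog detected) over the same
# ordered list of eight species. Names and values are illustrative only.
APP_PROFILES = {
    "APP_gene_A": [1, 1, 0, 1, 0, 1, 1, 0],
    "APP_gene_B": [1, 0, 0, 1, 0, 1, 1, 0],
}
TRANSLATION_PROFILES = {
    "TRANSL_gene_X": [1, 1, 0, 1, 0, 1, 1, 0],
    "TRANSL_gene_Y": [0, 1, 1, 0, 1, 0, 0, 1],
}


def pearson(x, y):
    """Pearson correlation of two equal-length numeric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-occurrence signal
    return cov / sqrt(var_x * var_y)


def cross_pathway_pairs(app_profiles, translation_profiles, threshold=0.9):
    """Return (app_gene, translation_gene, r) for pairs with r >= threshold."""
    hits = []
    for app_gene, app_vec in app_profiles.items():
        for tr_gene, tr_vec in translation_profiles.items():
            r = pearson(app_vec, tr_vec)
            if r >= threshold:
                hits.append((app_gene, tr_gene, r))
    # Strongest candidates first.
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


if __name__ == "__main__":
    for app_gene, tr_gene, r in cross_pathway_pairs(APP_PROFILES, TRANSLATION_PROFILES):
        print(f"{app_gene}\t{tr_gene}\t{r:.2f}")
```

Correlated profiles only nominate candidates. A real analysis would also need to correct for shared phylogeny between species, and identifying specific co-evolving sequence regions would require alignment-based methods that this sketch does not cover.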

Project 3: Strategies for de novo sequencing

Neoantigen cancer therapeutics are expected to be of benefit in cancers with a defined mutagen load, chromosome instability or RNA processing error. We will focus training on developing proteogenomics platforms in sarcomas, oral adenocarcinomas and ovarian cancers, which are considered of high unmet clinical need. Datasets will be generated in collaboration with the University of Edinburgh and the University of Gdansk.

Year 1-2: Cancer neo-epitope discovery platform development. The first training stage for the PhD student involves neo-epitope discovery using matched tumour and normal patient samples. The student is expected to develop and apply a computational pipeline to identify mutated genes, mutant mRNA, RNA editing events, intron translation, and chromosomal fusions from next-generation DNA and RNA sequencing data. The student will create patient-specific protein reference databases and use them to characterize immunopeptidomes by mass spectrometry and standard immunoaffinity purification.
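To make the idea of a patient-specific protein reference database concrete, the following minimal sketch applies missense variants to reference protein sequences, writes the result as FASTA text, and lists the peptides that span each mutated residue. It is an illustration under stated assumptions, not the project's pipeline: the gene name, sequence, variant strings and all function names are hypothetical, and only single amino-acid substitutions are handled (RNA editing, intron translation and fusions would need additional logic).

```python
import re

# Hypothetical reference proteome and variant calls, invented for illustration.
REFERENCE = {"GENE1": "MKTAYIAKQRQISFVKSHFSRQ"}
VARIANTS = ["GENE1:p.A4V", "GENE1:p.K8E"]

# Matches strings such as "GENE1:p.A4V" (gene, reference residue, position, new residue).
VARIANT_RE = re.compile(r"^(\w+):p\.([A-Z])(\d+)([A-Z])$")


def apply_missense(reference, variant):
    """Return (gene, 1-based position, mutant sequence) for one missense variant."""
    match = VARIANT_RE.match(variant)
    if match is None:
        raise ValueError(f"Unsupported variant format: {variant}")
    gene, ref_aa, pos_text, alt_aa = match.groups()
    pos = int(pos_text)
    seq = reference[gene]
    # Guard against variants called on a different reference sequence.
    if not 1 <= pos <= len(seq) or seq[pos - 1] != ref_aa:
        raise ValueError(f"Variant {variant} does not match the reference")
    return gene, pos, seq[:pos - 1] + alt_aa + seq[pos:]


def spanning_peptides(mutant_seq, pos, length=9):
    """All peptides of the given length that contain the residue at 1-based pos."""
    index = pos - 1
    first_start = max(0, index - length + 1)
    last_start = min(index, len(mutant_seq) - length)
    return [mutant_seq[s:s + length] for s in range(first_start, last_start + 1)]


def build_patient_fasta(reference, variants):
    """FASTA text holding the reference proteins plus one entry per variant."""
    lines = []
    for gene, seq in reference.items():
        lines += [f">{gene}", seq]
    for variant in variants:
        gene, _pos, mutant = apply_missense(reference, variant)
        label = variant.split(":p.")[1]
        lines += [f">{gene}_{label}", mutant]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_patient_fasta(REFERENCE, VARIANTS))
    gene, pos, mutant = apply_missense(REFERENCE, VARIANTS[0])
    print(spanning_peptides(mutant, pos))
```

The default peptide length of 9 is a typical choice for MHC class I ligands; in practice a range of lengths would usually be enumerated.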

Year 3-4: Predicting neo-antigen presentation from genomics and transcriptomics. Project 1 will result in a large dataset of immunopeptidomes derived by mass-spectrometry alongside matching genomic and transcriptomic datasets. The student will use machine learning strategies to develop a predictor of neo-antigen presentation based on these and publicly available data. The goal will be to predict from genomics and transcriptomic datasets, which cancer-specific peptides will later be detected by mass-spectrometry as cell-surface antigens. The resulting model will be used to accelerate discovery and reduce costs for neo-antigen discovery.
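The prediction task can be framed as binary classification: given features derived from genomics and transcriptomics, predict whether a peptide is later detected by mass spectrometry. The sketch below uses a plain logistic regression trained by gradient descent purely as a simple stand-in. The project text only says "machine learning strategies", so the model, the two features (scaled expression and a scaled binding score) and the toy training values are all assumptions made for illustration, not real measurements.

```python
from math import exp

# Toy training set, invented for illustration: each row is
# (scaled expression, scaled binding score); label 1 = detected on the cell surface.
FEATURES = [
    (0.9, 0.8), (0.8, 0.9), (0.7, 0.7), (0.9, 0.6),
    (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4),
]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def predict_proba(model, x):
    """Predicted probability that a peptide with features x is presented."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(features, labels, learning_rate=1.0, epochs=3000):
    """Fit logistic regression by full-batch gradient descent; returns (weights, bias)."""
    n_samples, n_features = len(features), len(features[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(features, labels):
            error = predict_proba((weights, bias), x) - y
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        # Average the gradient over the batch and take one descent step.
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def accuracy(model, features, labels):
    """Fraction of samples classified correctly at a 0.5 probability cut-off."""
    correct = sum((predict_proba(model, x) >= 0.5) == bool(y)
                  for x, y in zip(features, labels))
    return correct / len(labels)


if __name__ == "__main__":
    model = train(FEATURES, LABELS)
    print("training accuracy:", accuracy(model, FEATURES, LABELS))
```

A realistic predictor would be evaluated on held-out patients rather than on its own training data, and would likely draw on richer features and models than this two-feature example.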

Resources: Students will be jointly supervised by Dr. Ted Hupp at the University of Edinburgh and Dr. Javier Alfaro at the University of Gdansk. Students will work closely with collaborators at both universities and validate their results alongside them; both sites are equipped with state-of-the-art facilities in mass spectrometry, virology, protein biochemistry and vaccine technology. Students will also have the opportunity to develop skills in machine learning and high-performance computing. The project has access to the Cyfronet Prometheus (~55,000 cores) and CI TASK Tryton (~38,000 cores) clusters, which are consistently represented among the top 500 supercomputers in the world. As the work is international in nature, students will have the opportunity to travel between Edinburgh and Gdansk at various points during their PhD.

Please note, you are encouraged to contact Dr. Ted Hupp and Dr. Javier Alfaro before submitting an application.


Desired skills:
• Previous experience in Genomics, Transcriptomics or Proteomics
• Previous project in bioinformatics
• Experience in R
• Experience in a scripting language (Perl or Python)

Start date

October 15, 2019

Start date

To be determined

How to Apply

For this role, we seek a computer scientist willing to become fluent in the terminology of genetics, genomics, proteomics and immunology. Alternatively, we seek a highly motivated molecular biologist eager to pick up skills in data analysis, including statistics, programming and high-performance computing. International, EU and UK students are welcome to apply.

Please send a cover letter and CV to myself and Ted Hupp:
Javier.alfaro [.a.t.]
Ted.Hupp [a.t.]


Javier A Alfaro and Ted Hupp