Machine learning-based prediction of protein-protein interactions

Wageningen University
Bioinformatics Group
Netherlands GE Wageningen


Are you looking for a PhD position, and would you like to participate in an exciting interdisciplinary research project to discover how proteins select their interaction partners? Are you a bioinformatician with an interest in machine learning and its application to predicting protein interactions? Then this PhD vacancy may be of interest to you.

Biological processes are controlled by transcription factor proteins (TFs), which interact in many different combinations to regulate gene expression. However, it is still largely unclear how the specificity of protein-protein interactions is encoded in protein sequence and structure, and how this determines which combinations of proteins with unique functions can be formed. Ultimately, this knowledge is important for steering and controlling biological processes, for example to investigate the causes of human diseases or to guide modern plant breeding approaches.

In the framework of the project ‘Machine learning-based prediction of transcription factor protein-protein interactions’, two PhD candidates will collaborate to develop a robust computational method to predict protein-protein interactions of TFs with complex evolutionary histories. The method will be developed based on the MADS-domain transcription factor family, which regulates important plant processes such as flowering and fruit development. One of the PhD candidates will work in the Bioinformatics Group; the other will work in the Plant Developmental Systems group of the Laboratory for Molecular Biology, performing wet-lab experiments aimed at gaining insight into the specific requirements for protein-protein interactions.

This vacancy applies only to the PhD position in the Bioinformatics Group at Wageningen University. This PhD student will develop a computational method, based on machine learning, to predict MADS-domain TF protein-protein interaction specificity. The method will be used to identify features that influence protein interaction and to identify amino acid residues essential for MADS-domain TF protein-protein interaction specificity. As such, the PhD student will work in close collaboration with the PhD student at Plant Developmental Systems, where input for the model will be generated and model predictions will be tested.

The research is embedded within the Bioinformatics Group, which focuses on fundamental and applied bioinformatics research in the green life sciences. In particular, the group develops and applies novel computational methods for the analysis and integration of –omics data. The group has a strong track record in genome analysis, algorithm development (including the application of machine learning to study protein function and interaction), and tool construction. It maintains many national and international collaborations with researchers studying plant development, biotechnology and genomics.


You have
- A successfully completed MSc degree in bioinformatics, biology, data science or a related discipline
- Demonstrable experience in applying machine learning to biological data
- Good statistical and mathematical skills
- Proficiency in programming (e.g. in Python)
- Affinity with molecular biology and an interest in collaborating with researchers in this discipline
- Perseverance in problem solving

You are an independent person, but also a team player who enjoys working in a multidisciplinary team.

For this position, your command of the English language is expected to be at C1 level. In some cases it is necessary to submit an internationally recognised Certificate of Proficiency in the English Language.

Start date

As soon as possible

How to Apply

For more information and an online application form, please visit


Aalt-Jan van Dijk