Thesis defense: Thibault Robin


Mr. Thibault Robin will defend, in English, his thesis for the degree of Doctor of Science in bioinformatics, entitled:

Development of Bioinformatics Tools and Workflows for the Analysis of Cell Line Data

Date: Monday, 22 June 2020 at 2:00 p.m.

Venue: Zoom

 Thesis committee:

  • Dr. Frédérique Lisacek 
  • Prof. Amos Bairoch
  • Dr. Barbara Parodi
  • Prof. Bernd Wollscheid

Cell lines have become an essential asset for biomedical research thanks to the numerous practical advantages they offer over other types of cell cultures. They are used in a wide range of experiments across the omics fields, which have entered a high-throughput era in recent years with the emergence of new technologies and instrumentation. Cell line cross-contamination and misidentification are, however, known to undermine the reliability and reproducibility of experimental results. The large-scale generation of biological data also raises many bioinformatics challenges, mainly concerning formats, storage, representation, and interpretation. This thesis focuses on the development of bioinformatics software to address these problems, providing the scientific community with tools for the analysis and interpretation of omics data. Three distinct applications were developed in the course of this thesis, which led to the publication of four related scientific articles.

CLASTR is a web application that provides researchers with a reliable online service to authenticate the cell lines they are working with. It performs similarity searches against the short tandem repeat (STR) profiles stored in the Cellosaurus online resource. It offers both an intuitive web user interface and an efficient application programming interface. Numerous search parameters are available, and a dedicated research article was written to detail the impact of their choice on the resulting identifications. CLASTR represents a significant effort to facilitate and democratize cell line authentication by STR profiling, with the goal of curbing the spread of misidentified and cross-contaminated cell lines in the scientific literature.
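As an illustration of what an STR similarity search involves, the sketch below scores a query profile against reference profiles using a Tanimoto-style ratio of shared alleles. This is only a minimal example under stated assumptions: the abstract does not specify which scoring scheme or parameters CLASTR applies, and the function names, cell line names, and allele values are hypothetical.

```python
# Illustrative sketch only: not CLASTR's implementation.
# A profile maps an STR marker name to the set of alleles observed at it.

def tanimoto_score(query, reference):
    """Shared alleles / (query alleles + reference alleles - shared alleles),
    computed over the markers present in both profiles (an assumption)."""
    shared = n_query = n_reference = 0
    for marker in query.keys() & reference.keys():
        q_alleles, r_alleles = query[marker], reference[marker]
        shared += len(q_alleles & r_alleles)
        n_query += len(q_alleles)
        n_reference += len(r_alleles)
    denominator = n_query + n_reference - shared
    return shared / denominator if denominator else 0.0


def rank_references(query, references):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, tanimoto_score(query, profile))
              for name, profile in references.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical profiles, made up for the example.
query = {"D5S818": {"11", "12"}, "TH01": {"7"}, "TPOX": {"8", "12"}}
references = {
    "line_A": {"D5S818": {"11", "12"}, "TH01": {"7", "9.3"}, "TPOX": {"8"}},
    "line_B": {"D5S818": {"10"}, "TH01": {"6"}, "TPOX": {"9"}},
}

for name, score in rank_references(query, references):
    print(f"{name}: {score:.2%}")
```

Restricting the score to markers shared by both profiles is one possible design choice among several; how missing markers and the amelogenin locus are handled is exactly the kind of search parameter whose effect on identifications needs to be evaluated.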

MzVar is a desktop application designed for the compilation of customized variant protein and peptide databases. In the database search approach to peptide identification, only sequences that are present in the database can be identified. Biologically relevant sequence variants may consequently be missed if they are not included in the searched database. This tool was used in a subsequent study to identify sequence variants in proteomics tandem mass spectrometry data from the HeLa cell line, while attempting to assess their influence on protein expression and stability.
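To make the idea of a variant peptide database concrete, the sketch below applies a single amino acid substitution to a protein sequence, digests the result in silico with a simplified trypsin rule, and returns the peptide carrying the variant. It is a hedged illustration, not MzVar's actual procedure: the function names, the toy sequence, and the cleavage rule (cut after K or R unless followed by P, no missed cleavages) are assumptions made for the example.

```python
# Illustrative sketch only: not MzVar's implementation.

def apply_substitution(sequence, position, ref, alt):
    """Replace residue `ref` at 1-based `position` with `alt`."""
    index = position - 1
    if sequence[index] != ref:
        raise ValueError(f"expected {ref} at position {position}, "
                         f"found {sequence[index]}")
    return sequence[:index] + alt + sequence[index + 1:]


def tryptic_peptides(sequence):
    """Simplified trypsin digest: cleave after K or R unless followed by P.
    Returns (0-based start, peptide) pairs; missed cleavages are ignored."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            peptides.append((start, sequence[start:i + 1]))
            start = i + 1
    peptides.append((start, sequence[start:]))
    return peptides


def variant_peptide(sequence, position, ref, alt):
    """Return the tryptic peptide of the mutated protein that covers the variant."""
    mutated = apply_substitution(sequence, position, ref, alt)
    index = position - 1
    for start, peptide in tryptic_peptides(mutated):
        if start <= index < start + len(peptide):
            return peptide
    raise ValueError("variant position not covered by any peptide")


# Toy sequence and substitutions, made up for the example.
protein = "MKTAYIAKQRQISFVK"
print(variant_peptide(protein, 4, "A", "V"))
print(variant_peptide(protein, 8, "K", "T"))
```

Digesting the mutated sequence rather than the reference one matters because a substitution can create or remove a cleavage site, as in the second call, where replacing a lysine merges two peptides into one.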

GlyConnect Compozitor is a web application that enables researchers to explore the content of the GlyConnect database in the form of glycan composition graphs. GlyConnect contains a wealth of information about glycoproteins and glycosylation. With the recent shift towards glycoproteomics experiments, a larger proportion of publications features detailed glycosylation site information but lacks fully resolved glycan structures. GlyConnect Compozitor aims to bridge this gap by allowing the visualization and comparison of glycan compositions in relation to a range of other biological entities. Glycan compositions of interest can be exported in different formats to be used as glycan composition files for subsequent glycoproteomics searches.
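One way to picture a glycan composition graph is sketched below: each composition is a node, and two nodes are linked when they differ by exactly one monosaccharide. This linking rule, the function names, and the example compositions are assumptions chosen for illustration; the abstract does not describe how GlyConnect Compozitor builds its graphs.

```python
# Illustrative sketch only: not GlyConnect Compozitor's implementation.
from itertools import combinations


def differ_by_one(a, b):
    """True if compositions a and b differ by exactly one monosaccharide unit."""
    residues = set(a) | set(b)
    differences = [a.get(r, 0) - b.get(r, 0) for r in residues]
    nonzero = [d for d in differences if d != 0]
    return len(nonzero) == 1 and abs(nonzero[0]) == 1


def composition_graph(compositions):
    """Return the edges (pairs of labels) linking compositions one unit apart."""
    return [(label_a, label_b)
            for (label_a, comp_a), (label_b, comp_b)
            in combinations(compositions.items(), 2)
            if differ_by_one(comp_a, comp_b)]


# Example compositions keyed by a short label (monosaccharide -> count).
compositions = {
    "H5N2": {"Hex": 5, "HexNAc": 2},
    "H6N2": {"Hex": 6, "HexNAc": 2},
    "H5N4": {"Hex": 5, "HexNAc": 4},
    "H5N4F1": {"Hex": 5, "HexNAc": 4, "Fuc": 1},
}

for edge in composition_graph(compositions):
    print(" - ".join(edge))
```

In a graph of this kind, compositions with no neighbors stand out as gaps in an otherwise connected series, which is one reason a graph view can be more informative than a flat list.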