Date: Dec. 20, 2019, 9 a.m. - ZOGHLAMI Manel - Salle du conseil
Multiple instance learning for sequence data: Application on bacterial ionizing radiation resistance prediction
In the Multiple Instance Learning (MIL) problem for sequence data, the instances inside the bags are sequences. In some real-world applications, such as bioinformatics, comparing an arbitrary pair of sequences makes no sense. In fact, each instance may have a structural and/or functional relationship with instances of other bags. Thus, the classification task should take this across-bag relationship into account. In this thesis, we present two novel MIL approaches for sequence data classification, named ABClass and ABSim. ABClass extracts motifs from related instances and uses them to encode sequences. A discriminative classifier is then applied to compute a partial classification result for each set of related sequences. ABSim uses a similarity measure to discriminate the related instances and to compute a score matrix. For both approaches, an aggregation method is applied to generate the final classification result. We applied both approaches to the problem of predicting bacterial ionizing radiation resistance. The experimental results were satisfactory.
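As a rough illustration of the ABSim idea described in the abstract, the following Python sketch compares each instance of an unlabeled bag only with its related instances in labeled bags, stores the similarities in a score matrix, and aggregates the rows into one prediction. Everything concrete here is an assumption made for illustration and is not taken from the thesis: related instances are assumed to share a key across bags, `difflib.SequenceMatcher` stands in for the similarity measure, majority voting stands in for the aggregation method, and the sequences and labels are toy data.

```python
from collections import Counter
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; the thesis may use a different measure."""
    return SequenceMatcher(None, a, b).ratio()


def score_matrix(query_bag, train_bags):
    """Rows: instances of the query bag. Columns: training bags.

    Each query instance is compared only with the related instance
    (assumed here to be the one stored under the same key) of each bag.
    """
    return {
        key: {
            bag_id: similarity(seq, bag[key])
            for bag_id, bag in train_bags.items()
            if key in bag
        }
        for key, seq in query_bag.items()
    }


def predict(query_bag, train_bags, labels):
    """Aggregate rows by majority vote (an assumed aggregation rule).

    Each row votes for the label of its most similar training bag.
    """
    votes = Counter()
    for row in score_matrix(query_bag, train_bags).values():
        if row:
            best_bag = max(row, key=row.get)
            votes[labels[best_bag]] += 1
    return votes.most_common(1)[0][0] if votes else None


# Toy data (hypothetical): two labeled bags and one query bag.
train_bags = {
    "B1": {"p1": "ACGTACGT", "p2": "TTGACCA"},
    "B2": {"p1": "GGGTTTAA", "p2": "CCCAGGA"},
}
labels = {"B1": "resistant", "B2": "sensitive"}
query_bag = {"p1": "ACGTACGA", "p2": "TTGACCT"}

prediction = predict(query_bag, train_bags, labels)
```

Restricting comparisons to instances that share a key is what encodes the across-bag relationship in this sketch; a real setting would derive that relationship from structural or functional knowledge about the sequences.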
Keywords: multiple instance learning, sequence data classification, prediction of bacterial ionizing radiation resistance.
Dr. Marie-Dominique DEVIGNES, CNRS, LORIA, France, Reviewer
Pr. Faten CHAIEB, University of Carthage, Tunisia, Reviewer
Dr. Jean SALLANTIN, CNRS, LIRMM, France, Examiner
Pr. Khedija AROUR, University of Carthage, Tunisia, Examiner
Pr. Engelbert MEPHU NGUIFO, University Clermont Auvergne, France, Advisor
Pr. Amel BORGI, University of Tunis El Manar, Tunisia, Advisor
Dr. Sabeur ARIDHI, University of Lorraine, France, Co-advisor
Pr. Mondher MADDOURI, University of Jeddah, Saudi Arabia, Co-advisor