MINERS Working Group

Description:
Data Mining Team of LIMOS

Communication officer:
MBOUOPDA Michael Franklin

Photographs from IJCAI 2020 - Yokohama (virtual) - Jan. 18, 2021 - GENERAL

 Sit in the garden

Three papers accepted at the national conference EGC'2020 - Nov. 30, 2020 - PUBLICATION

GPoID : Extraction de Motifs Graduels pour les Bases de Données Imprécises

By: Michael Chirmeni Boujike, Jerry Lonlac, Norbert Tsopze and Engelbert Mephu Nguifo 

Apport de l'entropie pour les c-moyennes floues sur des données catégoriques (French version of Fuzz-IEEE'2020)

By: Abdoul Jalil Djiberou Mahamadou, Violaine Antoine, Engelbert Mephu Nguifo and Sylvain Moreno

Ontology-based data integration in a distributed context of coalition air missions

By: Karima Ennaoui, Mathieu Faivre, Md Shahriar Hassan, Christophe Rey, Lauren Dargent, Hervé Girod and Engelbert Mephu Nguifo

Accepted Paper at ICDMW 2020: Uncertain Time Series Classification with Shapelet Transform - Nov. 16, 2020 - PUBLICATION

Authors: Michael F. MBOUOPDA and Engelbert MEPHU NGUIFO

Abstract: Time series classification is a task that aims at classifying chronological data. It is used in a diverse range of domains such as meteorology, medicine and physics. In the last decade, many algorithms have been built to perform this task with very appreciable accuracy. However, applications where time series have uncertainty have been under-explored. Using uncertainty propagation techniques, we propose a new uncertain dissimilarity measure based on Euclidean distance. We then propose the uncertain shapelet transform algorithm for the classification of uncertain time series. The extensive experiments we conducted on state-of-the-art datasets show the effectiveness of our contribution. The source code of our contribution and the datasets we used are all available on a public repository.
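As an illustration of the idea of propagating uncertainty through a Euclidean-style dissimilarity, here is a minimal sketch. It is not the formula from the paper: it assumes each observation comes with an independent standard uncertainty and applies textbook first-order (Gaussian) error propagation to the squared Euclidean distance. The function name and argument layout are hypothetical.

```python
def uncertain_squared_euclidean(x, dx, y, dy):
    """Squared Euclidean distance between two series with a propagated uncertainty.

    x, y   : observed values of the two series (same length)
    dx, dy : assumed independent standard uncertainties on each value
    Returns (distance, uncertainty). First-order propagation only; this is an
    illustrative assumption, not the measure defined in the paper.
    """
    if not (len(x) == len(dx) == len(y) == len(dy)):
        raise ValueError("all inputs must have the same length")
    value = 0.0
    variance = 0.0
    for xi, dxi, yi, dyi in zip(x, dx, y, dy):
        diff = xi - yi
        value += diff * diff
        # Partial derivatives of diff**2 are +2*diff (w.r.t. xi) and -2*diff (w.r.t. yi).
        variance += (2.0 * diff) ** 2 * (dxi ** 2 + dyi ** 2)
    return value, variance ** 0.5


if __name__ == "__main__":
    d, u = uncertain_squared_euclidean([1, 2, 3], [0.1] * 3, [1, 4, 6], [0.1] * 3)
    print(d, u)
```

A dissimilarity of this kind returns a pair rather than a single number, so any classifier built on top of it has to decide how to use both the best guess and its uncertainty.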


Model overview



A novel algorithm for searching frequent gradual patterns from an ordered data set - Oct. 8, 2020 - PUBLICATION

Accepted Paper at WUML2020 (workshop at ECMLPKDD 2020): Classification of Uncertain Time Series by Propagating Uncertainty in Shapelet Transform - July 24, 2020 - PUBLICATION

Authors: Michael F. MBOUOPDA and Engelbert MEPHU NGUIFO

Abstract: Time series classification is a task that aims at classifying chronological data. It is used in a diverse range of domains such as meteorology, medicine and physics. In the last decade, many algorithms have been built to perform this task with very appreciable accuracy. However, the uncertainty in data is not explicitly taken into account by these methods. Using uncertainty propagation techniques, we propose a new uncertain dissimilarity measure based on Euclidean distance. We also show how to classify uncertain time series using the proposed dissimilarity measure and shapelet transform, one of the best time series classification methods. An experimental assessment of our contribution is done on the well-known UCR dataset.
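For readers unfamiliar with the shapelet transform mentioned in the abstract, the sketch below shows its generic (certain-data) form: each series is mapped to a feature vector whose entries are the smallest distance between a shapelet and any same-length window of the series. The squared Euclidean distance and the function names are assumptions made for illustration; the uncertain variant described in the paper is not reproduced here.

```python
def min_subsequence_distance(series, shapelet):
    """Smallest squared Euclidean distance between the shapelet and any window of the series."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = float("inf")
    for start in range(len(series) - m + 1):
        d = sum((series[start + j] - shapelet[j]) ** 2 for j in range(m))
        best = min(best, d)
    return best


def shapelet_transform(dataset, shapelets):
    """Map every series to one feature per shapelet (its minimum window distance)."""
    return [[min_subsequence_distance(s, sh) for sh in shapelets] for s in dataset]


if __name__ == "__main__":
    features = shapelet_transform([[0, 1, 2, 3, 2, 1]], [[2, 3, 2], [5, 5]])
    print(features)
```

The resulting feature vectors can then be passed to any standard classifier, which is what makes the transform convenient to combine with a custom dissimilarity measure.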

Accepted Paper at FUZZ-IEEE2020: Categorical fuzzy entropy c-means - May 8, 2020 - PUBLICATION

Authors: Abdoul Jalil Djiberou Mahamadou, Violaine Antoine, Engelbert Mephu Nguifo and Sylvain Moreno

Abstract: Hard and fuzzy clustering algorithms are part of the partition-based clustering family. They are widely used in real-world applications to cluster numerical and categorical data. While in hard clustering an object is assigned to a cluster with certainty, in fuzzy clustering an object can be assigned to different clusters given a membership degree. For both types of method an entropy can be incorporated into the objective function, mostly to avoid solutions raising too much uncertainty. In this paper, we present an extension of a fuzzy clustering method for categorical data using fuzzy centroids. The new algorithm, referred to as Categorical Fuzzy Entropy (CFE), integrates an entropy term in the objective function. This allows a better fuzzification of the cluster prototypes. Experiments on ten real-world data sets and statistical comparisons show that the new method can efficiently handle categorical data.
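To give a feel for what an entropy term does in a fuzzy clustering objective, here is a minimal sketch of the classical entropy-regularized membership update, in which memberships become a softmax of negative dissimilarities. The CFE update rules themselves are not given in the abstract, so everything below is an assumption for illustration: the fuzzy centroids are represented as per-attribute category weights summing to 1, the dissimilarity is the total weight placed on non-matching categories, and `lam` is a hypothetical regularization parameter.

```python
import math


def fuzzy_centroid_distance(obj, centroid):
    """Dissimilarity between a categorical object and a fuzzy centroid.

    centroid is a list of dicts (one per attribute) mapping category -> weight,
    with the weights of each attribute assumed to sum to 1.
    """
    return sum(1.0 - weights.get(value, 0.0) for value, weights in zip(obj, centroid))


def entropy_memberships(obj, centroids, lam=1.0):
    """Memberships minimizing sum_k u_k * d_k + lam * sum_k u_k * log(u_k) with sum_k u_k = 1."""
    dists = [fuzzy_centroid_distance(obj, c) for c in centroids]
    dmin = min(dists)  # shift for numerical stability; does not change the result
    scores = [math.exp(-(d - dmin) / lam) for d in dists]
    total = sum(scores)
    return [s / total for s in scores]


if __name__ == "__main__":
    centroids = [
        [{"red": 0.8, "blue": 0.2}, {"small": 1.0}],
        [{"red": 0.1, "blue": 0.9}, {"small": 0.5, "large": 0.5}],
    ]
    print(entropy_memberships(("red", "small"), centroids, lam=1.0))
```

A larger `lam` pushes the memberships toward uniform, while a small `lam` makes the assignment nearly hard.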

Accepted Paper at CNIA2020: Classification des Séries Temporelles Incertaines par Transformation Shapelet - May 6, 2020 - PUBLICATION

Authors: Michael Franklin MBOUOPDA and Engelbert MEPHU NGUIFO

Abstract: Time series classification is a task that consists of classifying chronological data. It is used in various domains such as meteorology, medicine and physics. Several effective techniques have been proposed over the last ten years to accomplish this task. However, they do not explicitly take into account the uncertainty in the data. Using uncertainty propagation, we propose a new uncertain dissimilarity measure based on Euclidean distance. We also show how to classify uncertain time series by coupling this measure with the shapelet transform method, one of the best-performing methods for this task. An experimental evaluation of our contribution is carried out on the UCR time series repository.

Accepted Paper at FUZZ-IEEE 2019: Evidential clustering for categorical data - May 6, 2020 - PUBLICATION

Authors: A. J. Djiberou Mahamadou, V. Antoine, G. J. Christie and S. Moreno

Abstract: Evidential clustering methods assign objects to clusters with a degree of belief, allowing for better representation of cluster overlap and outliers. Based on the theoretical framework of belief functions, they generate credal partitions which extend crisp, fuzzy and possibilistic partitions. Despite their ability to provide rich information about the partition, no evidential clustering algorithm for categorical data has yet been proposed. This paper presents a categorical version of ECM, an evidential variant of k-means. The proposed algorithm, referred to as cat-ECM, considers a new dissimilarity measure and introduces an alternating minimization scheme in order to obtain a credal partition. Experimental results with real and synthetic data sets show the potential and the efficiency of cat-ECM for clustering categorical data.

NeuroDeRisk - Semi-annual meeting - April 28, 2020 - SEMINAR

Semi-annual face-to-face meeting of the European project NeuroDeRisk, initially planned in Brussels but held by web conference because of COVID-19.
A meeting to discuss the deliverables of the last six months and the upcoming ones.

New article published in the journal Pattern Recognition - March 20, 2020 - PUBLICATION

2020 doctoral sash ceremony - Feb. 4, 2020 - SEMINAR

Our new doctors in computer science, Dr. Angeline PLAUD and Dr. Jocelyn DE GOËR, both supervised by Prof. Engelbert MEPHU NGUIFO

Next meeting speakers

Thursday, 14 January 2021, 2:00 pm

Speakers: Sarah ZOUININA

Topics: Data anonymization through microaggregation.

Poster from the DAPPEM project

By HOSSAIN Sheikh Imran


MINERS team during COVID-19