DECIDE

Scientific seminars

20/06/2024

Speaker: Hadi Khalilia, Google Scholar: https://scholar.google.com/citations?user=1ZAZctcAAAAJ&hl=fr&oi=ao

Title: A crowdsourcing approach to building diversity-aware multilingual lexicons

Abstract: Languages describe the world in diverse ways, as shown by phenomena such as language-specific terms and untranslatability. However, computational resources often fail to represent this diversity and are biased toward the conceptual space of dominant languages. My work proposes methods for enriching computational lexicons while reducing their bias toward the English language and Anglo-Saxon culture. I present a new approach that combines systematic enrichment with crowdsourcing methodologies in order to enrich lexico-semantic resources (LSRs) with content related to linguistic diversity.

I demonstrate the method through case studies on kinship terminology in seven Arabic dialects and three Indonesian languages, as well as between English and Arabic food terms.

Our results, available as browsable and downloadable computational resources, give a sense of the extent of linguistic diversity worldwide and illustrate the importance of addressing bias in LSRs to better support natural language processing applications.

Keywords: linguistic diversity, crowdsourcing

11/04/2024

Speaker: Arnold Hien, IdHAL: lobnury, Google Scholar: https://scholar.google.fr/citations?user=v3H7C0EAAAAJ

Title: Data mining and constraint programming

Abstract: In data mining problems, users can rely on constraints to find solutions that are interesting to them. These constraints let them customize data mining models by integrating knowledge and preferences. However, integrating these constraints into the initial problem is not always easy and can raise modeling and solving difficulties.
To address this difficulty, prior work has exploited the flexible declarative framework of constraint programming (CP) to ease the modeling and solving of data mining problems.
In this talk, I will present an example of interactive pattern mining that combines CP, pattern mining, and machine learning to extract patterns of interest to the user.

Keywords: data mining, pattern mining, constraint programming

14/03/2024

Speaker: Hugues Moreau, postdoctoral researcher, MATRIX team, ORCID: 0000-0002-0569-4190

Title: "Mobility and acoustics data analysis: deep learning based approaches and clustering"

Abstract: This talk gives a general overview of my work so far. My thesis applied deep learning to inertial sensor signals for mobility analysis. I will then discuss my postdoc on clustering mobility data using mixture-of-regressions models. Finally, I will briefly present the topic of my current postdoc at ENSTA Bretagne: predicting sediment type from acoustic data.

Keywords: deep learning, inertial sensors, mixture-of-regressions models.

15/02/2024

Speaker: Sasha Piccione, PhD student at the Ca' Foscari University of Venice (Italy) and IMT Atlantique (Brest).

Title: I'm politer than you just because I think I owe you

Abstract: The aim of this research is to investigate the factors that influence the tone (measured as arousal and valence) used by Wikipedians in Talk Page discussions. The research focuses on isolating contextual characteristics from demographic characteristics, drawing on the literature on social cues and mimicry. The ultimate objective is to determine the extent to which contextual factors influence the tone expressed in written comments.

Keywords: Online Open communities, stigmergy, sentiment analysis

08/02/2024

Speaker: Ebtissem Sassi, lecturer and researcher, ENSIBS (Lorient)

Abstract: This talk covers ongoing work on supply chain optimization and decision support, with a focus on digital twins: designing and deploying a digital twin of the short food supply chain.

18/01/2024

Speaker: Florian RASCOUSSIER, PhD student, DECIDE, Lab-STICC

Title: Predicting SSH keys in OpenSSH Memory dumps

Abstract: This work is the outcome of a final-year research project (Masterarbeit) aimed at advancing cybersecurity through machine learning. Its contribution lies in developing algorithms and training various learning models to predict the locations of SSH keys in OpenSSH heap memory dumps.

Keywords: graphs, SSH, key, embedding, machine learning, graph convolutional network, binary classification

11/01/2024

Speaker: Nicolas Jullien, professor, LEGO laboratory

Title: AI in digital comm(unitie)s. A socio-economics approach.

Abstract: Algorithmic management is blamed for its errors, which would have discriminatory effects. Does Wikipedia do better in that matter? Through the analysis of the management of the most important bot fighting vandalism on the Fr-Wikipedia, we show that 1) over-standardization and discrepancies are hardly avoidable in the long run; 2) this is an issue for any platform, as it decreases its creativity and thus its attractiveness; 3) counterbalancing this is a matter not of technical limitations but of socio-technical arrangements, namely developing human control and analysis of algorithmic decisions.

15/11/2023

Speaker: Shoko Wakamiya, Associate Professor at Nara Institute of Science and Technology, Social Computing Lab, Japan

Keywords: text data processing, social media / network analysis, social computing, health, ...

Title: Health-related Social Media Data Analysis and Applications

Summary: With the development of social media and smartphones, various crowd-based data are available, and some are linked to locations. In this talk, I will present some studies using web and social media data, which would contribute to health promotion and well-being. Also, I'll talk about the challenges in using these data and ongoing projects to address them.

09/11/2023

Speaker: Antti Knutas, associate professor, LUT University (Finland)

Title: “Software Engineering in Civic Technology: Introduction to the Field, Research Trajectory, and Lessons Learned”

Bio: Antti Knutas is an associate professor at the LUT University Department of Software Engineering and has spent one month at IMT Atlantique as a visiting researcher. His current research area is how grassroots civic tech communities design, create, and share software (see his webpage for details: https://anttiknutas.net).

20/10/2023

Speaker: Baptiste Alglave, associate professor, DECIDE team, Lab-STICC.

Title: "Integrating massive and heterogeneous spatio-temporal data to infer spatial processes. Marine ecology as field of application."

28/09/2023

Speaker: Gábor Bella, associate professor, DECIDE team, Lab-STICC

Title: "Semantic interoperability of health data"

05/07/2023

Speaker: Ba Huy Pham, PhD student, MATRIX team, Lab-STICC, ONERA

Title: "Tracking with an antenna array for 'around the corner' radar observation"

26/05/2023

Speaker: Alexandre Reiffers-Masson, associate professor, MATHS & NET team, Lab-STICC

Title : Stochastic Processes and Stochastic Algorithms for Distributed Systems

Abstract : In this talk, I will present my recent works on distributed systems. First, I will focus on Directed Acyclic Graphs (DAG)-based distributed ledgers. In distributed ledger technologies (DLTs) with a directed acyclic graph (DAG) data structure, a block-issuing node can decide where to append new blocks and, consequently, how the DAG grows. This DAG data structure is typically decomposed into two pools of blocks, dependent on whether another block already references them. The unreferenced blocks are called the tips. Due to network delay, nodes can perceive the set of tips differently, giving rise to local tip pools. In this series of works, we have proposed new mathematical models to capture the evolution of the number of tips under the presence of heterogeneous delay in the peer-to-peer network. I will show the different theoretical properties obtained, such as stability, ergodicity, and upper bound on the average number of tips.

Then, in the second part of the talk, I will consider the measurement model Y = AX where X and, hence, Y are random variables and A is an a priori known tall matrix. At each time instance, a sample of one of Y's coordinates is available, and the goal is to estimate E[X] via these samples. However, the challenge is that a small but unknown subset of Y's coordinates are controlled by adversaries with infinite power: they can return any real number each time they are queried for a sample. For such an adversarial setting, we propose the first asynchronous online algorithm that converges to E[X] almost surely. We prove this result using a novel differential inclusion based two-timescale analysis. Our algorithm can be used in decentralized scenarios, such as decentralized byzantine-robust gradient estimation.
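
As a rough illustration of the notion of tips from the first part of this abstract, the sketch below computes the unreferenced blocks of a small DAG, then recomputes them for a node that has not yet received one block. The block names, the `references` mapping, and the `tips` helper are hypothetical and chosen only for this example; they are not taken from the speaker's models.

```python
def tips(references):
    """Return the blocks that no other block references (the tips)."""
    referenced = {parent for parents in references.values() for parent in parents}
    return set(references) - referenced


# Hypothetical toy DAG: each block maps to the blocks it references.
full_view = {
    "genesis": [],
    "b1": ["genesis"],
    "b2": ["genesis"],
    "b3": ["b1", "b2"],
    "b4": ["b2"],
}

# A node that, because of network delay, has not yet received block b3.
delayed_view = {block: parents for block, parents in full_view.items() if block != "b3"}

global_tips = tips(full_view)    # tips as seen with every block delivered
local_tips = tips(delayed_view)  # local tip pool of the delayed node

print(sorted(global_tips))
print(sorted(local_tips))
```

The two printed sets differ, which is the sense in which delay gives rise to local tip pools.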

20/04/2023

Speaker: Esteban Bautista, postdoctoral researcher, DECIDE, Lab-STICC

Title: A frequency-structure decomposition for link streams

Abstract: A link stream is a set of triplets (t, u, v) modeling interactions over time, such as person u calling v at time t, or bank account u transferring money to v at time t. Effectively analyzing link streams is key for numerous applications. In practice, they are commonly studied as collections of graphs or time series, yet adapting the data for signal and graph methods often leads to unsatisfactory tradeoffs. This calls for methods dedicated to link streams. In this work, our goal is to develop a decomposition for link streams: breaking down a complex link stream into simpler ones that are easier to study. For this, we develop a novel structural decomposition that interacts well with time-series decompositions. We show that their combination allows us to decompose a link stream into simple structures oscillating at a specific frequency. Moreover, we show that this makes it easy to introduce filters for link streams, which can be useful in various settings.
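
To make the data model concrete, here is a minimal sketch of a link stream as a list of (t, u, v) triplets, together with the two conventional views mentioned in the abstract: a collection of graphs and a set of time series. The toy data and the helper names `as_graph_snapshots` and `as_time_series` are assumptions made for illustration; this does not implement the frequency-structure decomposition presented in the talk.

```python
from collections import defaultdict

# Hypothetical toy link stream: (t, u, v) means u interacted with v at time t.
link_stream = [
    (0, "a", "b"),
    (1, "a", "c"),
    (2, "a", "b"),
    (5, "b", "c"),
    (6, "a", "b"),
]


def as_graph_snapshots(stream, window):
    """Group links into one edge set per time window (collection-of-graphs view)."""
    snapshots = defaultdict(set)
    for t, u, v in stream:
        snapshots[t // window].add((u, v))
    return dict(snapshots)


def as_time_series(stream, horizon):
    """Count the interactions of each link (u, v) at each time step (time-series view)."""
    series = defaultdict(lambda: [0] * horizon)
    for t, u, v in stream:
        series[(u, v)][t] += 1
    return dict(series)


snapshots = as_graph_snapshots(link_stream, window=4)
series = as_time_series(link_stream, horizon=7)

print(snapshots)
print(series)
```

Each view discards something: the snapshots lose the timing inside a window, and the per-link series lose the graph structure connecting the links.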

02/03/2023

Speaker: Erwan Alincourt, lieutenant (lieutenant de vaisseau) in the French Navy, PhD student, DECIDE, Lab-STICC

Title: Mapping, anomaly detection, and post-detection response for industrial systems: application to military and civilian ships

22/12/2022

Speaker: Patrick Meyer, professor, DECIDE, Lab-STICC

Title: "A genetic algorithm for learning the parameters of an SRMP preference model".

Abstract: In the domain of Multiple Criteria Decision Aiding, decision makers are faced with problems involving multiple conflicting criteria. Preference models are used to reach a decision in such situations. To tune the parameters of those models, preference elicitation algorithms are used, generally taking so-called holistic judgments as inputs. In this work, we focus on a specific preference model called ranking based on multiple reference profiles. In the literature, mixed-integer linear programming and constraint programming techniques have already been proposed to tune the model parameters. However, these approaches have difficulty handling realistic large-scale problems. We propose an evolutionary metaheuristic to address this issue, which we test through extensive numerical experiments to highlight its performance and limits. We show that the proposed metaheuristic reproduces the learning inputs very well while retaining strong generalization power.

12/05/2022

Speaker: Quentin Perrachon, PhD student, DECIDE, Lab-STICC

Paper presented: Quentin Perrachon, Alexandru-Liviu Olteanu, Marc Sevaux. PPC pour un problème d'ordonnancement industriel : Multi-Resource Flexible Job Shop. 23ème congrès annuel de la Société Française de Recherche Opérationnelle et d'Aide à la Décision, INSA Lyon, Feb 2022, Villeurbanne - Lyon, France. ⟨hal-03595382⟩

Abstract: Hérakles develops and distributes an ERP-GPAO (ERP and computer-aided production management) system throughout France. Its customers are mainly very small, small, and medium-sized manufacturers. Hérakles wants to offer its customers intelligent scheduling solutions. As part of a CIFRE thesis carried out jointly by Hérakles and the DECIDE team of the Lab-STICC laboratory, we present a first solution method for industrial shop scheduling problems that match the needs of most Hérakles customers.

24/03/2022

Speaker: Antoine Mallégol, PhD student, DECIDE, Lab-STICC

Title: "Multi-objective optimization of multi-energy systems: mathematical model and study of different linearization methods"

Paper: https://hal.archives-ouvertes.fr/hal-03595359

16/09/2021

Speaker: Yannis Haralambous, associate professor, DECIDE, Lab-STICC

Paper presented: Haralambous, Yannis, and Tian Tian. "Tailoring a controlled language out of a corpus of maintenance reports." Proceedings of the Seventh International Workshop on Controlled Natural Language (CNL 2020/21). 2021.

Abstract: We introduce a method for tailoring a controlled language out of a specialized language corpus, as well as for training the user to ensure a smooth transition between the specialized and the controlled language. Our method is based on the selection of maximal coverage syntax rules. The number of rules chosen is a naturalness vs. formality parameter of the controlled language. We introduce a training tool that displays segmentation into left-to-right maximal parsed sentences and allows utterance modification by the user until a complete parse is achieved. We have applied our method to a French corpus of maintenance reports of boilers in a thermal power station and provide coverage and segmentation results.