My research encompasses the automatic understanding of high-level information in multimodal documents and multimodal interactions. My work lies at the intersection of natural language processing, spoken language processing, machine learning and, more recently, computer vision. My objective is to enable computers to understand complex natural interactions between human beings. To reach that objective, I pursue three complementary areas of research: semantic content analysis and modeling, multimodal system fusion, and rigorous evaluation of related technologies in realistic applications.


  • Manon Scholivet, 2017+, "Multilingual representations for NLP", ED-184
  • Simone Fuscone, 2017+, "Linguistic Underpinnings of Conversational Interpersonal Dynamics", DOC2AMU
  • Jeremy Auguste, 2016+, "Learning conversation representations", ANR DATCHA
  • Sébastien Delecraz, 2015+, "Multimodal understanding", AMU-DGA
  • Thibault Magallon, 2015+, "OCR and document structuring", CIFRE Numericompta
  • Olivier Michalon, 2013-2017, "Semantic parsing of French", ANR ASFALDA
  • Jérémie Tafforeau, 2013-2017, "Adaptation of NLP pipeline", FP7 SENSEI
  • Jérémy Trione, 2013-2017, "Summarization of human-human conversations", FP7 SENSEI


PARSEME-FR: syntactic parsing and multiword expressions in French

Multiword expressions break the general hypothesis that words can be considered as basic units in natural language processing applications. In this project we tackle the definition, annotation, detection and integration of MWEs in French texts, in conjunction with the Parseme COST (

HOMEOSTASIS: art and sciences

In this project, we explore jointly with the PULSO dance company how language interactions between humans and machines can fuel artistic creation. A first show, Homeostasis, showcases how failure in speech input capabilities can encourage artists to improvise and ensure continuity in the artistic creation process. More info can be found at

DATCHA: analysing customer care chats

Current customer care analytics solutions are often limited to low-level analysis. In this project, we want to explore the relation between discourse and semantic analysis of customer care textual chats. Our hypothesis is that both components can be combined for better robustness. The work will be evaluated by Orange according to task-oriented metrics.

SENSEI: making sense of human-human interactions

The SENSEI project aims at leveraging natural language processing for structuring and summarizing human-human conversations. The project focuses on two use cases: call-centre conversations and comments on news articles on the web. For both use cases, we create NLP pipelines up to semantic and para-semantic analysis. Our group explores domain and cross-language adaptation with deep learning methods, as well as abstractive summarization approaches for conversations. The SENSEI project also strives to perform an ecological evaluation of the proposed technologies.

ADNVideo: multimodal video recommendation

This is a technology-transfer project towards industry. It develops robust approaches to the semantic analysis of videos, on which recommendation algorithms can be based. The application domain is advertisement recommendation on user-created videos. Technologies created in the PERCOL project are being ported to the platform developed by Kalizee.

ORFEO: annotated corpus of spoken and written French

The ORFEO project aims at creating resources to support the next level of humanities research based on French corpora. In this project, we have gathered a wide range of French text and speech transcript corpora, harmonized existing annotations, and trained NLP systems to generate missing or higher-level annotations. The corpora will be accessible through various query mechanisms, and annotations will be correctable in a wiki-like fashion. I am in particular in charge of sentence-like unit segmentation in speech corpora, and we also use the macaon platform for generating analyses up to syntactic parses.

ASFALDA: semantic analysis of French

The objective of the Asfalda project is to explore open-domain surface semantic analysis through the creation of French FrameNet full-text annotated corpora and parsers. We currently explore fast adaptation of parsers, exogenous data integration, and the relation between deep syntax and surface semantics for building better semantic parsers.

Past projects

ThunderBOLT: speech-to-speech translation with clarification dialogs

Typical speech-to-speech machine translation systems have limited domain knowledge and static models. The mistakes they make when transcribing and translating speech usually harm the conversation to the point of failure. In this project, we proposed to enhance a speech-to-speech MT system with clarification dialogs that can recover from system errors. ASR and MT errors are detected and characterized so that a dialog system can ask targeted questions to solicit clarifications from the user. These clarifications can then be integrated into the system's knowledge in order to fix the system output and to learn how to prevent those mistakes in the future.

PERCOL: person recognition in broadcast videos

This project addresses the problem of person identification in videos for indexing and information retrieval applications. Since maintaining biometric models for face recognition and speaker identification for a large number of persons is not realistic, we have focused on acquiring identities on the fly using displayed and pronounced names, as well as role and scene analysis. These aspects enabled the PERCOL team to win the Defi Repere challenge.

DECODA: call-centre conversation analysis

This project focuses on speech analytics from call centre recordings.

SEQUOIA: syntactic parsing of French

Syntactic parsing has become a basic building block in NLP pipelines; this project aimed at bringing to French the benefits of the latest advances in statistical parsing.

PORTMEDIA: robustness, portability of semantic parsing in dialog systems

This project addressed the problem of porting dialog systems across languages or domains with minimal effort. I joined this project while a postdoc at LIUM, and worked on concept detection and ASR.

CALO: cognitive assistant that learns and organizes

The objective of the CALO project was to create an intelligent organizer that could help with meetings, documents, etc. I participated in this project while at ICSI and mainly worked on meeting summarization.

  • Relevant papers
    • Gokhan Tur, Andreas Stolcke, Lynn Voss, John Dowding, Benoit Favre, Raquel Fernandez, Matthew Frampton, Michael Frandsen, Clint Frederickson, Martin Graciarena, Dilek Hakkani-Tür, Donald Kintzing, Kyle Leveque, Shane Mason, John Niekrasz, Stanley Peters, Matthew Purver, Korbinian Riedhammer, Elizabeth Shriberg, Jing Tien, Dimitra Vergyri, Fan Yang, "The CALO Meeting Assistant System", IEEE Transactions on Audio, Speech and Language Processing, 2010
    • Gokhan Tur, Andreas Stolcke, Lynn Voss, John Dowding, Benoit Favre, Raquel Fernandez, Matthew Frampton, Michael Frandsen, Clint Frederickson, Martin Graciarena, Dilek Hakkani-Tür, Donald Kintzing, Kyle Leveque, Shane Mason, John Niekrasz, Stanley Peters, Matthew Purver, Korbinian Riedhammer, Elizabeth Shriberg, Jing Tien, Dimitra Vergyri, Fan Yang, "The CALO Meeting Speech Recognition and Understanding System", Spoken Language Technologies (SLT), Goa (India), 2008

NightinGALE: distillation from multilingual, multigenre speech and text

The GALE project aimed at accurate information extraction from both broadcast news recordings and forum texts in English, Arabic and Chinese. I contributed to the NightinGALE team through sentence segmentation and punctuation detection in speech.

Last updated on 2021-12-17