{"id":464,"date":"2021-12-10T11:14:12","date_gmt":"2021-12-10T10:14:12","guid":{"rendered":"https:\/\/pageperso.lis-lab.fr\/bernard.espinasse\/nouveausite\/?page_id=464"},"modified":"2022-01-09T23:26:59","modified_gmt":"2022-01-09T22:26:59","slug":"these-salima-lamsiyah","status":"publish","type":"page","link":"https:\/\/pageperso.lis-lab.fr\/bernard.espinasse\/index.php\/these-salima-lamsiyah\/","title":{"rendered":"These Salima Lamsiyah"},"content":{"rendered":"<p><strong><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-434 alignleft\" src=\"https:\/\/pageperso.lis-lab.fr\/bernard.espinasse\/wp-content\/uploads\/2021\/12\/Salima-these-2.png\" alt=\"\" width=\"124\" height=\"122\" \/><span style=\"font-size: 14pt;\">Salima Lamsiyah, <\/span><\/strong><span style=\"font-size: 14pt;\">\u00ab\u00a0Deep Learning-Based Unsupervised Extractive Methods for Multi-Document Summarization\u00a0\u00bb (defended in 2021)<\/span><\/p>\n<p><span style=\"font-size: 14pt;\"><a href=\"https:\/\/pageperso.lis-lab.fr\/bernard.espinasse\/nouveausite\/wp-content\/uploads\/2021\/12\/These-S.Lamsiyah-2021.pdf\" target=\"_blank\" rel=\"noopener\">Manuscript<\/a><\/span><\/p>\n<p><span style=\"font-size: 14pt;\">S. Lamsiyah&rsquo;s thesis addresses unsupervised extractive multi-document summarization based on deep learning: first the Generic Multi-Document Summarization (G-MDS) task, with a centroid-based approach and various sentence embedding representations, then by exploiting transfer learning from the fine-tuning of BERT (Bidirectional Encoder Representations from Transformers) on natural language understanding tasks for sentence representation learning. For the Query-Focused Multi-Document Summarization (QF-MDS) task, it proposes an unsupervised extractive method based on transfer learning from pre-trained sentence embedding models, combined with the BM25 model and the maximal marginal relevance criterion.<\/span><\/p>\n<p><span style=\"font-size: 14pt;\"><em>The thesis of Salima Lamsiyah (defended in 2021) focuses on extractive ATS (Automatic Text Summarization) systems, and more specifically on the Generic Multi-Document Summarization (G-MDS) and Query-Focused Multi-Document Summarization (QF-MDS) tasks. G-MDS systems generate summaries that represent all relevant facts of the source documents without considering the users\u2019 information needs. In contrast, QF-MDS systems produce summaries whose content is derived from the user\u2019s information need, i.e. the user\u2019s query. Our main objective is to develop robust and effective systems for both the G-MDS and QF-MDS tasks that require no domain knowledge. We therefore propose four contributions to address this issue and improve the performance of unsupervised extractive multi-document summarization. In the first contribution, we propose an unsupervised extractive method for G-MDS based on the centroid approach and sentence embedding representations. We improve sentence scoring by combining three metrics: sentence content relevance, sentence novelty, and sentence position. Moreover, we provide a comparative analysis of nine sentence embedding models used to represent sentences as dense vectors in a low-dimensional vector space in the context of extractive multi-document summarization. In the second contribution, we improve the aforementioned G-MDS method by leveraging transfer learning from BERT fine-tuning on Natural Language Understanding tasks for sentence representation learning. 
Specifically, we fine-tune BERT on supervised intermediate tasks from the GLUE benchmark using single-task and multi-task fine-tuning. Then, we transfer the learned knowledge to our summarization task. In the third contribution, we propose an unsupervised extractive QF-MDS method based on transfer learning from pre-trained sentence embedding models, the BM25 model, and the maximal marginal relevance criterion. We combine the BM25 model with semantic similarity to select a subset of sentences based on their relevance to the query. Moreover, we incorporate sentence embedding representations into the maximal marginal relevance method to re-rank the candidate sentences, maintaining query relevance while minimizing redundancy. In the last contribution, we explore the potential of the recent pre-trained Sentence-BERT (SBERT) model, based on the Siamese network structure and a fine-tuning mechanism, to boost the performance of the extractive QF-MDS task.<\/em><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Salima Lamsiyah, \u00ab\u00a0Deep Learning-Based Unsupervised Extractive Methods for Multi-Document Summarization\u00a0\u00bb (defended in 2021) Manuscript S. Lamsiyah&rsquo;s thesis addresses unsupervised extractive multi-document summarization based on deep learning: first the Generic Multi-Document Summarization (G-MDS) task, with a centroid-based approach and various sentence embedding representations, then by exploiting transfer &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/pageperso.lis-lab.fr\/bernard.espinasse\/index.php\/these-salima-lamsiyah\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\"> \u00ab\u00a0These Salima Lamsiyah\u00a0\u00bb<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_crdt_document":"","footnotes":""},"class_list":["post-464","page","type-page","status-publish","hentry","entry"],"_links":{"self":[{"href":"https:\/\/pageperso.lis-lab.fr\/bernard.espinasse\/index.php\/wp-json\/wp\/v2\/pages\/464","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pageperso.lis-lab.fr\/bernard.espinasse\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/pageperso.lis-lab.fr\/bernard.espinasse\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/pageperso.lis-lab.fr\/bernard.espinasse\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/pageperso.lis-lab.fr\/bernard.espinasse\/index.php\/wp-json\/wp\/v2\/comments?post=464"}],"version-history":[{"count":7,"href":"https:\/\/pageperso.lis-lab.fr\/bernard.espinasse\/index.php\/wp-json\/wp\/v2\/pages\/464\/revisions"}],"predecessor-version":[{"id":1197,"href":"https:\/\/pageperso.lis-lab.fr\/bernard.espinasse\/index.php\/wp-json\/wp\/v2\/pages\/464\/revisions\/1197"}],"wp:attachment":[{"href":"https:\/\/pageperso.lis-lab.fr\/bernard.espinasse\/index.php\/wp-json\/wp\/v2\/me
dia?parent=464"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}
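
The abstract's first contribution combines three sentence-scoring metrics (content relevance to a centroid, novelty, and position). As a minimal illustrative sketch only: the greedy loop, the weights, and all function names below are assumptions for exposition, not taken from the thesis, and toy dense vectors stand in for real sentence embeddings.

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid_summarize(embeddings, k, w_rel=0.6, w_nov=0.3, w_pos=0.1):
    """Greedily pick k sentence indices by combining three scores:
    relevance to the centroid of all sentence embeddings, novelty
    w.r.t. already-selected sentences, and a position prior."""
    n = len(embeddings)
    dim = len(embeddings[0])
    centroid = [sum(e[d] for e in embeddings) / n for d in range(dim)]
    selected = []
    while len(selected) < min(k, n):
        best, best_score = None, -math.inf
        for i in range(n):
            if i in selected:
                continue
            relevance = cosine(embeddings[i], centroid)
            # Novelty: 1 minus max similarity to any selected sentence.
            novelty = 1.0 - max(
                (cosine(embeddings[i], embeddings[j]) for j in selected),
                default=0.0)
            position = 1.0 / (1.0 + i)  # earlier sentences preferred
            score = w_rel * relevance + w_nov * novelty + w_pos * position
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return sorted(selected)  # document order for the final summary
```

With three toy 2-d "embeddings" where the first two sentences are near-duplicates, the novelty term makes the selection skip the redundant second sentence: `centroid_summarize([[1, 0], [1, 0.1], [0, 1]], 2)` returns `[0, 2]`.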
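
The third contribution re-ranks candidate sentences with maximal marginal relevance (MMR) over sentence embeddings, trading query relevance against redundancy. The sketch below shows only the generic MMR criterion under assumed names and a toy cosine setup; the thesis's actual combination with BM25 retrieval is not reproduced here.

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mmr_rank(query_emb, cand_embs, k, lam=0.7):
    """Maximal Marginal Relevance: iteratively pick the candidate
    maximizing lam * sim(candidate, query)
              - (1 - lam) * max sim(candidate, already picked),
    i.e. stay relevant to the query while avoiding redundancy."""
    picked = []
    while len(picked) < min(k, len(cand_embs)):
        best, best_score = None, -math.inf
        for i, emb in enumerate(cand_embs):
            if i in picked:
                continue
            relevance = cosine(emb, query_emb)
            redundancy = max(
                (cosine(emb, cand_embs[j]) for j in picked),
                default=0.0)
            score = lam * relevance - (1 - lam) * redundancy
            if score > best_score:
                best, best_score = i, score
        picked.append(best)
    return picked  # in selection order
```

With a low lambda, diversity dominates: for a query vector `[1, 0]` and candidates `[[1, 0], [1, 0.05], [0.6, 0.8]]`, `mmr_rank(..., 2, lam=0.4)` returns `[0, 2]`, skipping the near-duplicate of the first pick even though it is more query-relevant.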