Combining state-of-the-art models for multi-document summarization using maximal marginal relevance

Adams, David
University of Lethbridge. Faculty of Arts and Science
Lethbridge, Alta. : University of Lethbridge, Dept. of Mathematics and Computer Science
In Natural Language Processing, multi-document summarization (MDS) poses many challenges to researchers. While advances in deep learning have produced several language models capable of summarization, the variety of approaches specific to multi-document summarization remains relatively limited. Current state-of-the-art models produce impressive results on multi-document datasets, but it remains an open question whether combining these models can yield further improvements. This question is particularly relevant in few-shot and zero-shot applications, in which models have little or no familiarity, respectively, with the expected output. To explore one potential method, we implement a query-relevance-focused approach that combines the pretrained models' outputs using maximal marginal relevance (MMR). Our MMR-based approach improves on some aspects of the current state-of-the-art results while preserving overall state-of-the-art performance, with larger improvements occurring in fewer-shot contexts.
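To make the combination step concrete, the following is a minimal sketch of greedy MMR selection over candidate sentences pooled from several models' summaries. It is not the thesis's implementation: the bag-of-words cosine similarity is a simple stand-in for whatever similarity measure the work actually uses, and the names `bow`, `cosine`, `mmr_select`, `k`, and `lam` are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (assumed stand-in for a real encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two Counter vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(candidates, query, k=3, lam=0.7):
    """Greedily pick up to k candidates, trading query relevance against
    redundancy with what has already been selected.

    lam = 1.0 ranks by relevance only; smaller values penalize redundancy more.
    """
    q = bow(query)
    vecs = [bow(c) for c in candidates]
    relevance = [cosine(v, q) for v in vecs]
    selected = []
    remaining = list(range(len(candidates)))

    while remaining and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to any already-chosen item.
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return [candidates[i] for i in selected]


if __name__ == "__main__":
    # Hypothetical sentences pooled from two models' output summaries.
    pooled = [
        "flood damage in the city",
        "flood damage in the city",
        "shelters opened for residents",
    ]
    print(mmr_select(pooled, "flood damage shelters", k=2, lam=0.5))
```

With a balanced `lam`, the duplicated sentence is penalized after its first copy is chosen, so the less relevant but novel sentence is selected next; with `lam=1.0` the selection reduces to plain relevance ranking and the duplicate is kept.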
Research Subject Categories::TECHNOLOGY::Information technology::Computer science::Computer science, natural language processing, summarization, multi-document summarization, maximal marginal relevance, machine learning, Artificial intelligence, Automatic abstracting, Electronic information resources -- Abstracting and indexing, Information storage and retrieval systems, Natural language processing, Selective dissemination of information