Abstractive multi-document summarization - paraphrasing and compressing with neural networks
dc.contributor.author | Egonmwan, Elozino Ofualagba | |
dc.contributor.author | University of Lethbridge. Faculty of Arts and Science | |
dc.contributor.supervisor | Chali, Yllias | |
dc.date.accessioned | 2021-01-15T17:33:03Z | |
dc.date.available | 2021-01-15T17:33:03Z | |
dc.date.issued | 2020 | |
dc.degree.level | Ph.D. | en_US |
dc.description.abstract | This thesis presents studies in neural text summarization for single and multiple documents. The focus is on using sentence paraphrasing and compression to generate fluent summaries, especially in multi-document summarization, where training data is scarce. A novel solution is to use transfer learning from downstream tasks with an abundance of data. For this purpose, we pre-train three models, one each for extractive summarization, paraphrase generation and sentence compression. We find that the summarization datasets CNN/DM and NEWSROOM contain a number of noisy samples, and we therefore present a method for automatically filtering out this noise. We combine the representational power of the GRU-RNN and TRANSFORMER encoders in our paraphrase generation model. In training our sentence compression model, we investigate the impact of using different early-stopping criteria, such as embedding-based cosine similarity and F1. We utilize the pre-trained models (ours, GPT2 and T5) in different settings for single and multi-document summarization. | en_US |
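The abstract mentions embedding-based cosine similarity as one early-stopping criterion for the sentence compression model. The sketch below is a minimal illustration of that general idea only; it is not the thesis's actual code, and the function names, the `patience` value, and the use of plain NumPy vectors as stand-ins for sentence embeddings are all assumptions made for this example.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity between two sentence-embedding vectors (small epsilon avoids division by zero).
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def validation_score(pred_embs: list[np.ndarray], ref_embs: list[np.ndarray]) -> float:
    # Mean cosine similarity between embeddings of generated compressions and their references.
    return float(np.mean([cosine_similarity(p, r) for p, r in zip(pred_embs, ref_embs)]))

def should_stop(history: list[float], patience: int = 3) -> bool:
    # Stop training when the validation score has not improved for `patience` consecutive epochs.
    if len(history) <= patience:
        return False
    return max(history[-patience:]) <= max(history[:-patience])
```

In use, one would append the epoch's `validation_score` to `history` after each epoch and halt training once `should_stop(history)` returns True; an F1-based criterion, the other option named in the abstract, would simply substitute a different scoring function.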
dc.description.sponsorship | SGS Tuition Award; Alberta Innovates Technology Futures (AITF) | en_US |
dc.identifier.uri | https://hdl.handle.net/10133/5827 | |
dc.language.iso | en_US | en_US |
dc.proquest.subject | Computer science [0984] | en_US |
dc.proquest.subject | Computer engineering [0464] | en_US |
dc.proquest.subject | Artificial intelligence [0800] | en_US |
dc.proquestyes | Yes | en_US |
dc.publisher | Lethbridge, Alta. : University of Lethbridge, Dept. of Mathematics and Computer Science | en_US |
dc.publisher.department | Department of Mathematics and Computer Science | en_US |
dc.publisher.faculty | Arts and Science | en_US |
dc.relation.ispartofseries | Thesis (University of Lethbridge. Faculty of Arts and Science) | en_US |
dc.subject | Artificial intelligence | en_US |
dc.subject | Automatic programming (Computer science) | en_US |
dc.subject | Computer programming | en_US |
dc.subject | Dissertations, Academic | en_US |
dc.subject | Machine learning | en_US |
dc.subject | Natural language processing (Computer science) | en_US |
dc.subject | Neural networks (Computer science) | en_US |
dc.title | Abstractive multi-document summarization - paraphrasing and compressing with neural networks | en_US |
dc.type | Thesis | en_US |