Abstractive multi-document summarization - paraphrasing and compressing with neural networks

dc.contributor.author: Egonmwan, Elozino Ofualagba
dc.contributor.author: University of Lethbridge. Faculty of Arts and Science
dc.contributor.supervisor: Chali, Yllias
dc.date.accessioned: 2021-01-15T17:33:03Z
dc.date.available: 2021-01-15T17:33:03Z
dc.date.issued: 2020
dc.degree.level: Ph.D (en_US)
dc.description.abstract: This thesis presents studies in neural text summarization for single and multiple documents. The focus is on using sentence paraphrasing and compression for generating fluent summaries, especially in multi-document summarization where there is data paucity. A novel solution is to use transfer-learning from downstream tasks with an abundance of data. For this purpose, we pre-train three models for each of extractive summarization, paraphrase generation and sentence compression. We find that summarization datasets – CNN/DM and NEWSROOM – contain a number of noisy samples. Hence, we present a method for automatically filtering out this noise. We combine the representational power of the GRU-RNN and TRANSFORMER encoders in our paraphrase generation model. In training our sentence compression model, we investigate the impact of using different early-stopping criteria, such as embedding-based cosine similarity and F1. We utilize the pre-trained models (ours, GPT2 and T5) in different settings for single and multi-document summarization. (en_US)
dc.description.sponsorship: SGS Tuition Award Alberta Innovates Technology Futures (AITF) (en_US)
dc.identifier.uri: https://hdl.handle.net/10133/5827
dc.language.iso: en_US (en_US)
dc.proquest.subject: Computer science [0984] (en_US)
dc.proquest.subject: Computer engineering [0464] (en_US)
dc.proquest.subject: Artificial intelligence [0800] (en_US)
dc.proquestyes: Yes (en_US)
dc.publisher: Lethbridge, Alta. : University of Lethbridge, Dept. of Mathematics and Computer Science (en_US)
dc.publisher.department: Department of Mathematics and Computer Science (en_US)
dc.publisher.faculty: Arts and Science (en_US)
dc.relation.ispartofseries: Thesis (University of Lethbridge. Faculty of Arts and Science) (en_US)
dc.subject: Artificial intelligence (en_US)
dc.subject: Automatic programming (Computer science) (en_US)
dc.subject: Computer programming (en_US)
dc.subject: Dissertations, Academic (en_US)
dc.subject: Machine learning (en_US)
dc.subject: Natural language processing (Computer science) (en_US)
dc.subject: Neural networks (Computer science) (en_US)
dc.title: Abstractive multi-document summarization - paraphrasing and compressing with neural networks (en_US)
dc.type: Thesis (en_US)
Files
Original bundle
Name: Egonmwan_Elozino_PhD_2020.pdf
Size: 654.26 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 3.25 KB
Format: Item-specific license agreed upon to submission