Multi-document summarization based on document clustering and neural sentence fusion
dc.contributor.author | Fuad, Tanvir Ahmed | |
dc.contributor.author | University of Lethbridge. Faculty of Arts and Science | |
dc.contributor.supervisor | Chali, Yllias | |
dc.date.accessioned | 2019-02-22T21:36:03Z | |
dc.date.available | 2019-02-22T21:36:03Z | |
dc.date.issued | 2018 | |
dc.degree.level | Masters | en_US |
dc.description.abstract | In this thesis, we present a technique for abstractive text summarization that achieves state-of-the-art results, and we propose a novel method to improve multi-document summarization. Two problems motivated the design: the lack of large collections of human-authored multi-document summaries needed to train seq2seq encoder-decoder models, and the inaccuracy of representing multiple long documents as a single fixed-size vector. To address them, we designed complementary models for two tasks, sentence clustering and neural sentence fusion. We minimize the risk of producing incorrect facts by feeding a set of related sentences to the encoder as input. We applied our complementary models to implement a full abstractive multi-document summarization system that simultaneously considers importance, coverage, and diversity under a desired length limit. We conducted extensive experiments on all the proposed models, which bring significant improvements over state-of-the-art methods across different evaluation metrics. | en_US |
dc.description.sponsorship | Natural Sciences and Engineering Research Council (NSERC) of Canada and the University of Lethbridge | en_US |
dc.embargo | No | en_US |
dc.identifier.uri | https://hdl.handle.net/10133/5294 | |
dc.language.iso | en_US | en_US |
dc.proquest.subject | 0984 | en_US |
dc.proquest.subject | 0723 | en_US |
dc.proquest.subject | 0800 | en_US |
dc.proquestyes | Yes | en_US |
dc.publisher | Lethbridge, Alta. : University of Lethbridge, Department of Mathematics and Computer Science | en_US |
dc.publisher.department | Department of Mathematics and Computer Science | en_US |
dc.publisher.faculty | Arts and Science | en_US |
dc.relation.ispartofseries | Thesis (University of Lethbridge. Faculty of Arts and Science) | en_US |
dc.subject | automatic text summarization | en_US |
dc.subject | sentence fusion | en_US |
dc.subject | multi-document summarization | en_US |
dc.subject | text clustering | en_US |
dc.subject | abstractive text summarization | en_US |
dc.subject | tensor2tensor | en_US |
dc.subject | Document clustering | en_US |
dc.subject | Text processing (Computer science) | en_US |
dc.subject | Automatic abstracting | en_US |
dc.subject | Electronic information resources -- Abstracting and indexing | en_US |
dc.subject | Dissertations, Academic | en_US |
dc.title | Multi-document summarization based on document clustering and neural sentence fusion | en_US |
dc.type | Thesis | en_US |