Multi-document summarization based on document clustering and neural sentence fusion

dc.contributor.author: Fuad, Tanvir Ahmed
dc.contributor.author: University of Lethbridge. Faculty of Arts and Science
dc.contributor.supervisor: Chali, Yllias
dc.date.accessioned: 2019-02-22T21:36:03Z
dc.date.available: 2019-02-22T21:36:03Z
dc.date.issued: 2018
dc.degree.level: Masters [en_US]
dc.description.abstract: In this thesis, we present a technique for abstractive text summarization that achieves state-of-the-art results, and we propose a novel method to improve multi-document summarization. Two problems motivated the design: the lack of large collections of human-authored multi-document summaries needed to train seq2seq encoder-decoder models, and the inaccuracy of representing multiple long documents as a single fixed-size vector. We therefore design complementary models for two different tasks, sentence clustering and neural sentence fusion. We minimize the risk of producing incorrect facts by feeding a related set of sentences to the encoder. We apply our complementary models to implement a full abstractive multi-document summarization system that simultaneously considers importance, coverage, and diversity under a desired length limit. We conduct extensive experiments on all the proposed models, which bring significant improvements over state-of-the-art methods across different evaluation metrics. [en_US]
dc.description.sponsorship: Natural Sciences and Engineering Research Council (NSERC) of Canada and the University of Lethbridge [en_US]
dc.embargo: No [en_US]
dc.identifier.uri: https://hdl.handle.net/10133/5294
dc.language.iso: en_US [en_US]
dc.proquest.subject: 0984 [en_US]
dc.proquest.subject: 0723 [en_US]
dc.proquest.subject: 0800 [en_US]
dc.proquestyes: Yes [en_US]
dc.publisher: Lethbridge, Alta. : University of Lethbridge, Department of Mathematics and Computer Science [en_US]
dc.publisher.department: Department of Mathematics and Computer Science [en_US]
dc.publisher.faculty: Arts and Science [en_US]
dc.relation.ispartofseries: Thesis (University of Lethbridge. Faculty of Arts and Science) [en_US]
dc.subject: automatic text summarization [en_US]
dc.subject: sentence fusion [en_US]
dc.subject: multi-document summarization [en_US]
dc.subject: text clustering [en_US]
dc.subject: abstractive text summarization [en_US]
dc.subject: tensor2tensor [en_US]
dc.subject: Document clustering [en_US]
dc.subject: Text processing (Computer science) [en_US]
dc.subject: Automatic abstracting [en_US]
dc.subject: Electronic information resources -- Abstracting and indexing [en_US]
dc.subject: Dissertations, Academic [en_US]
dc.title: Multi-document summarization based on document clustering and neural sentence fusion [en_US]
dc.type: Thesis [en_US]
Files
Original bundle
Name: FUAD_TANVIR_AHMED_MSC_2018.pdf
Size: 1.76 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 3.25 KB
Format: Item-specific license agreed upon to submission