Methods of sentence extraction, abstraction and ordering for automatic text summarization

Nayeem, Mir Tafseer
University of Lethbridge. Faculty of Arts and Science
Lethbridge, Alta. : University of Lethbridge, Department of Mathematics and Computer Science
In this thesis, we develop several techniques for both extractive and abstractive text summarization. We implement a rank-based extractive sentence selection algorithm. To achieve pure sentence abstraction, we propose several novel techniques that jointly perform sentence compression, fusion, and paraphrasing at the sentence level. We also model abstractive compression generation as a sequence-to-sequence (seq2seq) problem using an encoder-decoder framework. Furthermore, we apply our sentence abstraction techniques to multi-document abstractive text summarization. We also propose a greedy sentence ordering algorithm that maintains summary coherence and thereby improves readability. We introduce an optimal solution to the summary length limit problem. Our experiments demonstrate that these methods bring significant improvements over state-of-the-art methods. At the end of the thesis, we also introduce a new concept called "Reader Aware Summary", which generates summaries for specific critical readers (e.g., non-native readers).
automatic text summarization, multi-document text summarization, neural paraphrastic compression, sentence abstraction, sequence-to-sequence
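
The abstract mentions a greedy sentence ordering algorithm but does not describe it. As a rough illustration of what a greedy ordering step can look like, the sketch below keeps the first sentence in place and then repeatedly appends the remaining sentence most similar to the one placed last. The function names `jaccard` and `greedy_order`, the word-overlap similarity measure, and the choice of the first sentence as the starting point are all assumptions made for this example; they are not taken from the thesis.

```python
def jaccard(a, b):
    """Word-overlap (Jaccard) similarity between two sentences.

    An illustrative similarity measure, not the one used in the thesis.
    """
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def greedy_order(sentences, similarity=jaccard):
    """Hypothetical greedy ordering: start from the first sentence, then
    repeatedly append the remaining sentence most similar to the last one."""
    if not sentences:
        return []
    remaining = list(sentences)
    ordered = [remaining.pop(0)]
    while remaining:
        last = ordered[-1]
        # max() returns the first maximal item, so ties keep input order.
        best = max(remaining, key=lambda s: similarity(last, s))
        remaining.remove(best)
        ordered.append(best)
    return ordered


if __name__ == "__main__":
    summary = [
        "the cat sat",
        "dogs bark loudly",
        "the cat slept",
        "loud dogs bark",
    ]
    for sentence in greedy_order(summary):
        print(sentence)
```

A greedy pass like this is quadratic in the number of sentences and only looks one step ahead, so it trades global optimality for simplicity; any similarity function with the same two-argument shape can be passed in place of `jaccard`.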