Applied and Computational Engineering

- The Open Access Proceedings Series for Conferences


Proceedings of the 3rd International Conference on Signal Processing and Machine Learning

Series Vol. 5, 31 May 2023


Open Access | Article

Survey on abstractive text summarization using pretraining models and their developments

Yixin Zhang * 1
1 University of Manchester, F167, The Quadrangle, 1 Lower Ormond Street, Manchester M1 5QF, United Kingdom; 3003 Yipinyaju, Luoyang, Henan, China

* Author to whom correspondence should be addressed.

Applied and Computational Engineering, Vol. 5, 109-117
Published 31 May 2023. © 2023 The Author(s). Published by EWA Publishing
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Citation Yixin Zhang. Survey on abstractive text summarization using pretraining models and their developments. ACE (2023) Vol. 5: 109-117. DOI: 10.54254/2755-2721/5/20230543.

Abstract

In recent years, pre-training models have gained considerable attention in the area of summary generation and have demonstrated new possibilities for improving the sequence-to-sequence attention framework. This survey provides a comprehensive overview of BERT-based pre-training models that can be used for abstractive summarization. First, the BERT model is introduced as a typical pre-training model, followed by the baseline models it inspired. The problems and subsequent developments of these models are then discussed, including some recent state-of-the-art (SOTA) approaches. In addition, the datasets commonly used to train and evaluate such models are presented along with their main features, and the commonly used evaluation methods are introduced. Finally, several potential research directions are suggested.
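The workflow surveyed here can be illustrated with a minimal sketch. The Python snippet below is not from the original article; it assumes the Hugging Face transformers and Google rouge-score packages, and the model name is purely illustrative. It generates an abstractive summary with a pre-trained sequence-to-sequence model and scores it with ROUGE, the evaluation metric family commonly used in this line of work.

# Minimal sketch (assumptions: Hugging Face `transformers` and `rouge-score`
# are installed; the BART checkpoint below is one illustrative choice).
from transformers import pipeline
from rouge_score import rouge_scorer

# Load a pre-trained encoder-decoder summarizer (BART-style model).
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

document = (
    "Pre-training models such as BERT learn general language representations "
    "from large corpora and are then fine-tuned on downstream tasks, "
    "including abstractive text summarization."
)
reference = "Pre-trained language models are fine-tuned for abstractive summarization."

# Generate an abstractive summary of the input document.
summary = summarizer(document, max_length=40, min_length=10, do_sample=False)[0]["summary_text"]

# Score the generated summary against a reference summary with ROUGE-1/2/L.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
print(summary)
print(scorer.score(reference, summary))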

Keywords

pre-training model, abstractive summary, natural language processing, BERT, transformer


Data Availability

The datasets used and/or analyzed during the current study are available from the authors upon reasonable request.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Authors who publish with this series agree to the following terms:

1. Authors retain copyright and grant the series right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.

2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.

3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See Open Access Instruction).

Volume Title
Proceedings of the 3rd International Conference on Signal Processing and Machine Learning
ISBN (Print)
978-1-915371-57-7
ISBN (Online)
978-1-915371-58-4
Published Date
31 May 2023
Series
Applied and Computational Engineering
ISSN (Print)
2755-2721
ISSN (Online)
2755-273X
DOI
10.54254/2755-2721/5/20230543
Copyright
© 2023 The Author(s)
Open Access
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited
