DOI: https://doi.org/10.20535/SRIT.2308-8893.2019.4.07

Abstractive text summarization methods: a literature review

D. V. Shypik, Petro I. Bidyuk

Abstract

This paper reviews the literature on abstractive text summarization methods and considers how these methods are classified. Since text summarization methods first appeared in the 1950s, techniques for producing summaries have improved steadily, but because abstractive summarization requires powerful techniques for text processing and generation, the greatest progress has been made in recent years. The current rapid development of both natural language processing in general and automatic summarization in particular makes an analysis of progress in this area especially necessary. The review gives an overview of both earlier and the most recent approaches, with explanations of the methods involved. It also presents quantitative evaluations of the methods proposed in the reviewed sources.

Keywords

natural language processing; text summarization; abstractive text summarization; sequence-to-sequence models

Full text:

PDF (English)

