Zaman, Farooq, Kamiran, Faisal, Shardlow, Matthew ORCID: https://orcid.org/0000-0003-1129-2750, Hassan, Saeed Ul, Karim, Asim and Aljohani, Naif Radi (2024) SATS: simplification aware text summarization of scientific documents. Frontiers in Artificial Intelligence, 7. 1375419. ISSN 2624-8212
Published Version
Available under License Creative Commons Attribution.
Abstract
Simplifying summaries of scholarly publications has been a popular method for conveying scientific discoveries to a broader audience. While text summarisation aims to shorten long documents, simplification seeks to reduce the complexity of a document. To accomplish these tasks collectively, there is a need to develop machine learning methods to shorten and simplify longer texts. This paper presents a new Simplification Aware Text Summarisation model (SATS) based on future n-gram prediction. The proposed SATS model extends ProphetNet, a text summarisation model, by enhancing the objective function using a word frequency lexicon for simplification tasks. We have evaluated the performance of SATS on a recently published text summarisation and simplification corpus consisting of 5400 scientific article pairs. Our results in terms of automatic evaluation demonstrate that SATS outperforms state-of-the-art models for simplification, summarisation and joint simplification-summarisation across two datasets on ROUGE, SARI and CSS1. We also provide human evaluation of summaries generated by the SATS model. We evaluated 100 summaries from 8 annotators for grammar, coherence, consistency, fluency, and simplicity. The average human judgment for all evaluated dimensions lies between 4.0 and 4.5 on a scale from 1 to 5 where 1 means low and 5 means high.