Multi-Method Text Summarization: An Evaluation of Extractive and BART-Based Approaches on CNN/Daily Mail


İnal Y., Bakal M. G., Eşit M.

2025 9th International Symposium on Innovative Approaches in Smart Technologies (ISAS), Gaziantep, Türkiye, 27-28 June 2025, pp. 1-7 (Full-Text Paper)

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI: 10.1109/isas66241.2025.11101791
  • City of Publication: Gaziantep
  • Country of Publication: Türkiye
  • Pages: pp. 1-7
  • Affiliated with Abdullah Gül University: Yes

Abstract

With the exponential growth of digital content, efficient text summarization has become increasingly important for managing information overload. This paper presents a comprehensive approach to text summarization using both extractive and abstractive methods, implemented on the CNN/Daily Mail dataset. We fine-tune pre-trained BART (Bidirectional and Auto-Regressive Transformers) models to generate high-quality summaries. Our approach demonstrates significant improvements, with our best model, trained on 287K samples, achieving a ROUGE-1 F1 score of 0.4174, a ROUGE-2 F1 score of 0.1932, and a ROUGE-L F1 score of 0.2910. We provide detailed comparisons between extractive methods and various BART model configurations, analyzing how training dataset size and model architecture affect summarization quality. Additionally, we release our implementation as an open-source NLP toolkit to facilitate further research and practical applications in the field.
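For readers unfamiliar with the metrics reported in the abstract, the following is a minimal sketch of how ROUGE-N and ROUGE-L F1 scores can be computed. It is not the authors' toolkit or evaluation code. It assumes lowercased whitespace tokenization and a plain F1 (equal weight on precision and recall); standard ROUGE implementations typically add stemming and other preprocessing, so their scores may differ. All function names here are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n_f1(candidate, reference, n=1):
    """ROUGE-N F1 from clipped n-gram overlap (assumed tokenization: split on whitespace)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter intersection clips repeated n-grams
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


if __name__ == "__main__":
    generated = "the cat sat on the mat"
    gold = "the cat lay on the mat"
    print(rouge_n_f1(generated, gold, n=1))
    print(rouge_n_f1(generated, gold, n=2))
    print(rouge_l_f1(generated, gold))
```

In this toy pair, five of six unigrams and three of five bigrams match, which illustrates why ROUGE-2 scores are usually well below ROUGE-1 scores for the same summary.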