Enhancing Text Classification Using BERT: A Transfer Learning Approach

Haider Zaman Khan, Muddasar Naeem, Raffaele Guarasci, Umamah Bint Khalid, Massimo Esposito, Francesco Gargiulo

Abstract


This paper investigates the application of Natural Language Processing (NLP) techniques to enhance the performance of document-level classification tasks. The study focuses on leveraging a Transformer-based Neural Language Model (NLM), specifically BERT, combined with cross-validation to exploit transfer learning for classification tasks. The approach has been tested on two different versions of the widely known 20 Newsgroups benchmark dataset using pre-trained BERT models refined through cross-validation, achieving notable accuracy rates of 92.29% on the preprocessed dataset without noise and 90.08% on the raw filtered dataset. These encouraging results confirm the effectiveness of combining transfer learning, cross-validation, and NLMs in NLP, with particular emphasis on the state-of-the-art performance achieved by pre-trained BERT models in real-world text classification tasks.
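To make the described pipeline concrete, the sketch below shows one plausible way to fine-tune a pre-trained BERT model on 20 Newsgroups with k-fold cross-validation. It is an illustration, not the authors' exact method: the model checkpoint (bert-base-uncased), the use of Hugging Face Transformers and scikit-learn, and all hyperparameters (fold count, epochs, batch size, sequence length) are assumptions.

```python
# Minimal sketch (not the paper's exact pipeline): fine-tuning pre-trained
# BERT on 20 Newsgroups with stratified k-fold cross-validation.
# Library choices and hyperparameters are illustrative assumptions.
import numpy as np
import torch
from sklearn.datasets import fetch_20newsgroups
from sklearn.model_selection import StratifiedKFold
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# "remove" roughly mirrors a "raw filtered" variant (headers/quotes stripped).
data = fetch_20newsgroups(subset="all", remove=("headers", "footers", "quotes"))
texts, labels = data.data, np.array(data.target)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

class NewsDataset(torch.utils.data.Dataset):
    """Tokenizes documents and pairs them with integer class labels."""
    def __init__(self, texts, labels):
        self.enc = tokenizer(texts, truncation=True, padding="max_length",
                             max_length=256)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

accuracies = []
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(skf.split(texts, labels)):
    # Re-initialize from the pre-trained checkpoint for each fold so that
    # every fold starts from the same transfer-learning baseline.
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=20)
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=f"fold{fold}", num_train_epochs=3,
                               per_device_train_batch_size=16,
                               save_strategy="no"),
        train_dataset=NewsDataset([texts[i] for i in train_idx],
                                  labels[train_idx]),
    )
    trainer.train()
    # Evaluate on the held-out fold and record accuracy.
    preds = trainer.predict(NewsDataset([texts[i] for i in val_idx],
                                        labels[val_idx]))
    accuracies.append(
        float((preds.predictions.argmax(-1) == labels[val_idx]).mean()))

print(f"Mean cross-validation accuracy: {np.mean(accuracies):.4f}")
```

Averaging accuracy over the folds, as done above, is the standard way cross-validation yields a single headline figure comparable to the accuracy rates reported in the abstract.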

Keywords


NLMs, Transfer Learning, Text Classification, BERT
