1- Department of Computer Engineering, Amirkabir University of Technology, Tehran, Iran
2- Department of Computer Engineering, Amirkabir University of Technology, Tehran, Iran, hzeinali@aut.ac.ir
Abstract:
BERT-based models are widely used for NLP tasks, yet how best to exploit the knowledge encoded in the different layers of BERT remains an open question. In this paper, we introduce and compare several architectures that combine the hidden layers of BERT for text classification, with a specific focus on Persian social media. We evaluate them on sentiment analysis and stance detection using Persian tweet datasets. This work is the first investigation into how different neural network architectures, applied to combinations of BERT hidden layers, affect Persian text classification. The experimental results show that the proposed approaches can surpass vanilla BERT, which applies an MLP classifier to the output of the CLS token, in both classification performance and generalization.
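As a rough illustration of the contrast drawn above, the sketch below compares a baseline feature (the CLS vector of the final layer) with one possible way of combining hidden layers (a softmax-weighted sum of the per-layer CLS vectors). The abstract does not say which combinations the paper uses, so the weighted sum, the function names, the random stand-in for the BERT hidden states, and the single linear layer standing in for the MLP classifier are all assumptions made for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def cls_last_layer(hidden_states):
    """Baseline feature: CLS vector (token 0) of the final layer."""
    return hidden_states[-1, 0, :]


def cls_weighted_layers(hidden_states, layer_logits):
    """Hypothetical combination: softmax-weighted sum of per-layer CLS vectors."""
    weights = softmax(layer_logits)          # shape (num_layers,)
    cls_per_layer = hidden_states[:, 0, :]   # shape (num_layers, hidden)
    return weights @ cls_per_layer           # shape (hidden,)


def classify(features, W, b):
    """Single linear layer standing in for the MLP head; returns class probabilities."""
    return softmax(features @ W + b)


# Random stand-in for BERT hidden states, shape (num_layers, seq_len, hidden).
# All sizes here are illustrative assumptions, not values from the paper.
rng = np.random.default_rng(0)
num_layers, seq_len, hidden, num_classes = 13, 8, 16, 3
hidden_states = rng.normal(size=(num_layers, seq_len, hidden))

# In a trained model these logits would be learned; zeros give uniform weights.
layer_logits = np.zeros(num_layers)
W = rng.normal(size=(hidden, num_classes))
b = np.zeros(num_classes)

baseline_probs = classify(cls_last_layer(hidden_states), W, b)
combined_probs = classify(cls_weighted_layers(hidden_states, layer_logits), W, b)
print(baseline_probs.shape, combined_probs.shape)
```

With uniform weights the combined feature is simply the mean of the per-layer CLS vectors. Pushing all of the weight onto the last layer recovers the baseline, so the baseline is a special case of this particular combination.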