A Systematic Review of Hybrid Sarcasm Detection: Fusing Contextual Embeddings with Handcrafted Linguistic Features

Authors

  • Samyak Ingle, P.R. Pote Patil College of Engineering and Management, Amravati
  • Saurabh Aghadate
  • Siddhi S. Pampattiwar
  • Aditi M. Kamble
  • Arati D. Paraskar
  • Prof. Abhishekh R. Ladole

Keywords:

Natural Language Processing (NLP), Sentiment Analysis, Linguistic Features, Hybrid Sarcasm Detection, Feature Engineering, Ensemble Learning, Deep Learning, Contextual Embeddings

Abstract

Sarcasm detection (SD) is a significant challenge in Natural Language Processing (NLP), because sarcastic expressions convey the opposite of their literal meaning and often reverse sentiment polarity. This ambiguity is amplified in text by the absence of non-verbal cues such as tone and facial expressions. While modern transformer models excel at capturing deep context, they often fail to register the explicit rhetorical structures inherent in irony; conversely, traditional feature-based models capture linguistic structure but lack deep semantic understanding. To address these limitations, this paper proposes a novel two-branch Hybrid Contextual-Linguistic Sarcasm Detector (HCL-SD) framework. HCL-SD integrates deep contextual embeddings, derived from fine-tuned RoBERTa/DistilBERT models, with an engineered set of 13 handcrafted linguistic features (such as entropy, readability scores, and part-of-speech counts). This dual-branch approach allows the model to learn implicit semantic incongruity and explicit rhetorical cues simultaneously.
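The fusion step described above can be illustrated with a minimal sketch. It assumes simple concatenation of an embedding vector with handcrafted features; the abstract does not state how the two branches are actually joined. The function names and the five features shown are hypothetical stand-ins, not the paper's 13 features, and the embedding is passed in as a plain list rather than produced by RoBERTa or DistilBERT.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy of the text, in bits."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def handcrafted_features(text: str) -> list[float]:
    """Illustrative surface features (hypothetical, not the paper's feature set)."""
    tokens = text.split()
    upper_tokens = sum(1 for t in tokens if t.isupper() and len(t) > 1)
    return [
        shannon_entropy(text),               # character entropy
        float(len(tokens)),                  # whitespace token count
        float(text.count("!")),              # exclamation marks
        float(text.count("?")),              # question marks
        upper_tokens / max(len(tokens), 1),  # share of all-caps tokens
    ]


def fuse(embedding: list[float], text: str) -> list[float]:
    """Concatenate a contextual embedding with the handcrafted features."""
    return list(embedding) + handcrafted_features(text)


# The two-dimensional embedding is a placeholder for a real sentence vector.
fused = fuse([0.1, 0.2], "Oh GREAT, another meeting!")
print(len(fused))
```

In practice the handcrafted features would usually be scaled before concatenation so that raw counts do not dominate the embedding dimensions; that step is omitted here.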

The resulting fused feature space is classified using an optimized ensemble model that employs a majority voting scheme. Experiments on benchmark datasets, including the News Headlines and MUStARD datasets, demonstrate the framework's strong performance. The proposed approach achieved a state-of-the-art F1-score of 0.997 on the News Headlines dataset. Crucially, integrating contextual metadata proved essential for generalization, raising the cross-domain F1-score on the Reddit validation dataset from 0.70 to 0.92. These results indicate that combining deep contextual comprehension with explicit linguistic feature engineering is key to building robust, efficient, and accurate sarcasm detection systems.
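As a rough illustration of majority voting over a fused feature vector, the sketch below takes hard (label-level) votes from several base classifiers. The abstract does not name the base classifiers or a tie-breaking rule, so the callables here are toy stand-ins and the tie-break (prefer the smaller label, i.e. 0 = non-sarcastic) is an assumption.

```python
from collections import Counter
from typing import Callable, Sequence

# Hypothetical type: any function mapping a feature vector to a 0/1 label.
Classifier = Callable[[Sequence[float]], int]


def majority_vote(predictions: Sequence[int]) -> int:
    """Return the most frequent label; ties go to the smallest label (assumption)."""
    counts = Counter(predictions)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)


def ensemble_predict(classifiers: Sequence[Classifier],
                     features: Sequence[float]) -> int:
    """Collect one hard vote per classifier and take the majority."""
    return majority_vote([clf(features) for clf in classifiers])


# Toy base classifiers that threshold individual features (illustration only).
toy_classifiers = [
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[1] > 0.5),
    lambda x: int(sum(x) > 1.0),
]

print(ensemble_predict(toy_classifiers, [0.9, 0.2, 0.3]))
```

With the input above, the first and third toy classifiers vote 1 and the second votes 0, so the ensemble returns 1. An odd number of base classifiers avoids ties in the binary case.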

Author Biographies

  • Siddhi S. Pampattiwar

    Department of Artificial Intelligence and Data Science

  • Aditi M. Kamble

    Department of Artificial Intelligence and Data Science

  • Arati D. Paraskar

    Department of Artificial Intelligence and Data Science

  • Prof. Abhishekh R. Ladole

    Assistant Professor and Co-Author, Department of Artificial Intelligence and Data Science

Published

2026-03-09

Section

Articles

How to Cite

A Systematic Review of Hybrid Sarcasm Detection: Fusing Contextual Embeddings with Handcrafted Linguistic Features. (2026). Peer-Reviewed Journal of Computer Science (PRJCS), 1(3), 1-7. https://peerreviewjournal.in/index.php/prjcs/article/view/3