[Paper] Sentiment and Emotion Classification of Indonesian E-Commerce Reviews ...

Paper Overview

Research field: NLP | Authors: Hermawan Manurung, Ibrahim Al-Kahfi, Ahmad Rizqi | Published: 2025-04-29 | arXiv: 2504.20612

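Summary (translated from Chinese)

Indonesian marketplace reviews mix standard vocabulary, slang, regional loanwords, numeric shorthand, and emoji, which makes lexicon-based sentiment tools unreliable. To address this, the paper describes a two-track classification pipeline. The first track applies TF-IDF vectorization and a PyCaret AutoML sweep; the second is a PyTorch bidirectional LSTM (BiLSTM) network with a shared encoder and two task-specific output heads. A preprocessing module applies 14 sequential cleaning steps, including a 140-entry slang dictionary compiled from the marketplace corpus.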
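To make the idea of sequential cleaning steps with a slang dictionary concrete, here is a minimal sketch using only the Python standard library. It is not the paper's preprocessing module: the five steps, the `clean` function, and the `SLANG` entries are assumptions chosen for illustration, and the paper's actual 14 steps and 140-entry dictionary are not listed in the abstract.

```python
import re
import string

# Hypothetical slang entries for illustration only; the paper's 140-entry
# dictionary is not reproduced in the abstract.
SLANG = {
    "gk": "tidak",
    "ga": "tidak",
    "bgt": "banget",
    "yg": "yang",
    "brg": "barang",
}

PUNCT_RE = re.compile(f"[{re.escape(string.punctuation)}]")


def lowercase(text: str) -> str:
    return text.lower()


def strip_urls(text: str) -> str:
    return re.sub(r"https?://\S+", " ", text)


def squeeze_repeats(text: str) -> str:
    # Collapse a letter repeated three or more times: "bagussss" -> "bagus".
    return re.sub(r"([a-z])\1{2,}", r"\1", text)


def strip_punctuation(text: str) -> str:
    # Only ASCII punctuation is removed, so emoji survive for later steps.
    return PUNCT_RE.sub(" ", text)


def expand_slang(text: str) -> str:
    # Splitting and re-joining also normalizes whitespace.
    return " ".join(SLANG.get(token, token) for token in text.split())


# Steps run in order; each takes and returns a string.
STEPS = [lowercase, strip_urls, squeeze_repeats, strip_punctuation, expand_slang]


def clean(text: str) -> str:
    for step in STEPS:
        text = step(text)
    return text
```

Keeping the steps in an ordered list makes the sequence explicit and lets individual steps be added, removed, or reordered without touching the others. Order matters here: URLs are stripped before punctuation so that a link is removed as a whole rather than broken into fragments.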

Original Abstract

Indonesian marketplace reviews mix standard vocabulary with slang, regional loanwords, numeric shorthands, and emoji, making lexicon-based sentiment tools unreliable in practice. This paper describes a two-track classification pipeline applied to the PRDECT-ID dataset, which contains 5,400 product reviews from 29 Indonesian e-commerce categories, each labeled for binary sentiment (Positive/Negative) and five-class emotion (Happy, Sad, Fear, Love, Anger). The first track applies TF-IDF vectorization with a PyCaret AutoML sweep across standard classifiers. The second track is a PyTorch Bidirectional Long Short-Term Memory (BiLSTM) network with a shared encoder and two task-specific output heads. A preprocessing module applies 14 sequential cleaning steps, including a 140-entry slang dictiona...
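The second track, a BiLSTM with a shared encoder and two task-specific output heads, could look roughly like the PyTorch sketch below. Only the class counts (2 for sentiment, 5 for emotion) come from the abstract. The class name `DualHeadBiLSTM`, the embedding and hidden sizes, the single LSTM layer, and the use of the final hidden states as the sentence representation are assumptions, not the authors' configuration.

```python
import torch
from torch import nn


class DualHeadBiLSTM(nn.Module):
    """Shared BiLSTM encoder feeding a sentiment head and an emotion head."""

    def __init__(self, vocab_size, embed_dim=64, hidden_dim=64,
                 n_sentiment=2, n_emotion=5, pad_idx=0):
        super().__init__()
        # Layer sizes are illustrative assumptions, not values from the paper.
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=pad_idx)
        self.encoder = nn.LSTM(embed_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        # Both heads read the same encoder output.
        self.sentiment_head = nn.Linear(2 * hidden_dim, n_sentiment)
        self.emotion_head = nn.Linear(2 * hidden_dim, n_emotion)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)          # (batch, seq, embed_dim)
        _, (h_n, _) = self.encoder(embedded)          # h_n: (2, batch, hidden_dim)
        # Concatenate the final forward and backward hidden states.
        features = torch.cat([h_n[-2], h_n[-1]], dim=1)
        return self.sentiment_head(features), self.emotion_head(features)
```

A forward pass on a batch of token ids returns one logit tensor per task:

```python
model = DualHeadBiLSTM(vocab_size=1000)
batch = torch.randint(1, 1000, (4, 12))       # 4 reviews, 12 token ids each
sentiment_logits, emotion_logits = model(batch)
```

One common way to train such a model is to sum a cross-entropy loss per head; whether the paper weights the two losses equally is not stated in the abstract. This sketch also skips sequence packing, so padded positions influence the forward-direction final state. Handling that with `torch.nn.utils.rnn.pack_padded_sequence` would be a natural refinement.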

--- *Automatically collected on 2026-04-29*

#Paper #arXiv #NLP #小凯
