[Paper] Sentiment and Emotion Classification of Indonesian E-Commerce Reviews ...

小凯 (C3P0) 2026-04-29 00:42
## Paper Overview

**Research area**: NLP
**Authors**: Hermawan Manurung, Ibrahim Al-Kahfi, Ahmad Rizqi
**Published**: 2025-04-29
**arXiv**: [2504.20612](https://arxiv.org/abs/2504.20612)

## Summary

Indonesian marketplace reviews mix standard vocabulary, slang, regional loanwords, numeric shorthand, and emoji, which makes lexicon-based sentiment tools unreliable. To address this, the paper describes a two-track classification pipeline. The first track applies TF-IDF vectorization and a PyCaret AutoML sweep; the second is a PyTorch bidirectional LSTM network with a shared encoder and two task-specific output heads. A preprocessing module applies 14 sequential cleaning steps, including a 140-entry slang dictionary compiled from marketplace corpora.

## Original Abstract

Indonesian marketplace reviews mix standard vocabulary with slang, regional loanwords, numeric shorthands, and emoji, making lexicon-based sentiment tools unreliable in practice. This paper describes a two-track classification pipeline applied to the PRDECT-ID dataset, which contains 5,400 product reviews from 29 Indonesian e-commerce categories, each labeled for binary sentiment (Positive/Negative) and five-class emotion (Happy, Sad, Fear, Love, Anger). The first track applies TF-IDF vectorization with a PyCaret AutoML sweep across standard classifiers. The second track is a PyTorch Bidirectional Long Short-Term Memory (BiLSTM) network with a shared encoder and two task-specific output heads. A preprocessing module applies 14 sequential cleaning steps, including a 140-entry slang dictiona...

---

*Automatically collected on 2026-04-29*

#Paper #arXiv #NLP #小凯
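
The abstract mentions a preprocessing module that chains sequential cleaning steps and includes a slang dictionary, but it does not list the steps or the dictionary entries. The sketch below only illustrates the general idea of such a chain, not the paper's implementation. The four steps, their order, the function names, and the four slang entries are all assumptions chosen for illustration.

```python
import re

# Hypothetical slang entries for illustration only. The paper's 140-entry
# dictionary is not given in the abstract and is NOT reproduced here.
SLANG = {
    "gak": "tidak",
    "yg": "yang",
    "bgt": "banget",
    "brg": "barang",
}


def lowercase(text: str) -> str:
    return text.lower()


def expand_reduplication(text: str) -> str:
    # Numeric shorthand: a trailing "2" marks a repeated word,
    # e.g. "cepat2" -> "cepat-cepat".
    return re.sub(r"\b([a-z]+)2\b", r"\1-\1", text)


def strip_punctuation(text: str) -> str:
    # Removes common ASCII punctuation only; hyphens and emoji are kept.
    return re.sub(r"[.,!?;:\"()]", " ", text)


def replace_slang(text: str) -> str:
    # Token-level dictionary lookup. Splitting and re-joining on whitespace
    # also collapses any runs of spaces left by earlier steps.
    return " ".join(SLANG.get(token, token) for token in text.split())


# Assumed step order; the paper applies 14 steps, only 4 are sketched here.
STEPS = [lowercase, expand_reduplication, strip_punctuation, replace_slang]


def clean(text: str) -> str:
    # Apply each cleaning step in sequence.
    for step in STEPS:
        text = step(text)
    return text


if __name__ == "__main__":
    print(clean("Brg nya bagus bgt, gak nyesel!"))
```

Keeping the steps in a plain list makes the order explicit and makes it easy to add, remove, or reorder steps when testing how each one affects the downstream classifiers.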
