## Paper Overview
**Research area**: NLP
**Authors**: Connor Douglas, Utkucan Balci, Joseph Aylett-Bullock
**Published**: 2026-04-03
**arXiv**: [2604.03180](https://arxiv.org/abs/2604.03180)
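## Summary
This paper proposes Precision-Informed Semantic Modeling (PRISM), a structured topic modeling framework that combines the rich representations captured by LLMs with the low cost and interpretability of latent semantic clustering methods. PRISM fine-tunes a sentence encoding model using a sparse set of LLM-provided labels on samples drawn from the corpus of interest. Across multiple corpora, PRISM improves topic separability over state-of-the-art local topic models while requiring only a small number of LLM queries to train.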
## Original Abstract
In this paper, we propose Precision-Informed Semantic Modeling (PRISM), a structured topic modeling framework combining the benefits of rich representations captured by LLMs with the low cost and interpretability of latent semantic clustering methods. PRISM fine-tunes a sentence encoding model using a sparse set of LLM-provided labels on samples drawn from some corpus of interest. We segment this embedding space with thresholded clustering, yielding clusters that separate closely related topics within some narrow domain. Across multiple corpora, PRISM improves topic separability over state-of-the-art local topic models and even over clustering on large, frontier embedding models while requiring only a small number of LLM queries to train. This work contributes to several research streams ...
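
To make the "thresholded clustering" step in the abstract more concrete, here is a minimal sketch in plain Python. The abstract does not say which clustering algorithm or similarity measure PRISM uses, so everything here is an assumption for illustration: the greedy "leader" strategy, the cosine similarity, the `threshold_cluster` function name, the `threshold` value, and the toy 2-D vectors standing in for sentence embeddings from a fine-tuned encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def threshold_cluster(embeddings, threshold=0.8):
    """Greedy leader clustering (illustrative, not the paper's method).

    Each vector joins the existing cluster whose leader is most similar,
    provided that similarity is at least `threshold`; otherwise it
    starts a new cluster and becomes its leader.
    """
    leaders = []  # first vector assigned to each cluster
    labels = []   # cluster index for each input vector
    for vec in embeddings:
        best, best_sim = None, threshold
        for idx, leader in enumerate(leaders):
            sim = cosine(vec, leader)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            leaders.append(vec)
            best = len(leaders) - 1
        labels.append(best)
    return labels


# Hypothetical toy "embeddings": two tight groups in 2-D.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95]]
labels = threshold_cluster(toy, threshold=0.8)
print(labels)
```

In this sketch, raising `threshold` splits the data into more, tighter clusters, which is one plausible way a threshold could help separate closely related topics within a narrow domain.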
---
*Automatically collected on 2026-04-06*
#paper #arXiv #NLP #小凯