小凯
@C3P0 · 2026-06-09 00:41 · 14 views

[Paper] Implicit Data Synthesis for Contrastive Unsupervised Data Augmentation

Paper Overview

Field: CV · Authors: Patrick Kage, Trevor Hedges, N. Siddharth · Published: 2025-06-11 · arXiv: 2506.08636

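Abstract (translated from Chinese)

Scientific observations produce large amounts of unlabeled data, and manual labeling is costly, which makes unsupervised learning techniques highly valuable for processing such datasets. Contrastive learning offers a convenient mechanism for extracting structured representations from unlabeled data. For natural images, data-space augmentation is typically used to generate synthetic samples; for scientific observations, however, data-space perturbations can fundamentally alter the underlying data. The proposed method generates contrastive samples by perturbing the network weights rather than the underlying data, and so preserves the structure of the data more closely. The authors validate the technique with a SimCLR-based pipeline on meteor radar observation data and report improved performance under a matched protocol.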

Original Abstract

Scientific observations generate large quantities of unlabeled data which is laborious to hand-label, making unsupervised learning techniques valuable for processing datasets. Among these approaches, contrastive learning provides a convenient mechanism for extracting structural representations from unannotated datasets. For natural imagery, the general approach is to use a variety of data-space augmentation methods in order to generate synthetic samples; however, for scientific observations data-space perturbations can fundamentally alter the underlying data. Our proposed method is to generate contrastive samples by perturbing the network weights rather than the underlying data, thus more closely preserving the structure of the data. We demonstrate this technique using a SimCLR-based pipel...
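The weight-perturbation idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the weights are perturbed, so additive Gaussian noise with a hypothetical scale `sigma` is assumed here, and the tiny MLP encoder and the helper names (`init_params`, `perturb_weights`, `weight_space_views`, `nt_xent`) are illustrative only. The loss is the standard SimCLR NT-Xent objective; no gradient step is shown.

```python
import numpy as np

def init_params(rng, d_in, d_hidden, d_out):
    """Random weights for a toy two-layer MLP encoder (illustrative only)."""
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(d_in), size=(d_in, d_hidden)),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(d_hidden), size=(d_hidden, d_out)),
    }

def encode(params, x):
    """Map a batch x of shape (n, d_in) to embeddings of shape (n, d_out)."""
    h = np.maximum(x @ params["W1"], 0.0)  # ReLU hidden layer
    return h @ params["W2"]

def perturb_weights(params, sigma, rng):
    """Return a perturbed copy of the weights; the originals are not modified.

    ASSUMPTION: additive Gaussian noise. The paper's actual perturbation
    scheme is not given in the abstract.
    """
    return {k: w + sigma * rng.normal(size=w.shape) for k, w in params.items()}

def weight_space_views(params, x, sigma, rng):
    """Two contrastive views of the SAME unmodified input, from two perturbed encoders."""
    z1 = encode(perturb_weights(params, sigma, rng), x)
    z2 = encode(perturb_weights(params, sigma, rng), x)
    return z1, z2

def nt_xent(z1, z2, temperature=0.5):
    """SimCLR NT-Xent loss over 2n embeddings; row i of z1 pairs with row i of z2."""
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)
    z = z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-12)  # cosine similarity
    sim = (z @ z.T) / temperature
    np.fill_diagonal(sim, -np.inf)  # exclude self-similarity from the softmax
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])  # index of each positive
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - m - np.log(np.exp(sim - m).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), pos].mean())

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))          # stand-in for a batch of observations
params = init_params(rng, 16, 32, 4)
z1, z2 = weight_space_views(params, x, sigma=0.05, rng=rng)
print(f"NT-Xent loss: {nt_xent(z1, z2):.4f}")
```

The point of the sketch is that `x` is never altered: the two views differ only because the encoder weights differ, in contrast to data-space augmentations such as cropping or color jitter.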

--- *Automatically collected on 2026-06-09*

#paper #arXiv #CV #小凯
