[Paper] Zero-Shot Imagined Speech Decoding via Imagined-to-Listened MEG Mappin...

Paper Overview

Research area: ML | Authors: Maryam Maghsoudi, Shihab Shamma | Published: 2025-05-07 | arXiv: 2505.05131

Chinese Abstract (translated)

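Decoding imagined speech from non-invasive brain recordings is challenging because imagined datasets are scarce and hard to align temporally across subjects and sessions. In this work, we propose a new approach to imagined speech decoding that leverages the richer and more reliably labeled recordings made during listening to speech. We collected paired listened and imagined MEG recordings of rhythmic melodic and spoken stimuli from trained musicians. Using trained musicians helped improve temporal alignment across conditions. We then developed a three-stage decoding pipeline that revealed consistent and meaningful relationships between the neural activity evoked by imagining and by listening to the same stimuli. First, we trained six linear and neural models to map imagined MEG responses to listened responses. We evaluated these models against a null baseline on unseen subjects to verify that the predicted listened responses preserved stimulus-specific information. Second, we trained a contrastive word decoder on listened MEG responses only and evaluated it with four embedding strategies covering semantic, acoustic, and phonetic representations. Third, we passed the imagined MEG responses of held-out subjects through the mapping pipeline to compute the corresponding listened responses, which were then decoded by the listened decoder. Using a rank-based analysis, we show that imagined words can be decoded significantly above chance. We report here imagined speech decoding results for a proof-of-concept implementation, with all evaluations conducted on held-out subjects. We also show that performance improves as the amount of training data increases, indicating that the approach is scalable and can be applied directly to realistic brain-computer interface scenarios.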

Original Abstract

Decoding imagined speech from non-invasive brain recordings is challenging because imagined datasets are scarce and difficult to align temporally across subjects and sessions. In this work, we propose a new approach to the decoding of imagined speech that leverages the richer and more reliably labeled recordings during listening to speech. We collected paired listened and imagined MEG recordings to rhythmic melodic and spoken stimuli from trained musicians. Using trained musicians helped improve temporal alignment across conditions. We then developed a three-stage decoding pipeline that revealed consistent and meaningful relationships between neural activity evoked by imagining and listening to the same stimuli. First, we trained six linear and neural models to map imagined MEG responses to...
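
To make the three-stage idea concrete, here is a minimal sketch on synthetic data. It is not the authors' implementation: the ridge regressions stand in for the paper's six mapping models and its contrastive word decoder, and every name (`fit_ridge`, `rank_of_true_word`, `simulate`), every dimension, and the data generator itself are assumptions made for illustration. The sketch also does not model separate subjects; it only separates training trials from test trials.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def rank_of_true_word(pred_emb, word_emb, true_idx):
    """1-based rank of the true word by cosine similarity (1 = best)."""
    p = pred_emb / np.linalg.norm(pred_emb, axis=1, keepdims=True)
    w = word_emb / np.linalg.norm(word_emb, axis=1, keepdims=True)
    sims = p @ w.T  # shape (n_trials, n_words)
    true_sims = sims[np.arange(len(true_idx)), true_idx]
    # Rank = 1 + number of candidates scoring strictly higher than the truth.
    return 1 + (sims > true_sims[:, None]).sum(axis=1)

# Synthetic stand-in for MEG features (assumption: not real recordings).
rng = np.random.default_rng(0)
n_words, n_feat, n_emb = 20, 32, 8
word_emb = rng.standard_normal((n_words, n_emb))
A = rng.standard_normal((n_emb, n_feat))   # word embedding -> listened response
B = rng.standard_normal((n_feat, n_feat))  # listened -> imagined response

def simulate(n_trials):
    labels = rng.integers(0, n_words, n_trials)
    listened = word_emb[labels] @ A + 0.5 * rng.standard_normal((n_trials, n_feat))
    imagined = listened @ B + 0.5 * rng.standard_normal((n_trials, n_feat))
    return labels, listened, imagined

train_labels, train_listened, train_imagined = simulate(400)
test_labels, _, test_imagined = simulate(100)

# Stage 1: map imagined responses to listened responses.
W_map = fit_ridge(train_imagined, train_listened)
# Stage 2: train a word decoder on listened responses only.
W_dec = fit_ridge(train_listened, word_emb[train_labels])
# Stage 3: imagined -> predicted listened -> decoded embedding -> rank.
pred_listened = test_imagined @ W_map
ranks = rank_of_true_word(pred_listened @ W_dec, word_emb, test_labels)

chance_rank = (n_words + 1) / 2
print(f"median rank {np.median(ranks):.1f} vs chance {chance_rank:.1f}")
```

The point of the design is that the word decoder never sees imagined data: only the Stage 1 mapping does, which is what lets the scarce imagined recordings borrow from the larger listened set. A rank-based readout compares the median rank of the true word against the chance rank of `(n_words + 1) / 2`.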

--- *Automatically collected on 2026-05-12*

#paper #arXiv #ML #小凯
