
Clarifying "MoME": A comprehensive guide to understanding multiple meanings in artificial intelligence

QianXun · November 24, 2025, 16:12
<!DOCTYPE html><html lang="en"><head> <meta charset="UTF-8"/> <meta name="viewport" content="width=device-width, initial-scale=1.0"/> <title>MoME: Multiple Meanings in AI - A Comprehensive Guide</title> <script src="https://cdn.tailwindcss.com"></script> <script> tailwind.config = { theme: { extend: { colors: { primary: '#1e3a8a', secondary: '#64748b', accent: '#f1f5f9' }, fontFamily: { 'serif': ['Playfair Display', 'serif'], 'sans': ['Inter', 'sans-serif'] } } } } </script> <link href="https://fonts.googleapis.com/css2?family=Playfair+Display:ital,wght@0,400;0,600;0,700;1,400&amp;family=Inter:wght@300;400;500;600;700&amp;display=swap" rel="stylesheet"/> <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.0/css/all.min.css"/> <script src="https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.min.js"></script> <style> .hero-gradient { background: linear-gradient(135deg, #1e3a8a 0%, #3b82f6 50%, #64748b 100%); } .text-shadow { text-shadow: 2px 2px 4px rgba(0,0,0,0.3); } .mermaid-container { display: flex; justify-content: center; min-height: 300px; max-height: 800px; background: #ffffff; border: 2px solid #e5e7eb; border-radius: 12px; padding: 30px; margin: 30px 0; box-shadow: 0 8px 25px rgba(0, 0, 0, 0.08); position: relative; overflow: hidden; } .mermaid-container .mermaid { width: 100%; max-width: 100%; height: 100%; cursor: grab; transition: transform 0.3s ease; transform-origin: center center; display: flex; justify-content: center; align-items: center; touch-action: none; -webkit-user-select: none; -moz-user-select: none; -ms-user-select: none; user-select: none; } .mermaid-container .mermaid svg { max-width: 100%; height: 100%; display: block; margin: 0 auto; } .mermaid-container .mermaid:active { cursor: grabbing; } .mermaid-container.zoomed .mermaid { height: 100%; width: 100%; cursor: grab; } .mermaid-controls { position: absolute; top: 15px; right: 15px; display: flex; gap: 10px; z-index: 20; background: rgba(255, 255, 255, 
0.95); padding: 8px; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); } .mermaid-control-btn { background: #ffffff; border: 1px solid #d1d5db; border-radius: 6px; padding: 10px; cursor: pointer; transition: all 0.2s ease; color: #374151; font-size: 14px; min-width: 36px; height: 36px; text-align: center; display: flex; align-items: center; justify-content: center; } .mermaid-control-btn:hover { background: #f8fafc; border-color: #3b82f6; color: #3b82f6; transform: translateY(-1px); } .mermaid-control-btn:active { transform: scale(0.95); } @media (max-width: 1024px) { .mermaid-control-btn:not(.reset-zoom) { display: none; } .mermaid-controls { top: auto; bottom: 15px; right: 15px; } } </style> <base target="_blank"> </head> <body class="font-sans text-gray-900 bg-white"> <!-- Toggle Button for Mobile --> <button id="toc-toggle" class="lg:hidden fixed top-4 left-4 z-50 p-2 bg-white rounded shadow text-primary"> <i class="fas fa-bars"></i> </button> <!-- Fixed Table of Contents --> <nav id="toc-nav" class="fixed left-0 top-0 h-full w-80 bg-white border-r border-gray-200 overflow-y-auto z-50 transform -translate-x-full lg:translate-x-0 transition-transform duration-300"> <div class="p-6 border-b border-gray-200 flex justify-between items-center"> <h2 class="text-lg font-semibold text-gray-900">Table of Contents</h2> <button id="toc-close" class="lg:hidden p-2 text-gray-500 hover:text-gray-700"> <i class="fas fa-times"></i> </button> </div> <ul class="p-6 space-y-3 text-sm"> <li> <a href="#introduction" class="block py-2 px-3 text-gray-700 hover:text-primary hover:bg-gray-50 rounded transition-colors">Introduction</a> </li> <li> <a href="#mome-meta-ai" class="block py-2 px-3 text-gray-700 hover:text-primary hover:bg-gray-50 rounded transition-colors">1. 
MoME in Meta AI</a> <ul class="ml-4 mt-2 space-y-2"> <li> <a href="#core-framework" class="block py-1 px-2 text-gray-600 hover:text-primary text-xs">Core Framework</a> </li> <li> <a href="#technical-advantages" class="block py-1 px-2 text-gray-600 hover:text-primary text-xs">Technical Advantages</a> </li> <li> <a href="#development-collaboration" class="block py-1 px-2 text-gray-600 hover:text-primary text-xs">Development &amp; Collaboration</a> </li> </ul> </li> <li> <a href="#broader-landscape" class="block py-2 px-3 text-gray-700 hover:text-primary hover:bg-gray-50 rounded transition-colors">2. Broader MoME Landscape</a> <ul class="ml-4 mt-2 space-y-2"> <li> <a href="#modality-experts" class="block py-1 px-2 text-gray-600 hover:text-primary text-xs">Mixture of Modality Experts</a> </li> <li> <a href="#variants-concepts" class="block py-1 px-2 text-gray-600 hover:text-primary text-xs">Variants &amp; Concepts</a> </li> </ul> </li> <li> <a href="#foundational-architecture" class="block py-2 px-3 text-gray-700 hover:text-primary hover:bg-gray-50 rounded transition-colors">3. Foundational Architecture</a> <ul class="ml-4 mt-2 space-y-2"> <li> <a href="#moe-principles" class="block py-1 px-2 text-gray-600 hover:text-primary text-xs">Core Principles</a> </li> <li> <a href="#moe-meta-ecosystem" class="block py-1 px-2 text-gray-600 hover:text-primary text-xs">MoE in Meta&#39;s Ecosystem</a> </li> </ul> </li> <li> <a href="#comparative-analysis" class="block py-2 px-3 text-gray-700 hover:text-primary hover:bg-gray-50 rounded transition-colors">4. Comparative Analysis</a> </li> <li> <a href="#conclusion" class="block py-2 px-3 text-gray-700 hover:text-primary hover:bg-gray-50 rounded transition-colors">5. 
Conclusion</a> </li> </ul> </nav> <!-- Main Content --> <div class="lg:ml-80"> <!-- Introduction --> <section id="introduction" class="py-20 bg-gray-50"> <div class="container mx-auto px-4 lg:px-12"> <div class="max-w-4xl mx-auto"> <h2 class="font-serif text-4xl font-bold text-primary mb-12 text-center">Understanding the Multiple Faces of MoME</h2> <div class="prose prose-lg max-w-none"> <p class="text-xl text-gray-700 leading-relaxed mb-8"> In the rapidly evolving landscape of artificial intelligence, the acronym <strong>&#34;MoME&#34;</strong> has emerged as a significant term with multiple distinct meanings. While context often clarifies intent, the overlapping nomenclature can create confusion among researchers, practitioners, and enthusiasts alike. </p> <div class="bg-white rounded-lg p-8 shadow-sm border border-gray-200 mb-12"> <h3 class="text-2xl font-semibold text-primary mb-6">Primary MoME Concepts</h3> <div class="grid grid-cols-1 md:grid-cols-2 gap-8"> <div> <h4 class="text-lg font-semibold text-gray-900 mb-3">Mixture of Matryoshka Experts</h4> <p class="text-gray-700 text-sm mb-4">A novel AI framework developed by Meta AI and Imperial College London for efficient audio-visual speech recognition.</p> <span class="inline-block bg-blue-100 text-blue-800 text-xs px-3 py-1 rounded-full">Meta AI Research</span> </div> <div> <h4 class="text-lg font-semibold text-gray-900 mb-3">Mixture of Modality Experts</h4> <p class="text-gray-700 text-sm mb-4">A medical AI model developed by HKUST for non-invasive breast cancer diagnosis using multiparametric MRI.</p> <span class="inline-block bg-green-100 text-green-800 text-xs px-3 py-1 rounded-full">Medical AI</span> </div> </div> </div> <p class="text-lg text-gray-700 leading-relaxed"> This comprehensive guide aims to clarify these different meanings, providing researchers and practitioners with a clear understanding of each concept, their applications, and the contexts in which they appear. 
By examining the technical foundations, development collaborations, and practical implementations, we can better navigate the complex landscape of modern AI research. </p> </div> </div> </div> </section> <!-- MoME in Meta AI --> <section id="mome-meta-ai" class="py-20 bg-white"> <div class="container mx-auto px-4 lg:px-12"> <div class="max-w-6xl mx-auto"> <h2 class="font-serif text-4xl font-bold text-primary mb-16">MoME in Meta AI: Mixture of Matryoshka Experts</h2> <!-- Core Framework --> <div id="core-framework" class="mb-20"> <h3 class="text-3xl font-semibold text-gray-900 mb-8">Core Framework and Purpose</h3> <div class="grid grid-cols-1 lg:grid-cols-3 gap-8 mb-12"> <div class="lg:col-span-2"> <p class="text-gray-700 leading-relaxed mb-6"> The <strong>Mixture of Matryoshka Experts (MoME)</strong> framework is a sophisticated approach to enhancing the efficiency and performance of large-scale AI models, specifically in the domain of audio-visual speech recognition (AVSR). The framework is a collaborative effort between Imperial College London and Meta AI, with contributions from NatWest AI Research. </p> <p class="text-gray-700 leading-relaxed mb-6"> The framework&#39;s name, &#34;Matryoshka,&#34; is inspired by Russian nesting dolls and describes its ability to handle information at various levels of compression or granularity. This design enables a single, unified model to operate effectively across different scenarios, from high-fidelity processing to highly compressed operation that prioritizes speed and efficiency. 
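As a toy sketch of this nesting idea (an assumed NumPy example, not the paper's code): a Matryoshka-style embedding is trained so that its leading dimensions already form a usable coarse representation, letting deployment simply truncate to whatever budget is available.

```python
import numpy as np

# Toy illustration of Matryoshka-style representations: one full-size
# embedding whose leading prefix is assumed (by training) to be a valid,
# coarser embedding on its own.
full = np.random.default_rng(1).normal(size=256)  # full-fidelity representation


def at_scale(embedding, dim):
    """Truncate to the first `dim` dimensions and renormalize."""
    v = embedding[:dim]
    return v / np.linalg.norm(v)


# The same model output serves both low-resource and high-fidelity paths.
coarse = at_scale(full, 32)   # cheap, compressed view
fine = at_scale(full, 256)    # full-granularity view
print(coarse.shape, fine.shape)  # (32,) (256,)
```

In a trained Matryoshka model the truncated prefix would retain most of the task-relevant information; here the truncation only illustrates the deployment mechanics.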
</p> </div> <div class="bg-accent rounded-lg p-6"> <h4 class="font-semibold text-primary mb-4">Key Components</h4> <ul class="space-y-3 text-sm"> <li class="flex items-start"> <i class="fas fa-cog text-primary mt-1 mr-3"></i> <span><strong>MoE Architecture:</strong> Sparse computation with multiple expert sub-networks</span> </li> <li class="flex items-start"> <i class="fas fa-layer-group text-primary mt-1 mr-3"></i> <span><strong>MRL Integration:</strong> Hierarchical, multi-scale representation learning</span> </li> <li class="flex items-start"> <i class="fas fa-share-alt text-primary mt-1 mr-3"></i> <span><strong>Shared Router:</strong> Consistent expert activation across scales</span> </li> </ul> </div> </div> <!-- Architecture Diagram --> <div class="bg-gray-50 rounded-lg p-8 mb-12"> <h4 class="text-xl font-semibold text-gray-900 mb-6 text-center">MoME Architecture Overview</h4> <div class="mermaid-container"> <div class="mermaid-controls"> <button class="mermaid-control-btn zoom-in" title="Zoom in"> <i class="fas fa-search-plus"></i> </button> <button class="mermaid-control-btn zoom-out" title="Zoom out"> <i class="fas fa-search-minus"></i> </button> <button class="mermaid-control-btn reset-zoom" title="Reset"> <i class="fas fa-expand-arrows-alt"></i> </button> <button class="mermaid-control-btn fullscreen" title="Fullscreen"> <i class="fas fa-expand"></i> </button> </div> <div class="mermaid" id="mome-architecture-diagram"> graph TB A[&#34;Audio-Visual Input&#34;] --&gt; B[&#34;Multi-Scale Processing&#34;] B --&gt; C[&#34;Shared Router&#34;] C --&gt; D[&#34;Expert Selection&#34;] D --&gt; E[&#34;Expert 1&#34;] D --&gt; F[&#34;Expert 2&#34;] D --&gt; G[&#34;Expert 3&#34;] D --&gt; H[&#34;Expert 4&#34;] E --&gt; I[&#34;Knowledge Fusion&#34;] F --&gt; I G --&gt; I H --&gt; I I --&gt; J[&#34;AVSR Output&#34;] style A fill:#e3f2fd,stroke:#1976d2,stroke-width:2px,color:#000 style C fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000 style I 
fill:#e8f5e8,stroke:#388e3c,stroke-width:2px,color:#000 style J fill:#fce4ec,stroke:#c2185b,stroke-width:2px,color:#000 style B fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000 style D fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000 style E fill:#ffebee,stroke:#d32f2f,stroke-width:2px,color:#000 style F fill:#ffebee,stroke:#d32f2f,stroke-width:2px,color:#000 style G fill:#ffebee,stroke:#d32f2f,stroke-width:2px,color:#000 style H fill:#ffebee,stroke:#d32f2f,stroke-width:2px,color:#000 </div> </div> </div> <!-- Primary Application --> <div class="mb-12"> <h4 class="text-2xl font-semibold text-gray-900 mb-6">Primary Application: Audio-Visual Speech Recognition</h4> <div class="grid grid-cols-1 md:grid-cols-2 gap-8"> <div> <p class="text-gray-700 leading-relaxed mb-4"> The primary application of MoME is in <strong>audio-visual speech recognition (AVSR)</strong>, a challenging multimodal task that involves transcribing spoken language by simultaneously analyzing both audio signals and visual lip movements. </p> <p class="text-gray-700 leading-relaxed"> This dual-modality approach is particularly valuable in noisy environments, where visual cues can significantly improve transcription accuracy and robustness—scenarios where purely audio-based systems often fail. 
</p> </div> <div class="bg-blue-50 rounded-lg p-6"> <h5 class="font-semibold text-primary mb-4">AVSR Challenges Addressed</h5> <ul class="space-y-2 text-sm text-gray-700"> <li>• High computational demands of multimodal processing</li> <li>• Sensitivity to input data granularity</li> <li>• Resource constraints in real-world deployment</li> <li>• Need for dynamic adaptation to varying conditions</li> </ul> </div> </div> </div> </div> <!-- Technical Advantages --> <div id="technical-advantages" class="mb-20"> <h3 class="text-3xl font-semibold text-gray-900 mb-8">Technical Advantages and Performance</h3> <div class="grid grid-cols-1 md:grid-cols-3 gap-8 mb-12"> <div class="bg-gradient-to-br from-blue-50 to-blue-100 rounded-lg p-6"> <div class="flex items-center mb-4"> <i class="fas fa-tachometer-alt text-2xl text-blue-600 mr-3"></i> <h4 class="text-lg font-semibold text-gray-900">Dynamic Capacity Allocation</h4> </div> <p class="text-gray-700 text-sm">Sparse MoE architecture activates only a small subset of experts for each input, significantly reducing computational load.</p> </div> <div class="bg-gradient-to-br from-green-50 to-green-100 rounded-lg p-6"> <div class="flex items-center mb-4"> <i class="fas fa-trophy text-2xl text-green-600 mr-3"></i> <h4 class="text-lg font-semibold text-gray-900">State-of-the-Art Performance</h4> </div> <p class="text-gray-700 text-sm">Achieves SOTA performance on LRS2 and LRS3 datasets for AVSR, ASR, and VSR tasks with fewer active parameters.</p> </div> <div class="bg-gradient-to-br from-purple-50 to-purple-100 rounded-lg p-6"> <div class="flex items-center mb-4"> <i class="fas fa-cogs text-2xl text-purple-600 mr-3"></i> <h4 class="text-lg font-semibold text-gray-900">Resource Efficiency</h4> </div> <p class="text-gray-700 text-sm">Addresses computational inefficiency in large models through elastic inference and cross-scale knowledge transfer.</p> </div> </div> <blockquote class="border-l-4 border-primary pl-6 py-4 bg-gray-50 
rounded-r-lg mb-8"> <p class="text-lg italic text-gray-700 mb-2"> &#34;MoME requires significantly fewer parameters during inference than competing baselines, making deployment feasible on a wider range of hardware, including devices with limited computational resources.&#34; </p> <footer class="text-sm text-gray-600">— Research findings from Imperial College London and Meta AI</footer> </blockquote> </div> <!-- Development and Collaboration --> <div id="development-collaboration" class="mb-20"> <h3 class="text-3xl font-semibold text-gray-900 mb-8">Development and Collaboration</h3> <div class="grid grid-cols-1 lg:grid-cols-2 gap-12 mb-12"> <div> <img src="https://kimi-web-img.moonshot.cn/img/images.squarespace-cdn.com/72e77cdcb6fe23983ef8dd4726570db186777e9e.jpg" alt="Entrance of Imperial College London" class="w-full h-64 object-cover rounded-lg shadow-lg mb-6" referrerpolicy="no-referrer"/> <h4 class="text-xl font-semibold text-gray-900 mb-4">Multi-Institutional Partnership</h4> <p class="text-gray-700 leading-relaxed mb-4"> The development of MoME is a testament to collaborative research excellence, bringing together the academic prowess of Imperial College London and the industrial research capabilities of Meta AI, with contributions from NatWest AI Research. </p> <p class="text-gray-700 leading-relaxed"> The research paper, titled &#34;MoME: Mixture of Matryoshka Experts for Audio-Visual Speech Recognition,&#34; has been submitted for presentation at <strong>NeurIPS 2025</strong>, underscoring its scientific significance. 
</p> </div> <div> <div class="bg-gray-50 rounded-lg p-6 mb-6"> <h4 class="text-lg font-semibold text-gray-900 mb-4">Key Institutions</h4> <div class="space-y-4"> <div class="flex items-start"> <i class="fas fa-university text-primary mt-1 mr-3"></i> <div> <h5 class="font-semibold text-gray-900">Imperial College London</h5> <p class="text-sm text-gray-600">iBUG team specializing in multimodal signal processing</p> </div> </div> <div class="flex items-start"> <i class="fas fa-building text-primary mt-1 mr-3"></i> <div> <h5 class="font-semibold text-gray-900">Meta AI</h5> <p class="text-sm text-gray-600">Industrial-scale AI research and development</p> </div> </div> <div class="flex items-start"> <i class="fas fa-chart-line text-primary mt-1 mr-3"></i> <div> <h5 class="font-semibold text-gray-900">NatWest AI Research</h5> <p class="text-sm text-gray-600">Practical applications in financial services</p> </div> </div> </div> </div> <div class="bg-yellow-50 border-l-4 border-yellow-400 p-4"> <h5 class="font-semibold text-gray-900 mb-2">Important Distinction</h5> <p class="text-sm text-gray-700"> While MoME is a significant project within the Meta AI ecosystem, it is distinct from other major projects like the LLaMA series, though they may share some architectural principles such as Mixture-of-Experts. 
</p> </div> </div> </div> <!-- Comparison Table --> <div class="bg-white border border-gray-200 rounded-lg overflow-hidden"> <h4 class="text-xl font-semibold text-gray-900 p-6 bg-gray-50 border-b border-gray-200">MoME vs LLaMA 4 Comparison</h4> <div class="overflow-x-auto"> <table class="w-full"> <thead class="bg-gray-50"> <tr> <th class="px-6 py-4 text-left text-sm font-semibold text-gray-900">Feature</th> <th class="px-6 py-4 text-left text-sm font-semibold text-gray-900">MoME (Meta AI)</th> <th class="px-6 py-4 text-left text-sm font-semibold text-gray-900">LLaMA 4 (Meta AI)</th> </tr> </thead> <tbody class="divide-y divide-gray-200"> <tr> <td class="px-6 py-4 text-sm font-medium text-gray-900">Full Name</td> <td class="px-6 py-4 text-sm text-gray-700">Mixture of Matryoshka Experts</td> <td class="px-6 py-4 text-sm text-gray-700">Large Language Model Meta AI 4</td> </tr> <tr class="bg-gray-50"> <td class="px-6 py-4 text-sm font-medium text-gray-900">Primary Goal</td> <td class="px-6 py-4 text-sm text-gray-700">Efficient, adaptable model for AVSR</td> <td class="px-6 py-4 text-sm text-gray-700">General-purpose foundational model</td> </tr> <tr> <td class="px-6 py-4 text-sm font-medium text-gray-900">Key Innovation</td> <td class="px-6 py-4 text-sm text-gray-700">Integration of MoE with MRL</td> <td class="px-6 py-4 text-sm text-gray-700">MoE architecture for scalability</td> </tr> <tr class="bg-gray-50"> <td class="px-6 py-4 text-sm font-medium text-gray-900">Core Application</td> <td class="px-6 py-4 text-sm text-gray-700">Audio-Visual Speech Recognition</td> <td class="px-6 py-4 text-sm text-gray-700">Wide range of NLP tasks</td> </tr> </tbody> </table> </div> </div> </div> </div> </div> </section> <!-- Broader MoME Landscape --> <section id="broader-landscape" class="py-20 bg-gray-50"> <div class="container mx-auto px-4 lg:px-12"> <div class="max-w-6xl mx-auto"> <h2 class="font-serif text-4xl font-bold text-primary mb-16">The Broader &#34;MoME&#34; 
Landscape</h2> <!-- Mixture of Modality Experts --> <div id="modality-experts" class="mb-20"> <h3 class="text-3xl font-semibold text-gray-900 mb-8">Mixture of Modality Experts (MOME)</h3> <div class="grid grid-cols-1 lg:grid-cols-2 gap-12 mb-12"> <div> <img src="https://kimi-web-img.moonshot.cn/img/media.cnn.com/14e375d33188141242e7e559b029a63e9f9c013a.jpg" alt="Breast cancer MRI scan showing tumor detection" class="w-full h-64 object-cover rounded-lg shadow-lg mb-6" referrerpolicy="no-referrer"/> <h4 class="text-xl font-semibold text-gray-900 mb-4">Medical AI Innovation</h4> <p class="text-gray-700 leading-relaxed mb-4"> In a completely different domain, <strong>Mixture of Modality Experts (MOME)</strong> refers to a groundbreaking AI model developed by the Hong Kong University of Science and Technology (HKUST) for non-invasive breast cancer diagnosis. </p> <p class="text-gray-700 leading-relaxed"> This model leverages a mixture-of-experts framework and transformer architecture to effectively fuse information from multiple imaging modalities, specifically multiparametric MRI (mpMRI), achieving expert-level accuracy comparable to experienced radiologists. 
</p> </div> <div> <div class="bg-white rounded-lg p-6 shadow-sm border border-gray-200 mb-6"> <h4 class="text-lg font-semibold text-gray-900 mb-4">Key Applications</h4> <ul class="space-y-3 text-sm"> <li class="flex items-start"> <i class="fas fa-microscope text-green-600 mt-1 mr-3"></i> <span><strong>Tumor Malignancy Classification:</strong> Expert-level accuracy in cancer detection</span> </li> <li class="flex items-start"> <i class="fas fa-dna text-green-600 mt-1 mr-3"></i> <span><strong>Molecular Subtyping:</strong> Advanced tumor characterization</span> </li> <li class="flex items-start"> <i class="fas fa-chart-line text-green-600 mt-1 mr-3"></i> <span><strong>Treatment Response Prediction:</strong> Neoadjuvant chemotherapy outcomes</span> </li> <li class="flex items-start"> <i class="fas fa-shield-alt text-green-600 mt-1 mr-3"></i> <span><strong>Non-Invasive Diagnosis:</strong> Reduced need for invasive biopsies</span> </li> </ul> </div> <div class="bg-red-50 border-l-4 border-red-400 p-4"> <h5 class="font-semibold text-gray-900 mb-2">Critical Distinction</h5> <p class="text-sm text-gray-700"> <strong>This MOME model is entirely distinct and unrelated</strong> to the Mixture of Matryoshka Experts framework developed in collaboration with Meta AI. They are separate research initiatives with different goals, developers, and underlying technologies. 
</p> </div> </div> </div> <div class="bg-white rounded-lg p-8 shadow-sm border border-gray-200"> <h4 class="text-xl font-semibold text-gray-900 mb-6">Technical Implementation</h4> <div class="grid grid-cols-1 md:grid-cols-3 gap-6"> <div class="text-center"> <i class="fas fa-database text-3xl text-blue-600 mb-3"></i> <h5 class="font-semibold text-gray-900 mb-2">Data Scale</h5> <p class="text-sm text-gray-700">China&#39;s largest mpMRI breast cancer dataset for training and validation</p> </div> <div class="text-center"> <i class="fas fa-brain text-3xl text-purple-600 mb-3"></i> <h5 class="font-semibold text-gray-900 mb-2">Architecture</h5> <p class="text-sm text-gray-700">Transformer-based MoE framework for multimodal fusion</p> </div> <div class="text-center"> <i class="fas fa-users text-3xl text-green-600 mb-3"></i> <h5 class="font-semibold text-gray-900 mb-2">Collaboration</h5> <p class="text-sm text-gray-700">Multi-institutional partnership including medical institutions</p> </div> </div> </div> </div> <!-- Variants and Concepts --> <div id="variants-concepts" class="mb-20"> <h3 class="text-3xl font-semibold text-gray-900 mb-8">Other Variants and Related Concepts</h3> <p class="text-gray-700 leading-relaxed mb-12"> Beyond the two primary meanings of &#34;MoME,&#34; the AI research landscape includes other related concepts and variations that leverage similar naming conventions or share underlying principles. Understanding these related ideas provides a comprehensive view of the field. 
</p> <!-- Comprehensive Concepts Table --> <div class="bg-white rounded-lg shadow-sm border border-gray-200 overflow-hidden mb-12"> <div class="bg-gray-50 px-6 py-4 border-b border-gray-200"> <h4 class="text-lg font-semibold text-gray-900">Summary of MoME and Related Concepts</h4> </div> <div class="overflow-x-auto"> <table class="w-full"> <thead class="bg-gray-50"> <tr> <th class="px-6 py-4 text-left text-sm font-semibold text-gray-900">Concept Name</th> <th class="px-6 py-4 text-left text-sm font-semibold text-gray-900">Developer / Research Group</th> <th class="px-6 py-4 text-left text-sm font-semibold text-gray-900">Primary Focus / Application</th> <th class="px-6 py-4 text-left text-sm font-semibold text-gray-900">Key Innovation / Feature</th> </tr> </thead> <tbody class="divide-y divide-gray-200"> <tr class="bg-blue-50"> <td class="px-6 py-4 text-sm font-medium text-gray-900">Mixture of Matryoshka Experts (MoME)</td> <td class="px-6 py-4 text-sm text-gray-700">Meta AI &amp; Imperial College London</td> <td class="px-6 py-4 text-sm text-gray-700">Audio-Visual Speech Recognition (AVSR)</td> <td class="px-6 py-4 text-sm text-gray-700">Integration of MoE with MRL for dynamic, multi-scale processing</td> </tr> <tr> <td class="px-6 py-4 text-sm font-medium text-gray-900">Mixture of Modality Experts (MOME)</td> <td class="px-6 py-4 text-sm text-gray-700">Hong Kong University of Science and Technology (HKUST)</td> <td class="px-6 py-4 text-sm text-gray-700">Non-invasive breast cancer diagnosis</td> <td class="px-6 py-4 text-sm text-gray-700">Fusing information from multiple medical imaging modalities (mpMRI)</td> </tr> <tr class="bg-gray-50"> <td class="px-6 py-4 text-sm font-medium text-gray-900">Mixture of Multimodal Experts (MoME)</td> <td class="px-6 py-4 text-sm text-gray-700">General research concept</td> <td class="px-6 py-4 text-sm text-gray-700">Enhancing generalist Multimodal Large Language Models (MLLMs)</td> <td class="px-6 py-4 text-sm 
text-gray-700">Combining MoVE and MoLE to mitigate task interference</td> </tr> <tr> <td class="px-6 py-4 text-sm font-medium text-gray-900">Mixture of a Million Experts (MoME)</td> <td class="px-6 py-4 text-sm text-gray-700">General research concept</td> <td class="px-6 py-4 text-sm text-gray-700">Exploring extreme scaling of MoE architectures</td> <td class="px-6 py-4 text-sm text-gray-700">Investigating massive numbers of highly specialized experts</td> </tr> <tr class="bg-gray-50"> <td class="px-6 py-4 text-sm font-medium text-gray-900">Matryoshka Mixture-of-Experts (M-MoE)</td> <td class="px-6 py-4 text-sm text-gray-700">General research concept</td> <td class="px-6 py-4 text-sm text-gray-700">Enabling elastic inference in MoE models</td> <td class="px-6 py-4 text-sm text-gray-700">Coarse-to-fine expert ranking for dynamic adjustment</td> </tr> </tbody> </table> </div> </div> <!-- Individual Concept Details --> <div class="grid grid-cols-1 md:grid-cols-3 gap-8"> <!-- Mixture of Multimodal Experts --> <div class="bg-white rounded-lg p-6 shadow-sm border border-gray-200"> <h4 class="text-lg font-semibold text-gray-900 mb-4">Mixture of Multimodal Experts</h4> <p class="text-sm text-gray-700 mb-4"> A framework designed to enhance generalist Multimodal Large Language Models (MLLMs) by addressing task interference through specialized expert systems. </p> <div class="text-xs text-gray-600"> <strong>Components:</strong> Mixture of Vision Experts (MoVE) + Mixture of Language Experts (MoLE) </div> </div> <!-- Mixture of a Million Experts --> <div class="bg-white rounded-lg p-6 shadow-sm border border-gray-200"> <h4 class="text-lg font-semibold text-gray-900 mb-4">Mixture of a Million Experts</h4> <p class="text-sm text-gray-700 mb-4"> An ambitious research direction exploring extreme scaling of MoE architectures to achieve finer-grained specialization. 
</p> <div class="text-xs text-gray-600"> <strong>Challenge:</strong> Designing efficient gating networks for massive expert pools </div> </div> <!-- Matryoshka Mixture-of-Experts --> <div class="bg-white rounded-lg p-6 shadow-sm border border-gray-200"> <h4 class="text-lg font-semibold text-gray-900 mb-4">Matryoshka Mixture-of-Experts</h4> <p class="text-sm text-gray-700 mb-4"> A training framework enabling elastic inference in MoE models through systematic variation of activated experts during training. </p> <div class="text-xs text-gray-600"> <strong>Innovation:</strong> Coarse-to-fine expert ranking for dynamic adjustment </div> </div> </div> </div> </div> </div> </section> <!-- Foundational Architecture --> <section id="foundational-architecture" class="py-20 bg-white"> <div class="container mx-auto px-4 lg:px-12"> <div class="max-w-6xl mx-auto"> <h2 class="font-serif text-4xl font-bold text-primary mb-16">The Foundational Architecture: Mixture of Experts (MoE)</h2> <div class="mb-12"> <p class="text-xl text-gray-700 leading-relaxed mb-8"> The <strong>Mixture of Experts (MoE)</strong> is a foundational architectural concept in deep learning that has gained significant traction in recent years, particularly in the development of large-scale AI models. The core idea is to create models with very large capacity but with computational costs that don&#39;t scale linearly with the number of parameters. </p> </div> <!-- Core Principles --> <div id="moe-principles" class="mb-20"> <h3 class="text-3xl font-semibold text-gray-900 mb-8">Core Principles of MoE</h3> <div class="grid grid-cols-1 lg:grid-cols-2 gap-12 mb-12"> <div> <h4 class="text-2xl font-semibold text-gray-900 mb-6">Sparse Model Architecture</h4> <p class="text-gray-700 leading-relaxed mb-6"> The cornerstone of MoE is its sparse model design, which departs from traditional dense architectures where all parameters are active for every computation. 
Instead, MoE uses multiple smaller, independent neural networks called &#34;experts.&#34; </p> <p class="text-gray-700 leading-relaxed mb-6"> For any given input, only a small subset of experts is selected to participate in the computation, while the rest remain inactive. This selective activation decouples the model&#39;s capacity from its computational cost. </p> <div class="bg-blue-50 rounded-lg p-4"> <h5 class="font-semibold text-primary mb-2">Key Benefits</h5> <ul class="text-sm text-gray-700 space-y-1"> <li>• Larger model capacity without proportional computational increase</li> <li>• Modular design enabling specialized expert training</li> <li>• Efficient resource utilization during inference</li> </ul> </div> </div> <div> <h4 class="text-2xl font-semibold text-gray-900 mb-6">Gating Network and Expert Routing</h4> <p class="text-gray-700 leading-relaxed mb-6"> The gating network acts as the &#34;brain&#34; of the MoE model, making intelligent decisions about which experts to activate for each input. This dynamic routing mechanism gives MoE its adaptability and efficiency. 
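A minimal, illustrative sketch of this gating-and-routing step (an assumed NumPy example with top-k selection, not any specific production implementation):

```python
import numpy as np


def moe_forward(x, gate_w, experts, k=2):
    """Route input x through the top-k of several expert networks.

    x:       input vector, shape (d,)
    gate_w:  gating weights, shape (d, n_experts)
    experts: list of callables, each mapping (d,) -> (d,)
    """
    logits = x @ gate_w                      # relevance score per expert
    top_k = np.argsort(logits)[-k:]          # indices of the k best experts
    weights = np.exp(logits[top_k])
    weights /= weights.sum()                 # softmax over selected experts only
    # Only the selected experts run; the rest stay inactive (sparse compute).
    return sum(w * experts[i](x) for w, i in zip(weights, top_k))


rng = np.random.default_rng(0)
d, n_experts = 8, 4
# Each "expert" here is just a random linear map, for illustration.
experts = [lambda x, W=rng.normal(size=(d, d)): x @ W for _ in range(n_experts)]
gate_w = rng.normal(size=(d, n_experts))
y = moe_forward(rng.normal(size=d), gate_w, experts, k=2)
print(y.shape)  # (8,)
```

With k=2 of 4 experts active, only half the expert parameters participate in this forward pass, which is the sense in which capacity is decoupled from per-input compute.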
</p> <!-- MoE Architecture Diagram --> <div class="bg-gray-50 rounded-lg p-6 mb-6"> <h5 class="text-lg font-semibold text-gray-900 mb-4 text-center">MoE Architecture</h5> <div class="mermaid-container"> <div class="mermaid-controls"> <button class="mermaid-control-btn zoom-in" title="Zoom in"> <i class="fas fa-search-plus"></i> </button> <button class="mermaid-control-btn zoom-out" title="Zoom out"> <i class="fas fa-search-minus"></i> </button> <button class="mermaid-control-btn reset-zoom" title="Reset"> <i class="fas fa-expand-arrows-alt"></i> </button> <button class="mermaid-control-btn fullscreen" title="Fullscreen"> <i class="fas fa-expand"></i> </button> </div> <div class="mermaid" id="moe-architecture-diagram"> graph TB A[&#34;Input Data&#34;] --&gt; B[&#34;Gating Network&#34;] B --&gt; C[&#34;Expert Selection&#34;] C --&gt; D[&#34;Expert 1&#34;] C --&gt; E[&#34;Expert 2&#34;] C --&gt; F[&#34;Expert 3&#34;] C --&gt; G[&#34;Expert 4&#34;] D --&gt; H[&#34;Weighted Combination&#34;] E --&gt; H F --&gt; H G --&gt; H H --&gt; I[&#34;Final Output&#34;] style A fill:#e3f2fd,stroke:#1976d2,stroke-width:2px,color:#000 style B fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000 style C fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000 style H fill:#e8f5e8,stroke:#388e3c,stroke-width:2px,color:#000 style I fill:#fce4ec,stroke:#c2185b,stroke-width:2px,color:#000 style D fill:#ffebee,stroke:#d32f2f,stroke-width:2px,color:#000 style E fill:#ffebee,stroke:#d32f2f,stroke-width:2px,color:#000 style F fill:#ffebee,stroke:#d32f2f,stroke-width:2px,color:#000 style G fill:#ffebee,stroke:#d32f2f,stroke-width:2px,color:#000 </div> </div> </div> <div class="space-y-4"> <div class="border border-gray-200 rounded-lg p-4"> <h5 class="font-semibold text-gray-900 mb-2">Routing Process</h5> <ol class="text-sm text-gray-700 space-y-1"> <li>1. Input data passes through gating network</li> <li>2. Network produces relevance scores for each expert</li> <li>3. 
Top-k experts selected based on highest scores</li> <li>4. Selected experts process the input</li> <li>5. Outputs combined using weighted sum</li> </ol> </div> </div> </div> </div> </div> <!-- MoE in Meta's Ecosystem --> <div id="moe-meta-ecosystem" class="mb-20"> <h3 class="text-3xl font-semibold text-gray-900 mb-8">MoE in Meta AI&#39;s Ecosystem</h3> <div class="grid grid-cols-1 lg:grid-cols-3 gap-8 mb-12"> <div class="lg:col-span-2"> <p class="text-gray-700 leading-relaxed mb-6"> The Mixture of Experts architecture has become integral to Meta AI&#39;s strategy for developing large-scale, efficient, and powerful AI models. This adoption allows Meta to build high-capacity models while keeping computational and energy costs manageable—a crucial consideration for deployment across Meta&#39;s vast product ecosystem. </p> <div class="grid grid-cols-1 md:grid-cols-2 gap-6 mb-8"> <div class="bg-gradient-to-br from-blue-50 to-blue-100 rounded-lg p-6"> <h4 class="font-semibold text-primary mb-3">Strategic Advantages</h4> <ul class="text-sm text-gray-700 space-y-2"> <li>• High capacity with manageable costs</li> <li>• Scalable across multiple products</li> <li>• Efficient resource utilization</li> <li>• Modular architecture for specialization</li> </ul> </div> <div class="bg-gradient-to-br from-purple-50 to-purple-100 rounded-lg p-6"> <h4 class="font-semibold text-primary mb-3">Application Areas</h4> <ul class="text-sm text-gray-700 space-y-2"> <li>• Content recommendation systems</li> <li>• Feed ranking algorithms</li> <li>• AI assistants and chatbots</li> <li>• Virtual reality experiences</li> </ul> </div> </div> </div> <div> <img src="https://kimi-web-img.moonshot.cn/img/eu-images.contentstack.com/1864cf9ab095351b70a87c2b36870eee7a71caaa.jpg" alt="Meta AI data center server room" class="w-full h-48 object-cover rounded-lg shadow-lg mb-6" size="medium" aspect="wide" style="photo" query="Meta AI data center" referrerpolicy="no-referrer" data-modified="1" 
data-score="0.00"/> <blockquote class="bg-gray-50 border-l-4 border-primary p-4 rounded-r-lg"> <p class="text-sm italic text-gray-700 mb-2"> &#34;The use of MoE is a key enabler of Meta&#39;s vision, providing a practical path to scaling up AI capabilities across billions of users.&#34; </p> <footer class="text-xs text-gray-600">— AI Architecture Research</footer> </blockquote> </div> </div> <!-- LLaMA 4 Implementation --> <div class="bg-white rounded-lg p-8 shadow-sm border border-gray-200"> <h4 class="text-xl font-semibold text-gray-900 mb-6">Adoption in the LLaMA 4 Model Series</h4> <p class="text-gray-700 leading-relaxed mb-6"> The LLaMA 4 model series prominently features the Mixture of Experts architecture as a key design element. This strategic move creates models that are both highly capable and computationally efficient. </p> <div class="grid grid-cols-1 md:grid-cols-2 gap-8"> <div class="bg-blue-50 rounded-lg p-6"> <h5 class="font-semibold text-primary mb-4">LLaMA 4 Scout</h5> <div class="space-y-2 text-sm"> <div class="flex justify-between"> <span class="text-gray-700">Total Experts:</span> <span class="font-semibold">16</span> </div> <div class="flex justify-between"> <span class="text-gray-700">Active Experts:</span> <span class="font-semibold">2</span> </div> </div> </div> <div class="bg-purple-50 rounded-lg p-6"> <h5 class="font-semibold text-primary mb-4">LLaMA 4 Maverick</h5> <div class="space-y-2 text-sm"> <div class="flex justify-between"> <span class="text-gray-700">Total Experts:</span> <span class="font-semibold">128</span> </div> <div class="flex justify-between"> <span class="text-gray-700">Active Experts:</span> <span class="font-semibold">2</span> </div> </div> </div> </div> <div class="mt-6 p-4 bg-gray-50 rounded-lg"> <p class="text-sm text-gray-700"> <strong>Note:</strong> This design allows these models to have a very large total number of parameters while maintaining a much lower active parameter count during inference, making them 
more practical to deploy and use. </p> </div> </div> </div> </div> </div> </section> <!-- Comparative Analysis --> <section id="comparative-analysis" class="py-20 bg-gray-50"> <div class="container mx-auto px-4 lg:px-12"> <div class="max-w-6xl mx-auto"> <h2 class="font-serif text-4xl font-bold text-primary mb-16">Comparative Analysis</h2> <div class="bg-white rounded-lg shadow-sm border border-gray-200 overflow-hidden mb-12"> <div class="bg-gray-50 px-6 py-4 border-b border-gray-200"> <h3 class="text-xl font-semibold text-gray-900">MoME Concepts Comparison Matrix</h3> </div> <div class="overflow-x-auto"> <table class="w-full"> <thead class="bg-gray-50"> <tr> <th class="px-6 py-4 text-left text-sm font-semibold text-gray-900">Concept</th> <th class="px-6 py-4 text-left text-sm font-semibold text-gray-900">Developer</th> <th class="px-6 py-4 text-left text-sm font-semibold text-gray-900">Primary Domain</th> <th class="px-6 py-4 text-left text-sm font-semibold text-gray-900">Key Innovation</th> <th class="px-6 py-4 text-left text-sm font-semibold text-gray-900">Status</th> </tr> </thead> <tbody class="divide-y divide-gray-200"> <tr class="bg-blue-50"> <td class="px-6 py-4 text-sm font-medium text-gray-900">Mixture of Matryoshka Experts</td> <td class="px-6 py-4 text-sm text-gray-700">Meta AI &amp; Imperial College</td> <td class="px-6 py-4 text-sm text-gray-700">Audio-Visual Processing</td> <td class="px-6 py-4 text-sm text-gray-700">MoE + MRL Integration</td> <td class="px-6 py-4 text-sm text-green-700">Research (NeurIPS 2025)</td> </tr> <tr> <td class="px-6 py-4 text-sm font-medium text-gray-900">Mixture of Modality Experts</td> <td class="px-6 py-4 text-sm text-gray-700">HKUST</td> <td class="px-6 py-4 text-sm text-gray-700">Medical Imaging</td> <td class="px-6 py-4 text-sm text-gray-700">Multiparametric MRI Fusion</td> <td class="px-6 py-4 text-sm text-blue-700">Clinical Application</td> </tr> <tr class="bg-gray-50"> <td class="px-6 py-4 text-sm font-medium 
text-gray-900">Mixture of Multimodal Experts</td> <td class="px-6 py-4 text-sm text-gray-700">General Research</td> <td class="px-6 py-4 text-sm text-gray-700">Multimodal LLMs</td> <td class="px-6 py-4 text-sm text-gray-700">Task Interference Mitigation</td> <td class="px-6 py-4 text-sm text-orange-700">Conceptual</td> </tr> <tr> <td class="px-6 py-4 text-sm font-medium text-gray-900">Mixture of a Million Experts</td> <td class="px-6 py-4 text-sm text-gray-700">General Research</td> <td class="px-6 py-4 text-sm text-gray-700">Extreme Scaling</td> <td class="px-6 py-4 text-sm text-gray-700">Massive Expert Specialization</td> <td class="px-6 py-4 text-sm text-orange-700">Theoretical</td> </tr> <tr class="bg-gray-50"> <td class="px-6 py-4 text-sm font-medium text-gray-900">Matryoshka Mixture-of-Experts</td> <td class="px-6 py-4 text-sm text-gray-700">General Research</td> <td class="px-6 py-4 text-sm text-gray-700">Elastic Inference</td> <td class="px-6 py-4 text-sm text-gray-700">Dynamic Expert Activation</td> <td class="px-6 py-4 text-sm text-green-700">Active Research</td> </tr> </tbody> </table> </div> </div> <!-- Key Insights --> <div class="grid grid-cols-1 md:grid-cols-2 gap-8"> <div class="bg-gradient-to-br from-blue-50 to-blue-100 rounded-lg p-8"> <h3 class="text-xl font-semibold text-primary mb-4">Key Insights</h3> <ul class="space-y-3 text-gray-700"> <li class="flex items-start"> <i class="fas fa-lightbulb text-blue-600 mt-1 mr-3"></i> <span><strong>Nomenclature Overlap:</strong> The &#34;MoME&#34; acronym spans multiple distinct domains, from speech recognition to medical diagnostics</span> </li> <li class="flex items-start"> <i class="fas fa-lightbulb text-blue-600 mt-1 mr-3"></i> <span><strong>Shared Foundations:</strong> All concepts build upon the core Mixture-of-Experts architecture with specialized innovations</span> </li> <li class="flex items-start"> <i class="fas fa-lightbulb text-blue-600 mt-1 mr-3"></i> <span><strong>Context Dependency:</strong> 
Proper understanding requires awareness of the specific research domain and application context</span> </li> </ul> </div> <div class="bg-gradient-to-br from-green-50 to-green-100 rounded-lg p-8"> <h3 class="text-xl font-semibold text-primary mb-4">Research Implications</h3> <ul class="space-y-3 text-gray-700"> <li class="flex items-start"> <i class="fas fa-search text-green-600 mt-1 mr-3"></i> <span><strong>Literature Review:</strong> Researchers must carefully distinguish between different MoME concepts when reviewing literature</span> </li> <li class="flex items-start"> <i class="fas fa-search text-green-600 mt-1 mr-3"></i> <span><strong>Citation Accuracy:</strong> Proper attribution requires understanding the specific MoME variant being referenced</span> </li> <li class="flex items-start"> <i class="fas fa-search text-green-600 mt-1 mr-3"></i> <span><strong>Innovation Building:</strong> New research can benefit from cross-pollination between different MoME implementations</span> </li> </ul> </div> </div> </div> </div> </section> <!-- Conclusion --> <section id="conclusion" class="py-20 bg-white"> <div class="container mx-auto px-4 lg:px-12"> <div class="max-w-4xl mx-auto"> <h2 class="font-serif text-4xl font-bold text-primary mb-12 text-center">Conclusion</h2> <div class="prose prose-lg max-w-none"> <p class="text-xl text-gray-700 leading-relaxed mb-8"> The acronym <strong>&#34;MoME&#34;</strong> represents a fascinating case study in the evolution of artificial intelligence terminology, where multiple distinct concepts have converged under similar naming conventions while maintaining their unique identities and applications. 
</p> <div class="bg-gradient-to-r from-blue-50 to-purple-50 rounded-lg p-8 mb-12"> <h3 class="text-2xl font-semibold text-primary mb-6">Key Takeaways</h3> <div class="grid grid-cols-1 md:grid-cols-2 gap-8"> <div> <h4 class="text-lg font-semibold text-gray-900 mb-4">Primary MoME Concepts</h4> <ul class="space-y-3 text-gray-700"> <li class="flex items-start"> <i class="fas fa-microphone text-blue-600 mt-1 mr-3"></i> <span><strong>Meta AI&#39;s MoME:</strong> Mixture of Matryoshka Experts for audio-visual speech recognition, combining MoE with MRL for dynamic multi-scale processing</span> </li> <li class="flex items-start"> <i class="fas fa-heartbeat text-red-600 mt-1 mr-3"></i> <span><strong>HKUST&#39;s MOME:</strong> Mixture of Modality Experts for non-invasive breast cancer diagnosis using multiparametric MRI fusion</span> </li> </ul> </div> <div> <h4 class="text-lg font-semibold text-gray-900 mb-4">Broader Landscape</h4> <ul class="space-y-3 text-gray-700"> <li class="flex items-start"> <i class="fas fa-network-wired text-green-600 mt-1 mr-3"></i> <span><strong>Related Concepts:</strong> Multiple variants exploring different aspects of expert architectures and multimodal processing</span> </li> <li class="flex items-start"> <i class="fas fa-cogs text-purple-600 mt-1 mr-3"></i> <span><strong>Foundation:</strong> All build upon the core Mixture-of-Experts architecture with specialized innovations</span> </li> </ul> </div> </div> </div> <p class="text-lg text-gray-700 leading-relaxed mb-6"> This comprehensive analysis reveals that while the &#34;MoME&#34; acronym may appear in different contexts, each implementation serves distinct purposes and addresses unique challenges within the AI landscape. The Meta AI-Imperial College collaboration focuses on efficient multimodal processing for speech recognition, while HKUST&#39;s work targets critical healthcare applications. 
</p> <p class="text-lg text-gray-700 leading-relaxed mb-8"> Understanding these distinctions is crucial for researchers, practitioners, and enthusiasts navigating the complex terminology of modern AI. As the field continues to evolve, clear communication and precise terminology will remain essential for advancing knowledge and avoiding confusion. </p> <div class="bg-gray-50 rounded-lg p-8 text-center"> <h4 class="text-xl font-semibold text-primary mb-4">Future Directions</h4> <p class="text-gray-700 mb-4"> As AI research continues to advance, we can expect further innovations in expert-based architectures and multimodal processing. The success of current MoME implementations suggests promising directions for: </p> <div class="grid grid-cols-1 md:grid-cols-3 gap-4 mt-6"> <div class="bg-white rounded-lg p-4"> <i class="fas fa-rocket text-2xl text-blue-600 mb-2"></i> <p class="text-sm text-gray-700">Enhanced multimodal fusion techniques</p> </div> <div class="bg-white rounded-lg p-4"> <i class="fas fa-brain text-2xl text-green-600 mb-2"></i> <p class="text-sm text-gray-700">More efficient expert routing mechanisms</p> </div> <div class="bg-white rounded-lg p-4"> <i class="fas fa-hospital text-2xl text-red-600 mb-2"></i> <p class="text-sm text-gray-700">Expanded medical AI applications</p> </div> </div> </div> </div> </div> </div> </section> <!-- Footer --> <footer class="bg-primary text-white py-12"> <div class="container mx-auto px-4 lg:px-12"> <div class="max-w-4xl mx-auto text-center"> <h3 class="text-2xl font-semibold mb-6">References &amp; Citations</h3> <div class="grid grid-cols-1 md:grid-cols-2 gap-6 text-sm"> <div class="space-y-2"> <p> <a href="https://arxiv.org/html/2510.04136" class="text-blue-200 hover:text-white transition-colors">[1] MoME: Mixture of Matryoshka Experts for Audio-Visual Speech Recognition</a> </p> <p> <a href="https://zhuanlan.zhihu.com/p/1958924498339336955" class="text-blue-200 hover:text-white transition-colors">[2] Meta AI Research 
Overview</a> </p> <p> <a href="https://hkust.edu.hk/news/revolutionizing-breast-cancer-diagnosis-hkust-launches-large-ai-model-mome-testing-over-ten-0" class="text-blue-200 hover:text-white transition-colors">[3] HKUST MOME Breast Cancer Research</a> </p> <p> <a href="https://medium.com/@diwakarkumar_18755/understanding-mixture-of-experts-moe-architecture-in-ai-224e3b3b9243" class="text-blue-200 hover:text-white transition-colors">[4] Understanding Mixture of Experts Architecture</a> </p> </div> <div class="space-y-2"> <p> <a href="https://zhuanlan.zhihu.com/p/1892715722716722662" class="text-blue-200 hover:text-white transition-colors">[5] LLaMA 4 Model Series</a> </p> <p> <a href="https://agilayer.com/mome-transforms-multimodal-language-models/" class="text-blue-200 hover:text-white transition-colors">[6] Multimodal Language Models</a> </p> <p> <a href="https://blog.csdn.net/m0_59163425/article/details/154348069" class="text-blue-200 hover:text-white transition-colors">[7] Matryoshka Representation Learning</a> </p> <p> <a href="https://arxiv.org/html/2509.26520v1" class="text-blue-200 hover:text-white transition-colors">[8] Matryoshka Mixture-of-Experts Research</a> </p> </div> </div> <div class="mt-8 pt-8 border-t border-blue-800"> <p class="text-blue-200">© 2025 AI Research Documentation. 
All rights reserved.</p> </div> </div> </footer> </div> <script> // Initialize Mermaid with enhanced configuration mermaid.initialize({ startOnLoad: true, theme: 'base', themeVariables: { primaryColor: '#1e3a8a', primaryTextColor: '#000000', primaryBorderColor: '#3b82f6', lineColor: '#64748b', secondaryColor: '#f1f5f9', tertiaryColor: '#e2e8f0', background: '#ffffff', mainBkg: '#ffffff', secondBkg: '#f8fafc', tertiaryBkg: '#f1f5f9', // Enhanced contrast settings nodeBkg: '#ffffff', nodeBorder: '#374151', clusterBkg: '#f8fafc', clusterBorder: '#6b7280', defaultLinkColor: '#374151', titleColor: '#111827', edgeLabelBackground: '#ffffff', nodeTextColor: '#111827', // Additional contrast improvements cScale0: '#1e3a8a', cScale1: '#3b82f6', cScale2: '#64748b', cScale3: '#94a3b8', cScale4: '#cbd5e1' }, flowchart: { useMaxWidth: false, htmlLabels: true, curve: 'basis', padding: 20 }, fontSize: 14, fontFamily: 'Inter, sans-serif' }); // Initialize Mermaid Controls for zoom and pan function initializeMermaidControls() { const containers = document.querySelectorAll('.mermaid-container'); containers.forEach(container => { const mermaidElement = container.querySelector('.mermaid'); let scale = 1; let isDragging = false; let startX, startY, translateX = 0, translateY = 0; // Touch interaction state let isTouch = false; let touchStartTime = 0; let initialDistance = 0; let initialScale = 1; let isPinching = false; // Zoom controls const zoomInBtn = container.querySelector('.zoom-in'); const zoomOutBtn = container.querySelector('.zoom-out'); const resetBtn = container.querySelector('.reset-zoom'); const fullscreenBtn = container.querySelector('.fullscreen'); function updateTransform() { mermaidElement.style.transform = `translate(${translateX}px, ${translateY}px) scale(${scale})`; if (scale > 1) { container.classList.add('zoomed'); } else { container.classList.remove('zoomed'); } mermaidElement.style.cursor = isDragging ? 
'grabbing' : 'grab'; } if (zoomInBtn) { zoomInBtn.addEventListener('click', () => { scale = Math.min(scale * 1.25, 4); updateTransform(); }); } if (zoomOutBtn) { zoomOutBtn.addEventListener('click', () => { scale = Math.max(scale / 1.25, 0.3); if (scale <= 1) { translateX = 0; translateY = 0; } updateTransform(); }); } if (resetBtn) { resetBtn.addEventListener('click', () => { scale = 1; translateX = 0; translateY = 0; updateTransform(); }); } if (fullscreenBtn) { fullscreenBtn.addEventListener('click', () => { if (container.requestFullscreen) { container.requestFullscreen(); } else if (container.webkitRequestFullscreen) { container.webkitRequestFullscreen(); } else if (container.msRequestFullscreen) { container.msRequestFullscreen(); } }); } // Mouse Events mermaidElement.addEventListener('mousedown', (e) => { if (isTouch) return; // Ignore mouse events on touch devices isDragging = true; startX = e.clientX - translateX; startY = e.clientY - translateY; mermaidElement.style.cursor = 'grabbing'; updateTransform(); e.preventDefault(); }); document.addEventListener('mousemove', (e) => { if (isDragging && !isTouch) { translateX = e.clientX - startX; translateY = e.clientY - startY; updateTransform(); } }); document.addEventListener('mouseup', () => { if (isDragging && !isTouch) { isDragging = false; mermaidElement.style.cursor = 'grab'; updateTransform(); } }); document.addEventListener('mouseleave', () => { if (isDragging && !isTouch) { isDragging = false; mermaidElement.style.cursor = 'grab'; updateTransform(); } }); // Get the distance between two touch points function getTouchDistance(touch1, touch2) { return Math.hypot( touch2.clientX - touch1.clientX, touch2.clientY - touch1.clientY ); } // Touch Events - touch event handling mermaidElement.addEventListener('touchstart', (e) => { isTouch = true; touchStartTime = Date.now(); if (e.touches.length === 1) { // Single-finger drag isPinching = false; isDragging = true; const touch = e.touches[0]; startX = touch.clientX - translateX; startY = touch.clientY - translateY; } else if (e.touches.length 
=== 2) { // Two-finger pinch zoom isPinching = true; isDragging = false; const touch1 = e.touches[0]; const touch2 = e.touches[1]; initialDistance = getTouchDistance(touch1, touch2); initialScale = scale; } e.preventDefault(); }, { passive: false }); mermaidElement.addEventListener('touchmove', (e) => { if (e.touches.length === 1 && isDragging && !isPinching) { // Single-finger drag const touch = e.touches[0]; translateX = touch.clientX - startX; translateY = touch.clientY - startY; updateTransform(); } else if (e.touches.length === 2 && isPinching) { // Two-finger pinch zoom const touch1 = e.touches[0]; const touch2 = e.touches[1]; const currentDistance = getTouchDistance(touch1, touch2); if (initialDistance > 0) { const newScale = Math.min(Math.max( initialScale * (currentDistance / initialDistance), 0.3 ), 4); scale = newScale; updateTransform(); } } e.preventDefault(); }, { passive: false }); mermaidElement.addEventListener('touchend', (e) => { // Reset state if (e.touches.length === 0) { isDragging = false; isPinching = false; initialDistance = 0; // Delay resetting isTouch so mouse events do not fire immediately setTimeout(() => { isTouch = false; }, 100); } else if (e.touches.length === 1 && isPinching) { // Pinch ended with one finger still down; switch to drag mode isPinching = false; isDragging = true; const touch = e.touches[0]; startX = touch.clientX - translateX; startY = touch.clientY - translateY; } updateTransform(); }); mermaidElement.addEventListener('touchcancel', (e) => { isDragging = false; isPinching = false; initialDistance = 0; setTimeout(() => { isTouch = false; }, 100); updateTransform(); }); // Enhanced wheel zoom with better center point handling container.addEventListener('wheel', (e) => { e.preventDefault(); const rect = container.getBoundingClientRect(); const centerX = rect.width / 2; const centerY = rect.height / 2; const delta = e.deltaY > 0 ? 
0.9 : 1.1; const newScale = Math.min(Math.max(scale * delta, 0.3), 4); // Adjust translation to zoom towards center if (newScale !== scale) { const scaleDiff = newScale / scale; translateX = translateX * scaleDiff; translateY = translateY * scaleDiff; scale = newScale; if (scale <= 1) { translateX = 0; translateY = 0; } updateTransform(); } }); // Initialize display updateTransform(); }); } // Initialize the controls when the DOM is loaded document.addEventListener('DOMContentLoaded', function() { initializeMermaidControls(); }); // Toggle TOC on mobile const tocToggle = document.getElementById('toc-toggle'); const tocClose = document.getElementById('toc-close'); const tocNav = document.getElementById('toc-nav'); tocToggle.addEventListener('click', () => { tocNav.classList.remove('-translate-x-full'); }); tocClose.addEventListener('click', () => { tocNav.classList.add('-translate-x-full'); }); // Smooth scrolling for anchor links document.querySelectorAll('a[href^="#"]').forEach(anchor => { anchor.addEventListener('click', function (e) { e.preventDefault(); const target = document.querySelector(this.getAttribute('href')); if (target) { target.scrollIntoView({ behavior: 'smooth', block: 'start' }); // Close TOC on mobile after clicking a link if (window.innerWidth < 1024) { tocNav.classList.add('-translate-x-full'); } } }); }); // Highlight active section in TOC const sections = document.querySelectorAll('section[id]'); const navLinks = document.querySelectorAll('nav a[href^="#"]'); function highlightActiveSection() { let current = ''; sections.forEach(section => { const sectionTop = section.offsetTop; const sectionHeight = section.clientHeight; if (scrollY >= (sectionTop - 200)) { current = section.getAttribute('id'); } }); navLinks.forEach(link => { link.classList.remove('bg-primary', 'text-white'); if (link.getAttribute('href') === '#' + current) { link.classList.add('bg-primary', 'text-white'); } }); } window.addEventListener('scroll', highlightActiveSection); 
highlightActiveSection(); // Initial call </script> </body></html>
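The five-step routing process described in the "Routing Process" list can be sketched in a few lines of NumPy. This is a minimal, hypothetical illustration: the dimensions, the 4-expert/top-2 configuration, and the random weights are invented for demonstration and do not correspond to any production MoE implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 4 experts, pick the top-2 per input (all sizes are illustrative).
d_in, d_out, n_experts, top_k = 8, 4, 4, 2

W_gate = rng.normal(size=(d_in, n_experts))            # gating network
W_experts = rng.normal(size=(n_experts, d_in, d_out))  # one weight matrix per expert

def moe_forward(x):
    """Route one input vector through its top-k experts."""
    # Steps 1-2: the gating network produces a relevance score per expert.
    logits = x @ W_gate
    # Step 3: keep only the top-k experts by score.
    top = np.argsort(logits)[-top_k:]
    # Softmax over the selected scores gives the combination weights.
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()
    # Steps 4-5: only the selected experts run; outputs are combined
    # by weighted sum. The unselected experts stay inactive.
    return sum(wi * (x @ W_experts[i]) for wi, i in zip(w, top))

y = moe_forward(rng.normal(size=d_in))
print(y.shape)  # (4,)
```

Only `top_k` of the `n_experts` weight matrices are touched per input, which is the source of the capacity/compute decoupling the article describes.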
