MGPUSim and the Akita Framework

Unknown user (QianXun) • November 20, 2025, 09:55
                        <!DOCTYPE html><html lang="zh-CN"><head>
    <meta charset="utf-8"/>
    <meta content="width=device-width, initial-scale=1.0" name="viewport"/>
    <title>MGPUSim and the Akita Framework in Depth: Multi-GPU Interconnect Architecture and Analysis</title>
    <script src="https://cdn.tailwindcss.com"></script>
    <script>
        tailwind.config = {
            theme: {
                extend: {
                    colors: {
                        primary: '#1a237e',
                        secondary: '#3949ab',
                        accent: '#5c6bc0',
                        neutral: '#374151',
                        'base-100': '#ffffff',
                        'base-200': '#f8fafc',
                        'base-300': '#e2e8f0'
                    },
                    fontFamily: {
                        'serif': ['Playfair Display', 'serif'],
                        'sans': ['Inter', 'sans-serif']
                    }
                }
            }
        }
    </script>
    <link href="https://fonts.googleapis.com/css2?family=Playfair+Display:ital,wght@0,400;0,600;1,400&amp;family=Inter:wght@300;400;500;600;700&amp;display=swap" rel="stylesheet"/>
    <link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.0/css/all.min.css" rel="stylesheet"/>
    <script src="https://cdn.jsdelivr.net/npm/mermaid/dist/mermaid.min.js"></script>
    <style>
        .gradient-overlay {
            background: linear-gradient(135deg, rgba(26, 35, 126, 0.9) 0%, rgba(57, 73, 171, 0.8) 100%);
        }
        
        .text-shadow {
            text-shadow: 0 2px 4px rgba(0, 0, 0, 0.3);
        }
        
        .backdrop-blur {
            backdrop-filter: blur(8px);
        }
        
        .citation-link {
            color: #5c6bc0;
            text-decoration: none;
            font-weight: 500;
            transition: color 0.2s ease;
        }
        
        .citation-link:hover {
            color: #3949ab;
            text-decoration: underline;
        }
        
        .toc-link {
            transition: all 0.2s ease;
        }
        
        .toc-link:hover {
            background-color: rgba(92, 107, 192, 0.1);
            border-left: 3px solid #5c6bc0;
            padding-left: 1.5rem;
        }
        
        .section-divider {
            background: linear-gradient(90deg, transparent 0%, #e2e8f0 50%, transparent 100%);
            height: 1px;
        }
        
        .bento-grid {
            display: grid;
            grid-template-columns: 2fr 1fr;
            grid-template-rows: auto auto;
            gap: 1.5rem;
            height: auto;
        }
        
        .bento-main {
            grid-row: 1 / 3;
        }
        
        .mermaid-container {
            display: flex;
            justify-content: center;
            min-height: 300px;
            max-height: 800px;
            background: #ffffff;
            border: 2px solid #e5e7eb;
            border-radius: 12px;
            padding: 30px;
            margin: 30px 0;
            box-shadow: 0 8px 25px rgba(0, 0, 0, 0.08);
            position: relative;
            overflow: hidden;
        }
        
        .mermaid-container .mermaid {
            width: 100%;
            max-width: 100%;
            height: 100%;
            cursor: grab;
            transition: transform 0.3s ease;
            transform-origin: center center;
            display: flex;
            justify-content: center;
            align-items: center;
            touch-action: none;
            -webkit-user-select: none;
            -moz-user-select: none;
            -ms-user-select: none;
            user-select: none;
        }
        
        .mermaid-container .mermaid svg {
            max-width: 100%;
            height: 100%;
            display: block;
            margin: 0 auto;
        }
        
        .mermaid-container .mermaid:active {
            cursor: grabbing;
        }
        
        .mermaid-container.zoomed .mermaid {
            height: 100%;
            width: 100%;
            cursor: grab;
        }
        
        .mermaid-controls {
            position: absolute;
            top: 15px;
            right: 15px;
            display: flex;
            gap: 10px;
            z-index: 20;
            background: rgba(255, 255, 255, 0.95);
            padding: 8px;
            border-radius: 8px;
            box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
        }
        
        .mermaid-control-btn {
            background: #ffffff;
            border: 1px solid #d1d5db;
            border-radius: 6px;
            padding: 10px;
            cursor: pointer;
            transition: all 0.2s ease;
            color: #374151;
            font-size: 14px;
            min-width: 36px;
            height: 36px;
            text-align: center;
            display: flex;
            align-items: center;
            justify-content: center;
        }
        
        .mermaid-control-btn:hover {
            background: #f8fafc;
            border-color: #3b82f6;
            color: #3b82f6;
            transform: translateY(-1px);
        }
        
        .mermaid-control-btn:active {
            transform: scale(0.95);
        }
        
        /* Ensure proper text contrast in mermaid diagrams */
        .mermaid .node rect,
        .mermaid .node circle,
        .mermaid .node ellipse,
        .mermaid .node polygon {
            stroke: #1a237e !important;
            stroke-width: 2px !important;
        }
        
        .mermaid .node .label {
            color: #1a237e !important;
            font-weight: 600 !important;
            font-size: 14px !important;
            text-shadow: 0 1px 2px rgba(255, 255, 255, 0.8) !important;
        }
        
        .mermaid .edgePath .path {
            stroke: #3949ab !important;
            stroke-width: 2px !important;
        }
        
        .mermaid .edgeLabel {
            background-color: #ffffff !important;
            color: #1a237e !important;
            font-weight: 500 !important;
            border: 1px solid #e2e8f0 !important;
            border-radius: 4px !important;
            padding: 2px 6px !important;
        }
        
        /* Ensure high contrast for different node colors */
        .mermaid .node.primary rect,
        .mermaid .node.primary circle,
        .mermaid .node.primary polygon {
            fill: #1a237e !important;
        }
        
        .mermaid .node.primary .label {
            color: #ffffff !important;
            text-shadow: 0 1px 2px rgba(0, 0, 0, 0.5) !important;
        }
        
        .mermaid .node.secondary rect,
        .mermaid .node.secondary circle,
        .mermaid .node.secondary polygon {
            fill: #3949ab !important;
        }
        
        .mermaid .node.secondary .label {
            color: #ffffff !important;
            text-shadow: 0 1px 2px rgba(0, 0, 0, 0.5) !important;
        }
        
        .mermaid .node.accent rect,
        .mermaid .node.accent circle,
        .mermaid .node.accent polygon {
            fill: #5c6bc0 !important;
        }
        
        .mermaid .node.accent .label {
            color: #ffffff !important;
            text-shadow: 0 1px 2px rgba(0, 0, 0, 0.5) !important;
        }
        
        .mermaid .node.neutral rect,
        .mermaid .node.neutral circle,
        .mermaid .node.neutral polygon {
            fill: #374151 !important;
        }
        
        .mermaid .node.neutral .label {
            color: #ffffff !important;
            text-shadow: 0 1px 2px rgba(0, 0, 0, 0.5) !important;
        }
        
        .mermaid .node.basewhite rect,
        .mermaid .node.basewhite circle,
        .mermaid .node.basewhite polygon {
            fill: #ffffff !important;
            stroke: #1a237e !important;
            stroke-width: 2px !important;
        }
        
        .mermaid .node.basewhite .label {
            color: #1a237e !important;
            text-shadow: 0 1px 2px rgba(255, 255, 255, 0.8) !important;
        }
        
        .mermaid .node.basewhite {
            fill: #ffffff !important;
            stroke: #1a237e !important;
            stroke-width: 2px !important;
        }
        
        .mermaid .node.basewhite text {
            fill: #1a237e !important;
            font-weight: 600 !important;
            text-shadow: 0 1px 2px rgba(255, 255, 255, 0.8) !important;
        }

        /* Responsive styles for mermaid controls */
        @media (max-width: 1024px) {
            .mermaid-control-btn:not(.reset-zoom) {
                display: none;
            }
            .mermaid-controls {
                top: auto;
                bottom: 15px;
                right: 15px;
            }
        }
    </style>
  <base target="_blank">
</head>

  <body class="font-sans bg-base-100 text-neutral leading-relaxed">
    <!-- Fixed Table of Contents -->
    <div class="fixed left-0 top-0 h-full w-80 bg-base-200 border-r border-base-300 z-40 overflow-y-auto hidden md:block">
      <div class="p-6">
        <h3 class="font-serif text-lg font-semibold text-primary mb-4">Contents</h3>
        <nav class="space-y-2">
          <a class="toc-link block py-2 px-4 text-sm rounded-lg" href="#overview">1. Framework Overview</a>
          <div class="ml-4 space-y-1">
            <a class="toc-link block py-1 px-3 text-xs text-neutral/70" href="#mgpusim-overview">1.1 The MGPUSim Simulator</a>
            <a class="toc-link block py-1 px-3 text-xs text-neutral/70" href="#akita-overview">1.2 The Akita Framework</a>
            <a class="toc-link block py-1 px-3 text-xs text-neutral/70" href="#relationship">1.3 How They Relate</a>
          </div>
          <a class="toc-link block py-2 px-4 text-sm rounded-lg" href="#interconnect">2. Multi-GPU Interconnect Architecture</a>
          <div class="ml-4 space-y-1">
            <a class="toc-link block py-1 px-3 text-xs text-neutral/70" href="#akita-network">2.1 Akita Network Model</a>
            <a class="toc-link block py-1 px-3 text-xs text-neutral/70" href="#mgpusim-impl">2.2 MGPUSim Implementation</a>
          </div>
          <a class="toc-link block py-2 px-4 text-sm rounded-lg" href="#performance">3. Performance Modeling and Optimization</a>
          <div class="ml-4 space-y-1">
            <a class="toc-link block py-1 px-3 text-xs text-neutral/70" href="#modeling">3.1 Performance Modeling</a>
            <a class="toc-link block py-1 px-3 text-xs text-neutral/70" href="#locality-api">3.2 Locality API</a>
            <a class="toc-link block py-1 px-3 text-xs text-neutral/70" href="#pasi">3.3 PASI Strategy</a>
          </div>
          <a class="toc-link block py-2 px-4 text-sm rounded-lg" href="#applications">4. Application Scenarios</a>
          <div class="ml-4 space-y-1">
            <a class="toc-link block py-1 px-3 text-xs text-neutral/70" href="#use-cases">4.1 Typical Applications</a>
            <a class="toc-link block py-1 px-3 text-xs text-neutral/70" href="#evaluation">4.2 Evaluation Methods</a>
          </div>
        </nav>
      </div>
    </div>

    <!-- Main Content -->
    <div class="ml-0 md:ml-80 min-h-screen">
      <!-- Hero Section with Bento Grid -->
      <section class="relative bg-gradient-to-br from-slate-50 to-blue-50 py-16">
        <div class="max-w-7xl mx-auto px-6">
          <div class="bento-grid">
            <!-- Main Content Area -->
            <div class="bento-main relative">
              <div class="relative h-96 rounded-2xl overflow-hidden shadow-2xl">
                <img alt="Multi-GPU interconnect network on an abstract technology background" class="w-full h-full object-cover" src="https://kimi-web-img.moonshot.cn/img/pic2.zhimg.com/6a5bc500908866f3f95ffd8ffd04dce788e0f3c8.jpg" size="wallpaper" aspect="wide" query="抽象技术背景 多GPU互连网络" referrerpolicy="no-referrer" data-modified="1" data-score="0.00"/>
                <div class="absolute inset-0 gradient-overlay"></div>
                <div class="absolute inset-0 flex items-center justify-center p-8">
                  <div class="text-center text-white">
                    <h1 class="font-serif text-4xl md:text-5xl font-bold mb-6 text-shadow leading-tight">
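                      <em>MGPUSim and the Akita Framework</em>
                      <br/>
                      <span class="text-2xl md:text-3xl font-normal">A Close Look at Multi-GPU Interconnect Architecture</span>
                    </h1>
                    <p class="text-xl opacity-90 text-shadow">
                      A high-performance simulator-building engine combined with accurate multi-GPU system modeling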
                    </p>
                  </div>
                </div>
              </div>
            </div>

            <!-- Side Panels -->
            <div class="space-y-6">
              <div class="bg-white rounded-xl p-6 shadow-lg border border-base-300">
                <div class="flex items-center mb-4">
                  <i class="fas fa-microchip text-primary text-xl mr-3"></i>
                  <h3 class="font-semibold text-lg">Key Features</h3>
                </div>
                <ul class="space-y-2 text-sm">
                  <li class="flex items-center">
                    <i class="fas fa-check-circle text-green-500 mr-2"></i>
                    <span>High accuracy: 5.5% average simulation error</span>
                  </li>
                  <li class="flex items-center">
                    <i class="fas fa-check-circle text-green-500 mr-2"></i>
                    <span>Parallel simulation speedup of 2.5× to 3.5×</span>
                  </li>
                  <li class="flex items-center">
                    <i class="fas fa-check-circle text-green-500 mr-2"></i>
                    <span>Supports PCIe and NVLink interconnects</span>
                  </li>
                </ul>
              </div>

              <div class="bg-white rounded-xl p-6 shadow-lg border border-base-300">
                <div class="flex items-center mb-4">
                  <i class="fas fa-chart-line text-primary text-xl mr-3"></i>
                  <h3 class="font-semibold text-lg">Performance Gains</h3>
                </div>
                <div class="space-y-3">
                  <div>
                    <div class="flex justify-between text-sm mb-1">
                      <span>Locality API</span>
                      <span class="font-semibold">1.6×</span>
                    </div>
                    <div class="w-full bg-gray-200 rounded-full h-2">
                      <div class="bg-primary h-2 rounded-full" style="width: 60%"></div>
                    </div>
                  </div>
                  <div>
                    <div class="flex justify-between text-sm mb-1">
                      <span>PASI optimization</span>
                      <span class="font-semibold">2.6×</span>
                    </div>
                    <div class="w-full bg-gray-200 rounded-full h-2">
                      <div class="bg-secondary h-2 rounded-full" style="width: 80%"></div>
                    </div>
                  </div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </section>

      <!-- Main Article Content -->
      <article class="max-w-5xl mx-auto px-6 py-12">
        <!-- Introduction -->
        <div class="prose prose-lg max-w-none mb-16">
          <div class="bg-blue-50 border-l-4 border-primary p-6 rounded-r-lg mb-8">
            <p class="text-lg leading-relaxed text-primary font-medium">
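              MGPUSim and the Akita framework give computer architecture researchers a platform for simulating and analyzing multi-GPU systems. Akita is a general-purpose simulator-building engine, and MGPUSim is a dedicated simulator built on top of it; together they form a capable and flexible research platform.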
            </p>
          </div>
        </div>

        <!-- Section 1: Overview -->
        <section class="mb-16" id="overview">
          <header class="mb-8">
            <h2 class="font-serif text-3xl font-bold text-primary mb-4">1. Overview of MGPUSim and the Akita Framework</h2>
            <div class="section-divider mb-6"></div>
          </header>

          <div class="prose prose-lg max-w-none">
            <p class="text-lg leading-relaxed mb-6">
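              As data-parallel workloads grow in scale and complexity, single-GPU platforms can no longer meet the compute demands of high-performance computing (HPC). Multi-GPU systems, which aggregate the compute capability and memory capacity of several GPUs, have become a mainstream solution for HPC now and for the foreseeable future. That added complexity, however, raises challenges in microarchitecture design, interconnect fabrics, runtime libraries, and programming models.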
            </p>
          </div>

          <!-- MGPUSim Overview -->
          <div class="mb-12" id="mgpusim-overview">
            <h3 class="font-serif text-2xl font-semibold text-secondary mb-6">1.1 MGPUSim: A Multi-GPU Simulator for the AMD GCN3 Architecture</h3>

            <div class="grid md:grid-cols-2 gap-8 mb-8">
              <div>
                <p class="mb-4">
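                  MGPUSim is an open-source, flexible, high-performance multi-GPU simulator that models GPUs based on the AMD Graphics Core Next 3 (GCN3) instruction set architecture <a class="citation-link" href="https://ieeexplore.ieee.org/document/8980359">[116]</a>
                  <a class="citation-link" href="https://dl.acm.org/doi/10.1145/3307650.3322230">[118]</a>. It is written in Go and aims to give computer architecture researchers a platform for fast, parallel, and accurate multi-GPU system simulation.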
                </p>

                <div class="bg-gray-50 p-4 rounded-lg mb-4">
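                  <h4 class="font-semibold mb-2">Key Features</h4>
                  <ul class="space-y-1 text-sm">
                    <li>• <strong>High flexibility</strong>: different multi-GPU system architectures are easy to configure</li>
                    <li>• <strong>High performance</strong>: supports multi-threaded parallel simulation</li>
                    <li>• <strong>High accuracy</strong>: results closely match real GPU hardware</li>
                  </ul>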
                </div>
              </div>

              <div>
                <img alt="AMD GCN3 GPU architecture diagram" class="w-full h-64 object-cover rounded-lg shadow-md" src="https://kimi-web-img.moonshot.cn/img/miro.medium.com/e0b7b5b3e4aace403a0cd6d172b4ad1a90ed7ab7.png" size="medium" aspect="wide" style="photo" query="AMD GCN3 GPU架构" referrerpolicy="no-referrer" data-modified="1" data-score="0.00"/>
              </div>
            </div>

            <div class="bg-blue-50 border border-blue-200 rounded-lg p-6 mb-6">
              <h4 class="font-semibold text-primary mb-3">Validation Results</h4>
              <div class="grid md:grid-cols-3 gap-4">
                <div class="text-center">
                  <div class="text-2xl font-bold text-primary">5.5%</div>
                  <div class="text-sm text-gray-600">Average simulation error</div>
                </div>
                <div class="text-center">
                  <div class="text-2xl font-bold text-secondary">3.5×</div>
                  <div class="text-sm text-gray-600">Functional emulation speedup</div>
                </div>
                <div class="text-center">
                  <div class="text-2xl font-bold text-accent">2.5×</div>
                  <div class="text-sm text-gray-600">Timing simulation speedup</div>
                </div>
              </div>
            </div>
          </div>

          <!-- Akita Overview -->
          <div class="mb-12" id="akita-overview">
            <h3 class="font-serif text-2xl font-semibold text-secondary mb-6">1.2 Akita: A Next-Generation Computer Architecture Simulation Framework</h3>

            <p class="mb-6">
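              Akita is a framework for building next-generation high-performance, flexible computer architecture simulators, with particular attention to the developer experience <a class="citation-link" href="https://sarchlab.org/akita">[5]</a>
              <a class="citation-link" href="https://sarchlab.org/akita">[150]</a>. Akita is not a single simulator but a general-purpose engine for building many kinds of computer architecture simulators.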
            </p>

            <!-- Akita Framework Features -->
            <div class="bg-white rounded-xl shadow-lg p-6 mb-8 border border-base-300">
              <h4 class="font-semibold mb-4 flex items-center">
                <i class="fas fa-cogs text-primary mr-3"></i>
                Core Components of the Akita Framework
              </h4>
              <div class="grid md:grid-cols-2 gap-6">
                <div>
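                  <h5 class="font-medium mb-2">Basic hardware model library</h5>
                  <ul class="text-sm space-y-1">
                    <li>• Cache models (write-through and write-back policies)</li>
                    <li>• TLBs and memory controllers</li>
                    <li>• NoC (network-on-chip) components</li>
                    <li>• Switches and interconnect links</li>
                  </ul>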
                </div>
                <div>
                  <h5 class="font-medium mb-2">Developer tools</h5>
                  <ul class="text-sm space-y-1">
                    <li>• <strong>Daisen</strong>: web-based visualization tool</li>
                    <li>• <strong>AkitaRTM</strong>: real-time monitor</li>
                    <li>• <strong>ArchSim</strong>: simulation-as-a-service platform</li>
                  </ul>
                </div>
              </div>
            </div>
          </div>

          <!-- Relationship -->
          <div class="mb-12" id="relationship">
            <h3 class="font-serif text-2xl font-semibold text-secondary mb-6">1.3 Relationship: MGPUSim as an Instantiation of the Akita Framework</h3>

            <p class="mb-6">
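              MGPUSim relates to Akita as <strong>an instance relates to its platform, or an application to its engine</strong>. MGPUSim was not developed from scratch as a standalone simulator; it is a specific instance, built on the Akita simulation framework, dedicated to multi-GPU system research <a class="citation-link" href="https://sites.google.com/view/jlabellan/tutorial-mgpusim">[148]</a>.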
            </p>

            <!-- Comparison Table -->
            <div class="overflow-x-auto mb-8">
              <table class="w-full bg-white rounded-lg shadow-md border border-base-300">
                <thead class="bg-primary text-white">
                  <tr>
                    <th class="px-6 py-3 text-left font-semibold">Aspect</th>
                    <th class="px-6 py-3 text-left font-semibold">MGPUSim</th>
                    <th class="px-6 py-3 text-left font-semibold">Akita framework</th>
                  </tr>
                </thead>
                <tbody class="divide-y divide-base-300">
                  <tr>
                    <td class="px-6 py-3 font-medium">Positioning</td>
                    <td class="px-6 py-3"><strong>Multi-GPU simulator</strong> targeting the AMD GCN3 architecture</td>
                    <td class="px-6 py-3"><strong>Simulator-building engine</strong> for creating simulators of various architectures</td>
                  </tr>
                  <tr class="bg-base-200">
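                    <td class="px-6 py-3 font-medium">Core functionality</td>
                    <td class="px-6 py-3">Simulates multi-GPU systems, supports the GCN3 ISA, and provides optimization strategies (Locality API, PASI)</td>
                    <td class="px-6 py-3">Provides an event-driven simulation core, a modular component library, and development tools (Daisen, AkitaRTM)</td>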
                  </tr>
                  <tr>
                    <td class="px-6 py-3 font-medium">Architectural dependency</td>
                    <td class="px-6 py-3"><strong>Built on the Akita framework</strong>
                      <a class="citation-link" href="https://sarchlab.org/akita">[5]</a>
                    </td>
                    <td class="px-6 py-3">Standalone underlying framework</td>
                  </tr>
                </tbody>
              </table>
            </div>

            <!-- Architecture Diagram -->
            <div class="bg-gray-50 p-6 rounded-lg">
              <h4 class="font-semibold mb-4">Architectural Dependencies</h4>
              <div class="mermaid-container">
                <div class="mermaid-controls">
                  <button class="mermaid-control-btn zoom-in" title="Zoom in">
                    <i class="fas fa-search-plus"></i>
                  </button>
                  <button class="mermaid-control-btn zoom-out" title="Zoom out">
                    <i class="fas fa-search-minus"></i>
                  </button>
                  <button class="mermaid-control-btn reset-zoom" title="Reset">
                    <i class="fas fa-expand-arrows-alt"></i>
                  </button>
                  <button class="mermaid-control-btn fullscreen" title="View fullscreen">
                    <i class="fas fa-expand"></i>
                  </button>
                </div>
                <div id="architecture-diagram" class="mermaid">
                  graph TB
                  A[&#34;Akita Framework&#34;] --&gt; B[&#34;Event-driven Simulation Core&#34;]
                  A --&gt; C[&#34;Modular Component Library&#34;]
                  A --&gt; D[&#34;Development Tools&#34;]

                  C --&gt; E[&#34;Network Models&#34;]
                  C --&gt; F[&#34;Cache Models&#34;]
                  C --&gt; G[&#34;Memory Controllers&#34;]

                  E --&gt; H[&#34;PCIe&#34;]
                  E --&gt; I[&#34;NVLink&#34;]

                  M[&#34;MGPUSim&#34;] --&gt; J[&#34;GCN3 GPU Models&#34;]
                  M --&gt; K[&#34;Multi-GPU Interconnect&#34;]
                  M --&gt; L[&#34;Optimization Strategies&#34;]

                  B --&gt; M
                  E --&gt; M
                  F --&gt; M
                  G --&gt; M

                  style A fill:#1a237e,color:#fff,stroke:#3949ab,stroke-width:2px
                  style M fill:#3949ab,color:#fff,stroke:#5c6bc0,stroke-width:2px
                  style B fill:#f8fafc,color:#1a237e,stroke:#5c6bc0,stroke-width:2px
                  style C fill:#f8fafc,color:#1a237e,stroke:#5c6bc0,stroke-width:2px
                  style D fill:#f8fafc,color:#1a237e,stroke:#5c6bc0,stroke-width:2px
                  style E fill:#e2e8f0,color:#1a237e,stroke:#5c6bc0,stroke-width:2px
                  style F fill:#e2e8f0,color:#1a237e,stroke:#5c6bc0,stroke-width:2px
                  style G fill:#e2e8f0,color:#1a237e,stroke:#5c6bc0,stroke-width:2px
                  style H fill:#ffffff,color:#1a237e,stroke:#5c6bc0,stroke-width:2px
                  style I fill:#ffffff,color:#1a237e,stroke:#5c6bc0,stroke-width:2px
                  style J fill:#ffffff,color:#1a237e,stroke:#5c6bc0,stroke-width:2px
                  style K fill:#ffffff,color:#1a237e,stroke:#5c6bc0,stroke-width:2px
                  style L fill:#ffffff,color:#1a237e,stroke:#5c6bc0,stroke-width:2px
                </div>
              </div>
            </div>
          </div>
        </section>

        <!-- Section 2: Interconnect Architecture -->
        <section class="mb-16" id="interconnect">
          <header class="mb-8">
            <h2 class="font-serif text-3xl font-bold text-primary mb-4">2. Architecture and Implementation of Multi-GPU Interconnects</h2>
            <div class="section-divider mb-6"></div>
          </header>

          <p class="text-lg leading-relaxed mb-8">
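            The performance of a multi-GPU system depends heavily on the efficiency of its interconnect network. A well-designed interconnect minimizes inter-GPU communication latency and bandwidth bottlenecks, which lets the system exploit its parallel computing potential. MGPUSim and Akita are designed with the complexity and diversity of multi-GPU interconnects in mind and provide a general, flexible, high-fidelity network modeling solution.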
          </p>

          <!-- Akita Network Model -->
          <div class="mb-12" id="akita-network">
            <h3 class="font-serif text-2xl font-semibold text-secondary mb-6">2.1 Akita's General-Purpose Network Model</h3>

            <div class="grid md:grid-cols-3 gap-6 mb-8">
              <div class="bg-white rounded-lg shadow-md p-6 border border-base-300">
                <div class="text-center mb-4">
                  <i class="fas fa-network-wired text-3xl text-primary mb-2"></i>
                  <h4 class="font-semibold">Design Philosophy</h4>
                </div>
                <p class="text-sm text-gray-600">
                  Models multiple interconnect types (PCIe, NVLink, and others) for generality and extensibility <a class="citation-link" href="https://medium.com/@_syifan_/akita-now-supports-network-modeling-9825fdaed5e8">[66]</a>
                </p>
              </div>

              <div class="bg-white rounded-lg shadow-md p-6 border border-base-300">
                <div class="text-center mb-4">
                  <i class="fas fa-clock text-3xl text-secondary mb-2"></i>
                  <h4 class="font-semibold">Event Handling</h4>
                </div>
                <p class="text-sm text-gray-600">
                  Events are classified, and primary events are processed first, to keep the simulation correct
                </p>
              </div>

              <div class="bg-white rounded-lg shadow-md p-6 border border-base-300">
                <div class="text-center mb-4">
                  <i class="fas fa-sitemap text-3xl text-accent mb-2"></i>
                  <h4 class="font-semibold">Topology Support</h4>
                </div>
                <p class="text-sm text-gray-600">
                  Supports tree, bus, star, mesh, and other network topologies
                </p>
              </div>
            </div>
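            <div class="bg-white rounded-lg shadow-md p-6 mb-8 border border-base-300">
              <p class="text-sm text-gray-600 mb-3">
                The primary-first rule from the event-handling card can be illustrated with a minimal Go sketch. It assumes events are ordered by simulated time and that, at equal time, primary events run before all others. The <code>event</code> type, its fields, and the <code>order</code> function are hypothetical names introduced for illustration; they are not Akita's actual API.
              </p>
<pre class="bg-gray-50 p-4 rounded-lg text-xs overflow-x-auto">
```go
package main

import (
	"fmt"
	"sort"
)

// event is a hypothetical, simplified stand-in for a simulator event.
type event struct {
	time      float64 // simulated time at which the event fires
	secondary bool    // false = primary event, true = non-primary event
	name      string
}

// rank maps primary events to 0 and all other events to 1.
func rank(e event) int {
	if e.secondary {
		return 1
	}
	return 0
}

// order returns a copy of events sorted by time; events that share a
// timestamp are ordered so that primary events come first.
func order(events []event) []event {
	sorted := append([]event(nil), events...)
	sort.SliceStable(sorted, func(i, j int) bool {
		a, b := sorted[i], sorted[j]
		if a.time != b.time {
			return b.time > a.time
		}
		return rank(b) > rank(a)
	})
	return sorted
}

func main() {
	queue := []event{
		{time: 1.0, secondary: true, name: "update-stats"},
		{time: 1.0, secondary: false, name: "deliver-flit"},
		{time: 0.5, secondary: true, name: "early-tick"},
	}
	for _, e := range order(queue) {
		fmt.Println(e.time, e.name)
	}
}
```
</pre>
              <p class="text-sm text-gray-600 mt-3">
                Sorting on the pair (time, class) keeps the example short; an actual engine would more likely keep events in a priority queue and apply the same comparison when popping.
              </p>
            </div>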

            <!-- Network Components -->
            <div class="bg-white rounded-xl shadow-lg p-6 mb-8 border border-base-300">
              <h4 class="font-semibold mb-4 flex items-center">
                <i class="fas fa-puzzle-piece text-primary mr-3"></i>
                Core Network Components
              </h4>

              <div class="grid md:grid-cols-2 gap-6">
                <div class="border-l-4 border-primary pl-4">
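                  <h5 class="font-medium mb-2">Endpoint</h5>
                  <p class="text-sm text-gray-600 mb-2">
                    The interface through which a device attaches to the network; it sends and receives messages and handles their fragmentation.
                  </p>
                  <ul class="text-xs space-y-1">
                    <li>• Collects messages from device ports</li>
                    <li>• Splits them into flits, the unit of transfer</li>
                    <li>• Reassembles received flits</li>
                  </ul>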
                </div>

                <div class="border-l-4 border-secondary pl-4">
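                  <h5 class="font-medium mb-2">Switch</h5>
                  <p class="text-sm text-gray-600 mb-2">
                    A routing and forwarding node in the network; it forwards flits and arbitrates conflicts.
                  </p>
                  <ul class="text-xs space-y-1">
                    <li>• A routing algorithm determines the forwarding path</li>
                    <li>• An arbitration algorithm resolves port conflicts</li>
                    <li>• Supports configuration of various network characteristics</li>
                  </ul>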
                </div>
              </div>
            </div>
          </div>

          <!-- MGPUSim Implementation -->
          <div class="mb-12" id="mgpusim-impl">
            <h3 class="font-serif text-2xl font-semibold text-secondary mb-6">2.2 Multi-GPU Interconnect Implementation in MGPUSim</h3>

            <div class="grid md:grid-cols-2 gap-8 mb-8">
              <div>
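                <h4 class="font-semibold mb-4">Integrating the Akita Network Model</h4>
                <p class="mb-4">
                  MGPUSim calls the network model API provided by Akita directly to achieve <strong>cycle-level simulation</strong> of the multi-GPU interconnect. The simulator can therefore track and time the transfer of every packet through the network.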
                </p>

                <div class="bg-blue-50 p-4 rounded-lg">
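                  <h5 class="font-medium mb-2">Key Advantages</h5>
                  <ul class="text-sm space-y-1">
                    <li>• Accurate calculation of communication latency</li>
                    <li>• Faithful reflection of NUMA effects</li>
                    <li>• Support for complex network topologies</li>
                  </ul>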
                </div>
              </div>

              <div>
                <img alt="Diagram of a multi-GPU system interconnect architecture" class="w-full h-64 object-cover rounded-lg shadow-md" src="https://kimi-web-img.moonshot.cn/img/pica.zhimg.com/03bed55e921fb3a3746f0f3cd072b64112ca2b9c.jpg" size="medium" aspect="wide" query="多GPU互连架构" referrerpolicy="no-referrer" data-modified="1" data-score="0.00"/>
              </div>
            </div>

            <!-- Implementation Details -->
            <div class="space-y-6">
              <div class="bg-white rounded-lg shadow-md p-6 border border-base-300">
                <h5 class="font-semibold mb-3 flex items-center">
                  <i class="fas fa-layer-group text-primary mr-2"></i>
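                  Endpoint implementation: message fragmentation and reassembly
                </h5>
                <p class="text-sm text-gray-600 mb-3">
                  The endpoint bridges a GPU and the interconnect network and handles all network traffic entering and leaving the GPU. It splits messages into flits for transmission and reassembles complete messages at the receiving end.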
                </p>
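                <p class="text-sm text-gray-600 mb-3">
                  The split-and-reassemble behavior described here can be sketched in Go as follows. This is an illustration under stated assumptions, not MGPUSim's or Akita's code: the <code>flit</code> type, its fields, and the <code>split</code> and <code>reassemble</code> functions are hypothetical, and the flit size is an arbitrary parameter that must be positive.
                </p>
<pre class="bg-gray-50 p-4 rounded-lg text-xs overflow-x-auto">
```go
package main

import "fmt"

// flit is a hypothetical transfer unit carrying one slice of a message.
type flit struct {
	msgID   int
	seq     int // position of this flit within the message
	total   int // number of flits that make up the message
	payload []byte
}

// split cuts data into flits of at most flitBytes bytes each.
// flitBytes is assumed to be positive.
func split(msgID int, data []byte, flitBytes int) []flit {
	total := (len(data) + flitBytes - 1) / flitBytes
	flits := make([]flit, 0, total)
	for seq := 0; seq != total; seq++ {
		start := seq * flitBytes
		end := start + flitBytes
		if end > len(data) {
			end = len(data)
		}
		flits = append(flits, flit{
			msgID: msgID, seq: seq, total: total, payload: data[start:end],
		})
	}
	return flits
}

// reassemble rebuilds a message from its flits, which may arrive in any
// order. It reports false when a flit is missing, duplicated, or out of range.
func reassemble(flits []flit) ([]byte, bool) {
	if len(flits) == 0 || len(flits) != flits[0].total {
		return nil, false
	}
	parts := make([][]byte, len(flits))
	for _, f := range flits {
		if 0 > f.seq || f.seq >= len(parts) || parts[f.seq] != nil {
			return nil, false
		}
		parts[f.seq] = f.payload
	}
	var data []byte
	for _, p := range parts {
		data = append(data, p...)
	}
	return data, true
}

func main() {
	flits := split(7, []byte("hello world"), 4)
	flits[0], flits[2] = flits[2], flits[0] // simulate out-of-order arrival
	data, ok := reassemble(flits)
	fmt.Println(len(flits), string(data), ok)
}
```
</pre>
                <p class="text-sm text-gray-600 mt-3">
                  Carrying the sequence number and flit count in every flit lets the receiver detect a complete message without a separate header flit; a real network model may organize this metadata differently.
                </p>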
              </div>

              <div class="bg-white rounded-lg shadow-md p-6 border border-base-300">
                <h5 class="font-semibold mb-3 flex items-center">
                  <i class="fas fa-route text-secondary mr-2"></i>
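                  Switch implementation: routing and arbitration algorithms
                </h5>
                <p class="text-sm text-gray-600 mb-3">
                  Switches support configurable routing and arbitration algorithms, so researchers can analyze and compare how different policies affect multi-GPU system performance.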
                </p>
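                <p class="text-sm text-gray-600 mb-3">
                  As one concrete pairing of policies, the sketch below combines a static routing table with a simple rotating-priority (round-robin) arbiter. It is an assumption made for illustration, not the algorithm MGPUSim or Akita ships; the <code>sw</code> and <code>request</code> types and the <code>arbitrate</code> method are hypothetical names.
                </p>
<pre class="bg-gray-50 p-4 rounded-lg text-xs overflow-x-auto">
```go
package main

import "fmt"

// sw is a hypothetical switch with a static routing table and a
// round-robin arbiter over its input ports.
type sw struct {
	route map[string]int // destination name to output port
	next  int            // input port with highest priority this cycle
}

// request is the head flit waiting at one input port.
type request struct {
	inPort int
	dst    string
}

// arbitrate grants each output port to at most one input port for one
// cycle and returns a map from output port to the winning input port.
// Inputs are visited starting at s.next, so priority rotates every cycle.
func (s *sw) arbitrate(reqs []request, numInputs int) map[int]int {
	waiting := map[int]request{}
	for _, r := range reqs {
		waiting[r.inPort] = r
	}
	winners := map[int]int{}
	for i := 0; i != numInputs; i++ {
		in := (s.next + i) % numInputs
		r, hasFlit := waiting[in]
		if !hasFlit {
			continue
		}
		out, known := s.route[r.dst] // routing decision
		if !known {
			continue
		}
		if _, taken := winners[out]; !taken {
			winners[out] = in // first requester in rotation wins
		}
	}
	s.next = (s.next + 1) % numInputs
	return winners
}

func main() {
	s := sw{route: map[string]int{"GPU2": 0, "GPU3": 1}}
	reqs := []request{{0, "GPU2"}, {1, "GPU2"}, {2, "GPU3"}}
	fmt.Println(s.arbitrate(reqs, 3))
	fmt.Println(s.arbitrate(reqs, 3))
}
```
</pre>
                <p class="text-sm text-gray-600 mt-3">
                  Swapping the routing table or the visiting order for another policy changes which input wins a contested output port, which is the kind of comparison a configurable switch makes possible.
                </p>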
              </div>
            </div>
          </div>
        </section>

        <!-- Section 3: Performance Modeling -->
        <section class="mb-16" id="performance">
          <header class="mb-8">
            <h2 class="font-serif text-3xl font-bold text-primary mb-4">3. Performance Modeling and Optimization Strategies</h2>
            <div class="section-divider mb-6"></div>
          </header>

          <p class="text-lg leading-relaxed mb-8">
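            In multi-GPU systems, performance bottlenecks often stem from inefficient data transfer and communication between GPUs. To evaluate and optimize multi-GPU performance accurately, MGPUSim provides high-accuracy performance modeling and integrates several optimization strategies.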
          </p>

          <!-- Performance Modeling -->
          <div class="mb-12" id="modeling">
            <h3 class="font-serif text-2xl font-semibold text-secondary mb-6">3.1 Performance Modeling in MGPUSim</h3>

            <div class="grid md:grid-cols-2 gap-8 mb-8">
              <div>
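                <h4 class="font-semibold mb-4">Bottleneck Analysis</h4>
                <p class="mb-4">
                  MGPUSim can profile a multi-GPU system comprehensively, helping researchers identify and understand its main performance bottlenecks. Cycle-level simulation yields precise statistics for a range of performance metrics.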
                </p>

                <div class="space-y-3">
                  <div class="flex items-center">
                    <i class="fas fa-chart-bar text-primary mr-2"></i>
                    <span class="text-sm">Compute time and memory access latency</span>
                  </div>
                  <div class="flex items-center">
                    <i class="fas fa-exchange-alt text-secondary mr-2"></i>
                    <span class="text-sm">Inter-GPU communication traffic and latency</span>
                  </div>
                  <div class="flex items-center">
                    <i class="fas fa-bullseye text-accent mr-2"></i>
                    <span class="text-sm">Cache hit rates and NUMA effects</span>
                  </div>
                </div>
              </div>

              <div>
                <h4 class="font-semibold mb-4">Accuracy validation</h4>
                <p class="mb-4">
                  To establish the accuracy of its performance model, MGPUSim was validated against real hardware using both microbenchmarks and full benchmarks <a class="citation-link" href="https://ece.northeastern.edu/groups/nucar/publications/Yifan_Sun_thesis.pdf">[149]</a>.
                </p>

                <div class="bg-green-50 border border-green-200 rounded-lg p-4">
                  <div class="text-center">
                    <div class="text-3xl font-bold text-green-600 mb-1">5.5%</div>
                    <div class="text-sm text-green-700">Average simulation error</div>
                    <div class="text-xs text-green-600 mt-1">Obtained by validation against real hardware</div>
                  </div>
                </div>
              </div>
            </div>
          </div>

          <!-- Locality API -->
          <div class="mb-12" id="locality-api">
            <h3 class="font-serif text-2xl font-semibold text-secondary mb-6">3.2 Optimization Strategy: Locality API</h3>

            <div class="grid md:grid-cols-3 gap-6 mb-8">
              <div class="md:col-span-2">
                <h4 class="font-semibold mb-4">Design goal</h4>
                <p class="mb-4">
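                  The core design goal of the Locality API is to give programmers <strong>precise control over data and compute placement</strong>, so they can actively optimize data locality and reduce the cost of cross-GPU data transfers <a class="citation-link" href="https://ieeexplore.ieee.org/document/8980359">[116]</a>
                  <a class="citation-link" href="https://dl.acm.org/doi/10.1145/3307650.3322230">[118]</a>.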
                </p>

                <div class="bg-blue-50 p-4 rounded-lg">
                  <h5 class="font-medium mb-2">Mechanism</h5>
                  <p class="text-sm">
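                    Through a set of API extensions, programmers can state explicitly which GPU's memory should hold each piece of data and which GPU should execute each compute task <a class="citation-link" href="https://ece.northeastern.edu/groups/nucar/publications/Yifan_Sun_thesis.pdf">[123]</a>.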
                  </p>
                </div>
              </div>

              <div>
                <div class="bg-white rounded-lg shadow-md p-6 border border-base-300">
                  <div class="text-center">
                    <i class="fas fa-rocket text-4xl text-primary mb-3"></i>
                    <h4 class="font-semibold mb-2">Speedup</h4>
                    <div class="text-3xl font-bold text-primary mb-1">1.6×</div>
                    <p class="text-sm text-gray-600">Geometric mean on a 4-GPU system</p>
                  </div>
                </div>
              </div>
            </div>
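            <div class="bg-gray-50 rounded-lg p-4 mt-6 border border-base-300">
              <p class="text-sm mb-3">To see why explicit placement matters, the toy model below counts remote page accesses under two placement policies. It is a sketch under stated assumptions, not the Locality API itself: the function names, the round-robin default policy, and the one-page-per-work-group access pattern are all hypothetical.</p>
              <pre class="text-xs overflow-x-auto"><code>
```javascript
// Hypothetical model of data/compute placement; not the real Locality API.
const NUM_GPUS = 4;

// Assumed default policy: pages interleaved round-robin across GPUs.
function interleavedOwner(pageIndex) {
  return pageIndex % NUM_GPUS;
}

// Explicit policy: the programmer pins contiguous page ranges to one GPU.
function makePinnedOwner(pagesPerGpu) {
  return (pageIndex) =>
    Math.min(NUM_GPUS - 1, Math.floor(pageIndex / pagesPerGpu));
}

// Work-group i reads page i. Work-groups are dispatched in contiguous
// blocks, so each GPU runs one block of consecutive work-groups.
function countRemoteAccesses(numPages, ownerOf) {
  const perGpu = Math.ceil(numPages / NUM_GPUS);
  let remote = 0;
  for (let wg = 0; wg !== numPages; wg++) {
    const computeGpu = Math.min(NUM_GPUS - 1, Math.floor(wg / perGpu));
    if (ownerOf(wg) !== computeGpu) remote += 1;
  }
  return remote;
}
```
</code></pre>
              <p class="text-xs text-gray-600 mt-3">In this model, pinning each block of pages to the GPU that runs the matching work-groups removes every remote access, while interleaving leaves most accesses remote.</p>
            </div>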
          </div>

          <!-- PASI -->
          <div class="mb-12" id="pasi">
            <h3 class="font-serif text-2xl font-semibold text-secondary mb-6">3.3 Optimization Strategy: PASI (Progressive Page Splitting Migration)</h3>

            <div class="grid md:grid-cols-2 gap-8 mb-8">
              <div>
                <h4 class="font-semibold mb-4">Design goal</h4>
                <p class="mb-4">
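                  PASI uses hardware to adjust data placement across the multi-GPU system automatically, optimizing data locality at run time. It is fully transparent to the programmer and requires no changes to application code <a class="citation-link" href="https://ieeexplore.ieee.org/document/8980359">[116]</a>
                  <a class="citation-link" href="https://dl.acm.org/doi/10.1145/3307650.3322230">[118]</a>.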
                </p>

                <div class="space-y-3">
                  <div class="flex items-start">
                    <i class="fas fa-brain text-primary mr-2 mt-1"></i>
                    <div>
                      <div class="font-medium text-sm">Adaptive page migration</div>
                      <div class="text-xs text-gray-600">Migrates pages dynamically according to access patterns</div>
                    </div>
                  </div>
                  <div class="flex items-start">
                    <i class="fas fa-cut text-secondary mr-2 mt-1"></i>
                    <div>
                      <div class="font-medium text-sm">Page splitting</div>
                      <div class="text-xs text-gray-600">Avoids false sharing</div>
                    </div>
                  </div>
                  <div class="flex items-start">
                    <i class="fas fa-shield-alt text-accent mr-2 mt-1"></i>
                    <div>
                      <div class="font-medium text-sm">ESI coherence protocol</div>
                      <div class="text-xs text-gray-600">Keeps data coherent across GPUs</div>
                    </div>
                  </div>
                </div>
              </div>

              <div>
                <img alt="Architecture diagram of a multi-GPU memory management system" class="w-full h-64 object-cover rounded-lg shadow-md mb-4" src="https://kimi-web-img.moonshot.cn/imagegen/20251120/0217636302115789e905c76a3108a9d5bfcf0437017e81d323444_0.jpeg" size="medium" aspect="wide" style="linedrawing" query="多GPU内存管理系统架构" referrerpolicy="no-referrer" data-modified="1" data-score="0.00"/>

                <div class="bg-green-50 border border-green-200 rounded-lg p-4">
                  <div class="text-center">
                    <div class="text-2xl font-bold text-green-600 mb-1">2.6×</div>
                    <div class="text-sm text-green-700">Geometric mean on a 4-GPU system</div>
                    <div class="text-xs text-green-600 mt-1">Compared with 1.6× for the Locality API</div>
                  </div>
                </div>
              </div>
            </div>
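            <div class="bg-gray-50 rounded-lg p-4 mb-8 border border-base-300">
              <p class="text-sm mb-3">Both headline figures (1.6× for the Locality API and 2.6× for PASI) are geometric means over benchmarks. As a reminder of how such a figure is computed, the sketch below averages per-benchmark speedups in log space. The function name is our own.</p>
              <pre class="text-xs overflow-x-auto"><code>
```javascript
// Geometric mean of per-benchmark speedups, computed in log space
// so that long lists of values do not overflow when multiplied.
function geometricMean(values) {
  if (values.length === 0) throw new Error('need at least one value');
  const logSum = values.reduce((sum, v) => sum + Math.log(v), 0);
  return Math.exp(logSum / values.length);
}
```
</code></pre>
              <p class="text-xs text-gray-600 mt-3">The geometric mean is the usual choice for ratios such as speedups because one outlier benchmark moves it less than it would move an arithmetic mean.</p>
            </div>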

            <!-- PASI Implementation -->
            <div class="bg-white rounded-xl shadow-lg p-6 border border-base-300">
              <h4 class="font-semibold mb-4 flex items-center">
                <i class="fas fa-microchip text-primary mr-3"></i>
                PASI implementation
              </h4>

              <div class="grid md:grid-cols-3 gap-4">
                <div class="text-center p-4 bg-gray-50 rounded-lg">
                  <i class="fas fa-server text-2xl text-primary mb-2"></i>
                  <h5 class="font-medium mb-2">Page Migration Controller (PMC)</h5>
                  <p class="text-xs text-gray-600">Handles read and write requests from the L2 cache and decides whether data must be fetched remotely</p>
                </div>

                <div class="text-center p-4 bg-gray-50 rounded-lg">
                  <i class="fas fa-exchange-alt text-2xl text-secondary mb-2"></i>
                  <h5 class="font-medium mb-2">Cache-only memory architecture</h5>
                  <p class="text-xs text-gray-600">Data can be cached on several GPUs at the same time, saving memory space</p>
                </div>

                <div class="text-center p-4 bg-gray-50 rounded-lg">
                  <i class="fas fa-sync-alt text-2xl text-accent mb-2"></i>
                  <h5 class="font-medium mb-2">Three implementation variants</h5>
                  <p class="text-xs text-gray-600">Migration only; migration + ESI; migration + ESI + page splitting</p>
                </div>
              </div>
            </div>
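            <div class="bg-gray-50 rounded-lg p-4 mt-6 border border-base-300">
              <p class="text-sm mb-3">The role of the ESI states can be illustrated with a small per-page directory. This is a generic sketch of Exclusive/Shared/Invalid transitions under our own assumptions; the <code>PageDirectory</code> class and its rules are hypothetical, and the exact transitions PASI implements may differ.</p>
              <pre class="text-xs overflow-x-auto"><code>
```javascript
// Simplified ESI-style directory for a single page (assumed transitions).
const E = 'Exclusive';
const S = 'Shared';
const I = 'Invalid';

class PageDirectory {
  constructor(numGpus, homeGpu) {
    this.state = new Array(numGpus).fill(I);
    this.state[homeGpu] = E; // the page starts on exactly one GPU
    this.transfers = 0;      // page copies sent over the interconnect
  }

  read(gpu) {
    if (this.state[gpu] !== I) return; // local copy is valid
    // Fetch a copy; an exclusive holder is downgraded to Shared.
    this.state = this.state.map((s) => (s === E ? S : s));
    this.state[gpu] = S;
    this.transfers += 1;
  }

  write(gpu) {
    if (this.state[gpu] === E) return; // already the sole owner
    if (this.state[gpu] === I) this.transfers += 1; // need the data first
    // Invalidate every copy, then take exclusive ownership.
    this.state = this.state.map(() => I);
    this.state[gpu] = E;
  }
}
```
</code></pre>
              <p class="text-xs text-gray-600 mt-3">Reads replicate a page so that several GPUs hit locally; a write collapses the copies back to one owner, which is the cost that page splitting tries to limit when GPUs write to different parts of the same page.</p>
            </div>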
          </div>
        </section>

        <!-- Section 4: Applications -->
        <section class="mb-16" id="applications">
          <header class="mb-8">
            <h2 class="font-serif text-3xl font-bold text-primary mb-4">4. Use Cases and Evaluation Methods</h2>
            <div class="section-divider mb-6"></div>
          </header>

          <p class="text-lg leading-relaxed mb-8">
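            Because MGPUSim and the Akita framework combine flexibility, simulation speed, and accuracy, they apply to a wide range of computer architecture research. They give researchers a tool for exploring and evaluating next-generation multi-GPU system designs, and a means of optimizing existing applications on multi-GPU platforms.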
          </p>

          <!-- Use Cases -->
          <div class="mb-12" id="use-cases">
            <h3 class="font-serif text-2xl font-semibold text-secondary mb-6">4.1 Typical Use Cases</h3>

            <div class="grid md:grid-cols-3 gap-6">
              <div class="bg-white rounded-lg shadow-md p-6 border border-base-300">
                <div class="text-center mb-4">
                  <i class="fas fa-drafting-compass text-4xl text-primary mb-3"></i>
                  <h4 class="font-semibold">Architecture design and evaluation</h4>
                </div>
                <p class="text-sm text-gray-600 mb-4">
                  Researchers can use MGPUSim's flexible configuration to build and simulate a wide range of multi-GPU system architectures <a class="citation-link" href="https://sites.google.com/view/jlabellan/tutorial-mgpusim">[148]</a>.
                </p>
                <ul class="text-xs space-y-1">
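                  <li>• Explore different GPU interconnect topologies</li>
                  <li>• Evaluate the performance impact of interconnect technologies</li>
                  <li>• Study cache coherence protocols</li>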
                </ul>
              </div>

              <div class="bg-white rounded-lg shadow-md p-6 border border-base-300">
                <div class="text-center mb-4">
                  <i class="fas fa-chart-line text-4xl text-secondary mb-3"></i>
                  <h4 class="font-semibold">Program performance analysis</h4>
                </div>
                <p class="text-sm text-gray-600 mb-4">
                  Daisen, the visualization tool built into MGPUSim, gives detailed insight into how an application executes on a multi-GPU system <a class="citation-link" href="https://akitasim.dev/docs/akita/">[159]</a>.
                </p>
                <ul class="text-xs space-y-1">
                  <li>• Generate detailed execution traces</li>
                  <li>• Locate performance bottlenecks</li>
                  <li>• Optimize data partitioning strategies</li>
                </ul>
              </div>

              <div class="bg-white rounded-lg shadow-md p-6 border border-base-300">
                <div class="text-center mb-4">
                  <i class="fas fa-atom text-4xl text-accent mb-3"></i>
                  <h4 class="font-semibold">Emerging interconnect technologies</h4>
                </div>
                <p class="text-sm text-gray-600 mb-4">
                  MGPUSim provides a platform for studying new interconnect technologies and evaluating how they perform in multi-GPU systems.
                </p>
                <ul class="text-xs space-y-1">
                  <li>• Silicon photonic interconnects</li>
                  <li>• Chiplet interconnect schemes</li>
                  <li>• Energy-efficiency and performance trade-offs</li>
                </ul>
              </div>
            </div>
          </div>

          <!-- Evaluation Methods -->
          <div class="mb-12" id="evaluation">
            <h3 class="font-serif text-2xl font-semibold text-secondary mb-6">4.2 Evaluation Methods and Validation Results</h3>

            <div class="space-y-8">
              <!-- Evaluation Metrics -->
              <div class="bg-white rounded-xl shadow-lg p-6 border border-base-300">
                <h4 class="font-semibold mb-4 flex items-center">
                  <i class="fas fa-ruler-combined text-primary mr-3"></i>
                  Evaluation metrics
                </h4>

                <div class="grid md:grid-cols-3 gap-6">
                  <div class="text-center">
                    <i class="fas fa-stopwatch text-2xl text-primary mb-2"></i>
                    <h5 class="font-medium mb-2">Performance</h5>
                    <p class="text-sm text-gray-600">Application execution time, compared directly against hardware to assess simulator accuracy</p>
                  </div>

                  <div class="text-center">
                    <i class="fas fa-tachometer-alt text-2xl text-secondary mb-2"></i>
                    <h5 class="font-medium mb-2">Bandwidth</h5>
                    <p class="text-sm text-gray-600">Peak and achieved inter-GPU bandwidth, and link utilization</p>
                  </div>

                  <div class="text-center">
                    <i class="fas fa-clock text-2xl text-accent mb-2"></i>
                    <h5 class="font-medium mb-2">Latency</h5>
                    <p class="text-sm text-gray-600">Network latency, memory access latency, and other transfer delays</p>
                  </div>
                </div>
              </div>

              <!-- Validation Method -->
              <div class="bg-white rounded-xl shadow-lg p-6 border border-base-300">
                <h4 class="font-semibold mb-4 flex items-center">
                  <i class="fas fa-check-double text-primary mr-3"></i>
                  Validation method: comparison with real hardware
                </h4>

                <div class="grid md:grid-cols-2 gap-6">
                  <div>
                    <h5 class="font-medium mb-2 flex items-center">
                      <i class="fas fa-microscope text-primary mr-2"></i>
                      Microbenchmark validation
                    </h5>
                    <p class="text-sm text-gray-600 mb-3">
                      Purpose-built microbenchmarks exercise key components such as the L1/L2 caches, DRAM, and ALUs in isolation <a class="citation-link" href="https://ece.northeastern.edu/groups/nucar/publications/Yifan_Sun_thesis.pdf">[149]</a>.
                    </p>
                    <div class="bg-blue-50 p-3 rounded text-xs">
                      Result: the simulated curves nearly coincide with those measured on real hardware
                    </div>
                  </div>

                  <div>
                    <h5 class="font-medium mb-2 flex items-center">
                      <i class="fas fa-cogs text-secondary mr-2"></i>
                      Full-benchmark validation
                    </h5>
                    <p class="text-sm text-gray-600 mb-3">
                      Representative applications are drawn from standard benchmark suites such as AMD APP SDK and Hetero-Mark <a class="citation-link" href="https://ece.northeastern.edu/groups/nucar/publications/Yifan_Sun_thesis.pdf">[149]</a>.
                    </p>
                    <div class="bg-green-50 p-3 rounded text-xs">
                      Test platform: a multi-GPU system with two AMD R9 Nano GPUs
                    </div>
                  </div>
                </div>
              </div>
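              <div class="bg-gray-50 rounded-xl p-6 border border-base-300">
                <p class="text-sm mb-3">A comparison against hardware reduces to a per-benchmark error and a summary over benchmarks. The sketch below assumes the error is the relative difference in execution time, which is our reading rather than a definition taken from the cited thesis; the function names and the shape of the input records are hypothetical.</p>
                <pre class="text-xs overflow-x-auto"><code>
```javascript
// Relative error of one benchmark: |simulated - measured| / measured.
// Assumed definition; both times must use the same unit.
function relativeError(simulated, measured) {
  return Math.abs(simulated - measured) / measured;
}

// Summarize a list of { name, simulated, measured } records into the
// mean error and the single worst benchmark.
function summarizeErrors(results) {
  const errors = results.map((r) => ({
    name: r.name,
    error: relativeError(r.simulated, r.measured),
  }));
  const mean = errors.reduce((sum, e) => sum + e.error, 0) / errors.length;
  const worst = errors.reduce((a, b) => (b.error > a.error ? b : a));
  return { mean, worst };
}
```
</code></pre>
                <p class="text-xs text-gray-600 mt-3">Reporting the worst case next to the mean matters because a low average can hide individual benchmarks with much larger errors.</p>
              </div>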

              <!-- Validation Results -->
              <div class="bg-gradient-to-r from-green-50 to-blue-50 rounded-xl shadow-lg p-6 border border-base-300">
                <h4 class="font-semibold mb-4 flex items-center">
                  <i class="fas fa-trophy text-primary mr-3"></i>
                  Validation results: 5.5% average simulation error
                </h4>

                <div class="grid md:grid-cols-2 gap-6">
                  <div>
                    <div class="bg-white rounded-lg p-4 shadow-sm">
                      <div class="flex items-center justify-between mb-2">
                        <span class="font-medium">Average simulation error</span>
                        <span class="text-2xl font-bold text-green-600">5.5%</span>
                      </div>
                      <div class="w-full bg-gray-200 rounded-full h-2">
                        <div class="bg-green-500 h-2 rounded-full" style="width: 5.5%"></div>
                      </div>
                      <p class="text-xs text-gray-600 mt-2">
                        An average error this low supports the accuracy and credibility of MGPUSim's results
                      </p>
                    </div>
                  </div>

                  <div>
                    <div class="bg-white rounded-lg p-4 shadow-sm">
                      <h5 class="font-medium mb-2">Modeling coverage</h5>
                      <ul class="text-xs space-y-1">
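                        <li>• Captures complex interactions in multi-GPU systems accurately</li>
                        <li>• Covers compute, communication, and memory access</li>
                        <li>• Meets the needs of most computer architecture research</li>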
                      </ul>
                    </div>
                  </div>
                </div>

                <div class="mt-4 p-4 bg-yellow-50 rounded-lg border border-yellow-200">
                  <p class="text-sm text-yellow-800">
                    <i class="fas fa-info-circle mr-2"></i>
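                    <strong>Note:</strong> For specific benchmarks (such as FIR and SC), the error can peak at about 20%, mainly because undisclosed GPU hardware details are hard to reproduce exactly <a class="citation-link" href="https://ece.northeastern.edu/groups/nucar/publications/Yifan_Sun_thesis.pdf">[149]</a>.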
                  </p>
                </div>
              </div>
            </div>
          </div>
        </section>

        <!-- Conclusion -->
        <section class="mb-16">
          <div class="bg-gradient-to-br from-primary to-secondary rounded-xl shadow-2xl p-8 text-white">
            <h2 class="font-serif text-2xl font-bold mb-6">Research Outlook and Future Directions</h2>

            <div class="grid md:grid-cols-2 gap-8">
              <div>
                <h3 class="font-semibold mb-4">Achievements to date</h3>
                <ul class="space-y-2 text-sm">
                  <li class="flex items-center">
                    <i class="fas fa-check-circle mr-2"></i>
                    High-accuracy multi-GPU system simulation (5.5% average error)
                  </li>
                  <li class="flex items-center">
                    <i class="fas fa-check-circle mr-2"></i>
                    Novel optimization strategies (Locality API, PASI)
                  </li>
                  <li class="flex items-center">
                    <i class="fas fa-check-circle mr-2"></i>
                    Broad coverage of use cases
                  </li>
                </ul>
              </div>

              <div>
                <h3 class="font-semibold mb-4">Future directions</h3>
                <ul class="space-y-2 text-sm">
                  <li class="flex items-center">
                    <i class="fas fa-arrow-right mr-2"></i>
                    Support for simulating NVIDIA GPU architectures
                  </li>
                  <li class="flex items-center">
                    <i class="fas fa-arrow-right mr-2"></i>
                    Research on emerging interconnect technologies
                  </li>
                  <li class="flex items-center">
                    <i class="fas fa-arrow-right mr-2"></i>
                    Modeling of larger-scale multi-GPU systems
                  </li>
                </ul>
              </div>
            </div>
          </div>
        </section>
      </article>

      <!-- Footer -->
      <footer class="bg-base-200 border-t border-base-300 py-8">
        <div class="max-w-5xl mx-auto px-6">
          <div class="text-center text-sm text-gray-600">
            <p class="mb-2">An In-Depth Study of MGPUSim and the Akita Framework: Multi-GPU Interconnect Architecture and Analysis</p>
            <p>&copy; 2025 Computer Architecture Research</p>
          </div>
        </div>
      </footer>
    </div>

    <script>
        // Initialize Mermaid
        mermaid.initialize({
            startOnLoad: true,
            theme: 'base',
            themeVariables: {
                primaryColor: '#1a237e',
                primaryTextColor: '#1a237e',
                primaryBorderColor: '#3949ab',
                lineColor: '#3949ab',
                secondaryColor: '#f8fafc',
                tertiaryColor: '#e2e8f0',
                background: '#ffffff',
                mainBkg: '#ffffff',
                secondBkg: '#f8fafc',
                tertiaryBkg: '#e2e8f0',
                nodeBorder: '#1a237e',
                clusterBkg: '#f8fafc',
                clusterBorder: '#3949ab',
                defaultLinkColor: '#3949ab',
                titleColor: '#1a237e',
                edgeLabelBackground: '#ffffff',
                nodeTextColor: '#1a237e'
            },
            flowchart: {
                useMaxWidth: false,
                htmlLabels: true,
                curve: 'basis',
                padding: 20
            },
            sequence: {
                useMaxWidth: false,
                wrap: true
            },
            gantt: {
                useMaxWidth: false
            }
        });

        // Initialize Mermaid Controls for zoom and pan
        function initializeMermaidControls() {
            const containers = document.querySelectorAll('.mermaid-container');

            containers.forEach(container => {
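            const mermaidElement = container.querySelector('.mermaid');
            // Skip containers without a diagram so the listeners below never touch null
            if (!mermaidElement) return;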
            let scale = 1;
            let isDragging = false;
            let startX, startY, translateX = 0, translateY = 0;

            // Touch-related state
            let isTouch = false;
            let touchStartTime = 0;
            let initialDistance = 0;
            let initialScale = 1;
            let isPinching = false;

            // Zoom controls
            const zoomInBtn = container.querySelector('.zoom-in');
            const zoomOutBtn = container.querySelector('.zoom-out');
            const resetBtn = container.querySelector('.reset-zoom');
            const fullscreenBtn = container.querySelector('.fullscreen');

            function updateTransform() {
                mermaidElement.style.transform = `translate(${translateX}px, ${translateY}px) scale(${scale})`;

                if (scale > 1) {
                container.classList.add('zoomed');
                } else {
                container.classList.remove('zoomed');
                }

                mermaidElement.style.cursor = isDragging ? 'grabbing' : 'grab';
            }

            if (zoomInBtn) {
                zoomInBtn.addEventListener('click', () => {
                scale = Math.min(scale * 1.25, 4);
                updateTransform();
                });
            }

            if (zoomOutBtn) {
                zoomOutBtn.addEventListener('click', () => {
                scale = Math.max(scale / 1.25, 0.3);
                if (scale <= 1) {
                    translateX = 0;
                    translateY = 0;
                }
                updateTransform();
                });
            }

            if (resetBtn) {
                resetBtn.addEventListener('click', () => {
                scale = 1;
                translateX = 0;
                translateY = 0;
                updateTransform();
                });
            }

            if (fullscreenBtn) {
                fullscreenBtn.addEventListener('click', () => {
                if (container.requestFullscreen) {
                    container.requestFullscreen();
                } else if (container.webkitRequestFullscreen) {
                    container.webkitRequestFullscreen();
                } else if (container.msRequestFullscreen) {
                    container.msRequestFullscreen();
                }
                });
            }

            // Mouse Events
            mermaidElement.addEventListener('mousedown', (e) => {
                if (isTouch) return; // Ignore mouse events during a touch interaction

                isDragging = true;
                startX = e.clientX - translateX;
                startY = e.clientY - translateY;
                mermaidElement.style.cursor = 'grabbing';
                updateTransform();
                e.preventDefault();
            });

            document.addEventListener('mousemove', (e) => {
                if (isDragging && !isTouch) {
                translateX = e.clientX - startX;
                translateY = e.clientY - startY;
                updateTransform();
                }
            });

            document.addEventListener('mouseup', () => {
                if (isDragging && !isTouch) {
                isDragging = false;
                mermaidElement.style.cursor = 'grab';
                updateTransform();
                }
            });

            document.addEventListener('mouseleave', () => {
                if (isDragging && !isTouch) {
                isDragging = false;
                mermaidElement.style.cursor = 'grab';
                updateTransform();
                }
            });

            // Distance between two touch points
            function getTouchDistance(touch1, touch2) {
                return Math.hypot(
                touch2.clientX - touch1.clientX,
                touch2.clientY - touch1.clientY
                );
            }

            // Touch events
            mermaidElement.addEventListener('touchstart', (e) => {
                isTouch = true;
                touchStartTime = Date.now();

                if (e.touches.length === 1) {
                // One-finger drag
                isPinching = false;
                isDragging = true;

                const touch = e.touches[0];
                startX = touch.clientX - translateX;
                startY = touch.clientY - translateY;

                } else if (e.touches.length === 2) {
                // Two-finger pinch zoom
                isPinching = true;
                isDragging = false;

                const touch1 = e.touches[0];
                const touch2 = e.touches[1];
                initialDistance = getTouchDistance(touch1, touch2);
                initialScale = scale;
                }

                e.preventDefault();
            }, { passive: false });

            mermaidElement.addEventListener('touchmove', (e) => {
                if (e.touches.length === 1 && isDragging && !isPinching) {
                // One-finger drag
                const touch = e.touches[0];
                translateX = touch.clientX - startX;
                translateY = touch.clientY - startY;
                updateTransform();

                } else if (e.touches.length === 2 && isPinching) {
                // Two-finger pinch zoom
                const touch1 = e.touches[0];
                const touch2 = e.touches[1];
                const currentDistance = getTouchDistance(touch1, touch2);

                if (initialDistance > 0) {
                    const newScale = Math.min(Math.max(
                    initialScale * (currentDistance / initialDistance),
                    0.3
                    ), 4);
                    scale = newScale;
                    updateTransform();
                }
                }

                e.preventDefault();
            }, { passive: false });

            mermaidElement.addEventListener('touchend', (e) => {
                // Reset state
                if (e.touches.length === 0) {
                isDragging = false;
                isPinching = false;
                initialDistance = 0;

                // Delay resetting isTouch so that synthesized mouse events do not fire immediately
                setTimeout(() => {
                    isTouch = false;
                }, 100);
                } else if (e.touches.length === 1 && isPinching) {
                // Went from two fingers to one: switch to drag mode
                isPinching = false;
                isDragging = true;

                const touch = e.touches[0];
                startX = touch.clientX - translateX;
                startY = touch.clientY - translateY;
                }

                updateTransform();
            });

            mermaidElement.addEventListener('touchcancel', (e) => {
                isDragging = false;
                isPinching = false;
                initialDistance = 0;

                setTimeout(() => {
                isTouch = false;
                }, 100);

                updateTransform();
            });

            // Enhanced wheel zoom with better center point handling
            container.addEventListener('wheel', (e) => {
                e.preventDefault();
                const rect = container.getBoundingClientRect();
                const centerX = rect.width / 2;
                const centerY = rect.height / 2;

                const delta = e.deltaY > 0 ? 0.9 : 1.1;
                const newScale = Math.min(Math.max(scale * delta, 0.3), 4);

                // Adjust translation to zoom towards center
                if (newScale !== scale) {
                const scaleDiff = newScale / scale;
                translateX = translateX * scaleDiff;
                translateY = translateY * scaleDiff;
                scale = newScale;

                if (scale <= 1) {
                    translateX = 0;
                    translateY = 0;
                }

                updateTransform();
                }
            });

            // Initialize display
            updateTransform();
            });
        }

        // Smooth scrolling for anchor links
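        document.querySelectorAll('a[href^="#"]').forEach(anchor => {
            anchor.addEventListener('click', function (e) {
                e.preventDefault();
                const href = this.getAttribute('href');
                // A bare "#" has no target; looking up by ID also avoids
                // invalid-selector errors for IDs with special characters
                if (!href || href === '#') return;
                const target = document.getElementById(href.slice(1));
                if (target) {
                    target.scrollIntoView({
                        behavior: 'smooth',
                        block: 'start'
                    });
                }
            });
        });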

        // Highlight current section in TOC
        window.addEventListener('scroll', function() {
            const sections = document.querySelectorAll('section[id], div[id]');
            const tocLinks = document.querySelectorAll('.toc-link');
            
            let current = '';
            sections.forEach(section => {
                const sectionTop = section.offsetTop;
                if (scrollY >= (sectionTop - 200)) {
                    current = section.getAttribute('id');
                }
            });

            tocLinks.forEach(link => {
                link.classList.remove('bg-primary', 'text-white');
                if (link.getAttribute('href') === '#' + current) {
                    link.classList.add('bg-primary', 'text-white');
                }
            });
        });

        // Initialize mermaid controls
        initializeMermaidControls();
    </script>

  

</body></html>                    