initial data dump
mktip committed Apr 25, 2024
1 parent 417dad4 commit 386b3dd
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions dist/index.html
@@ -296,19 +296,19 @@ <h2>Pioneering the Future of Post-Moore Computing</h2>
<img width="32" src="./_file/assets/git.aea14968.webp">
<a href="https://github.com/ParCoreLab/CPU-Free-model" class="text-lg font-semibold font-serif visited:text-blue-900">CPU Free Model</a>
<p class="text-sm">This project introduces a fully autonomous execution model for multi-GPU applications, eliminating CPU involvement beyond initial kernel launch. In conventional setups, the CPU orchestrates execution, causing overhead. We propose delegating this control flow entirely to devices, leveraging techniques like persistent kernels and device-initiated communication. Our CPU-free model significantly reduces communication overhead. Demonstrations on 2D/3D Jacobi stencil and Conjugate Gradient solvers show up to a 58.8% improvement in communication latency and a 1.63x speedup for CG on 8 NVIDIA A100 GPUs compared to CPU-controlled baselines.</p>
- <img src="./_file/assets/CPU-Free-Model.5d0d6917.png">
+ <img width="100%" src="./_file/assets/CPU-Free-Model.5d0d6917.png">
</div>
<div class="card flex flex-col justify-start items-center gap-3">
<img width="32" src="./_file/assets/git.aea14968.webp">
<a href="https://github.com/ParCoreLab/snoopie" class="text-lg font-semibold font-serif visited:text-blue-900">Snoopie</a>
<p class="text-sm">With data movement posing a significant bottleneck in computing, profiling tools are essential for scaling multi-GPU applications efficiently. However, existing tools focus primarily on single GPU compute operations and lack support for monitoring GPU-GPU transfers and communication library calls. Addressing these gaps, we present Snoopie, an instrumentation-based multi-GPU communication profiling tool. Snoopie accurately tracks peer-to-peer transfers and GPU-centric communication library calls, attributing data movement to specific source code lines and objects. It offers various visualization modes, from system-wide overviews to detailed instructions and addresses, enhancing programmer productivity.</p>
- <img src="./_file/assets/Snoopie.fb32953b.jpg">
+ <img width="100%" src="./_file/assets/Snoopie.fb32953b.jpg">
</div>
<div class="card flex flex-col justify-start items-center gap-3">
<img width="32" src="./_file/assets/git.aea14968.webp">
<a href="https://github.com/msasongko17/multigpu_callback" class="text-lg font-semibold font-serif visited:text-blue-900">Multi-GPU Callbacks</a>
<p class="text-sm">To address resource underutilization in multi-GPU systems, particularly in irregular applications, we propose a GPU-sided resource allocation method. This method dynamically adjusts the number of GPUs in use based on workload changes, utilizing GPU-to-CPU callbacks to request additional devices during kernel execution. We implemented and tested multiple callback methods, measuring their overheads on Nvidia and AMD platforms. Demonstrating the approach in an irregular application like Breadth-First Search (BFS), we achieved a 15.7% reduction in time to solution on average, with callback overheads as low as 6.50 microseconds on AMD and 4.83 microseconds on Nvidia. Additionally, the model can reduce total device usage by up to 35%, improving energy efficiency.</p>
- <img src="./_file/assets/Multi-GPU-callback.c2f7436f.png">
+ <img width="100%" src="./_file/assets/Multi-GPU-callback.c2f7436f.png">
</div>
<div class="card flex flex-col justify-start items-center gap-3">
<img width="32" src="./_file/assets/git.aea14968.webp">