Increase shared mem. size
ahadnagy committed Jan 3, 2025
1 parent 40c17d6 commit 43c9d60
Showing 1 changed file with 1 addition and 1 deletion.
@@ -728,7 +728,7 @@ __global__ void Marlin(
// latency hiding. At the same time, we want relatively few warps to have many registers per warp and small tiles.
const int THREADS = 256;
const int STAGES = 4; // 4 pipeline stages fit into shared memory
-const int SHARED_MEM = 96 * 1024; // max shared memory on compute capability 8.6 (< 8.0)
+const int SHARED_MEM = 164 * 1024; // max shared memory on compute capability 8.0

// ADDED: add scaled zero pointer
#define CALL_IF(THREAD_M_BLOCKS, THREAD_N_BLOCKS, THREAD_K_BLOCKS, GROUP_BLOCKS) \
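
Because the new value is a hardcoded constant, it may be worth checking it against the per-block limit of each GPU the kernel is expected to run on. The sketch below is plain host-side C++ for illustration only and is not part of this commit. `max_optin_smem_per_block` and `fits_in_block` are hypothetical helpers, and the table holds the per-block opt-in maxima as I understand them from the CUDA C++ Programming Guide, so verify them before relying on the result. It also assumes `SHARED_MEM` is passed as the dynamic shared memory size at kernel launch, which this hunk does not show.

```cpp
#include <cstddef>

constexpr std::size_t KiB = 1024;

// Hypothetical helper (not from the Marlin source): largest opt-in dynamic
// shared memory per thread block, in bytes, for a few compute capabilities.
// Values are assumptions taken from the CUDA C++ Programming Guide table;
// unlisted architectures return 0 so that every request is rejected.
constexpr std::size_t max_optin_smem_per_block(int major, int minor) {
    return (major == 7 && minor == 0) ? 96 * KiB
         : (major == 7 && minor == 5) ? 64 * KiB
         : (major == 8 && minor == 0) ? 163 * KiB
         : (major == 8 && minor == 6) ? 99 * KiB
         : (major == 8 && minor == 9) ? 99 * KiB
         : (major == 9 && minor == 0) ? 227 * KiB
         : 0;
}

// True if a block requesting `requested_bytes` of dynamic shared memory
// stays within the assumed per-block limit of the given architecture.
constexpr bool fits_in_block(std::size_t requested_bytes, int major, int minor) {
    return requested_bytes <= max_optin_smem_per_block(major, minor);
}
```

Under these assumptions, `fits_in_block(96 * 1024, 8, 6)` holds, while `fits_in_block(164 * 1024, 8, 0)` does not: the guide lists 164 KB as the shared memory capacity per SM on compute capability 8.0, with 163 KB available to a single block. Querying `cudaDevAttrMaxSharedMemoryPerBlockOptin` through `cudaDeviceGetAttribute` at runtime would avoid hardcoding such a table.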