This repo contains several sample programs building up primitives for a synchronous slide FFT.
This code is compatible with Cerebras SDK 1.0.0.
Register for access to the Cerebras SDK here. Documentation for the SDK can be found here.
The first sample constructs a program with a row of kernel_width PEs, where kernel_width is a multiple of four from 8 to 32.
Each PE has arrays arr0 and arr1 of size num_elems.
Each PE from 0 to kernel_width / 2 - 1 initializes arr0.
The kernel shifts the data by kernel_width / 4 PEs, with the shifted data ending up in arr1.
Thus, for PE i, where i < kernel_width / 2, the contents of arr0 are moved to arr1 of PE i + kernel_width / 4.
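
The index arithmetic above can be checked with a small host-side model. This is only a sketch in plain Python, not the kernel code from this repo: the function name shift_right and the fill values are hypothetical, and PEs that never receive data are represented by None.

```python
def shift_right(kernel_width, num_elems):
    """Model the one-directional slide on a row of PEs using plain lists."""
    if kernel_width % 4 != 0 or not 8 <= kernel_width <= 32:
        raise ValueError("kernel_width must be a multiple of four from 8 to 32")
    half = kernel_width // 2
    shift = kernel_width // 4

    # Only PEs 0 .. kernel_width/2 - 1 initialize arr0. The fill values
    # (PE index * 100 + element index) are an arbitrary choice for illustration.
    arr0 = [
        [pe * 100 + k for k in range(num_elems)] if pe < half else None
        for pe in range(kernel_width)
    ]

    # arr1 on PE i + kernel_width/4 receives arr0 from PE i.
    arr1 = [None] * kernel_width
    for pe in range(half):
        arr1[pe + shift] = list(arr0[pe])
    return arr0, arr1


if __name__ == "__main__":
    _, arr1 = shift_right(kernel_width=8, num_elems=3)
    for pe, data in enumerate(arr1):
        print(pe, data)
```

With kernel_width = 8, the shift is 2 PEs, so PEs 2 through 5 end up holding the data that started on PEs 0 through 3.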
The second sample is the same as the one above, except that data shifts in both directions.
For PE i, where i < kernel_width / 2, the contents of arr0 are moved to arr1 of PE i + kernel_width / 4.
For PE i, where i >= kernel_width / 2, the contents of arr0 are moved to arr2 of PE i - kernel_width / 4.
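
The two-directional mapping can be modeled the same way. Again, this is a hedged host-side sketch rather than the repo's kernel: shift_both is a hypothetical name, and the model assumes that every PE initializes arr0, which the description above implies but does not state outright.

```python
def shift_both(kernel_width, num_elems):
    """Model the two-directional slide on a row of PEs using plain lists."""
    if kernel_width % 4 != 0 or not 8 <= kernel_width <= 32:
        raise ValueError("kernel_width must be a multiple of four from 8 to 32")
    half = kernel_width // 2
    shift = kernel_width // 4

    # Assumption: every PE initializes arr0. The fill values are arbitrary.
    arr0 = [[pe * 100 + k for k in range(num_elems)] for pe in range(kernel_width)]

    arr1 = [None] * kernel_width  # receives data sliding toward higher PE indices
    arr2 = [None] * kernel_width  # receives data sliding toward lower PE indices
    for pe in range(kernel_width):
        if pe < half:
            arr1[pe + shift] = list(arr0[pe])
        else:
            arr2[pe - shift] = list(arr0[pe])
    return arr1, arr2


if __name__ == "__main__":
    arr1, arr2 = shift_both(kernel_width=8, num_elems=2)
    for pe in range(8):
        print(pe, arr1[pe], arr2[pe])
```

In this model the middle kernel_width / 2 PEs receive data in both arr1 and arr2, while the outermost kernel_width / 4 PEs on each side receive nothing.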
The sweep.py script sweeps over multiple kernel widths and element counts, producing a CSV file with the cycle counts measured on the left, middle, and right PEs for the synchronous slide.
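
The shape of such a sweep might look like the sketch below. It is not the contents of sweep.py: the function names, the CSV column names, and the measure callback are all hypothetical, and the element counts are supplied by the caller because the values the script actually uses are not listed here. Only the kernel width range (multiples of four from 8 to 32) comes from the description above.

```python
import csv
import io


def sweep_rows(measure, elem_counts):
    """Collect one row per (kernel_width, num_elems) pair.

    `measure` is a hypothetical callback that runs one configuration and
    returns (left, middle, right) cycle counts.
    """
    rows = []
    for kernel_width in range(8, 33, 4):  # multiples of four from 8 to 32
        for num_elems in elem_counts:
            left, middle, right = measure(kernel_width, num_elems)
            rows.append([kernel_width, num_elems, left, middle, right])
    return rows


def write_csv(rows, out):
    """Write the rows to a file-like object with a hypothetical header."""
    writer = csv.writer(out)
    writer.writerow(
        ["kernel_width", "num_elems", "left_cycles", "middle_cycles", "right_cycles"]
    )
    writer.writerows(rows)
```

Passing the measurement in as a callback keeps the sweep loop separate from however a single configuration is actually compiled and run.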