
Celerity Logo

# Welcome

The Celerity distributed runtime and API aims to bring the power and ease of use of SYCL to distributed memory clusters.

NOTE: Celerity is first and foremost a research project, and is still in early development. While it does work for certain applications, it probably does not fully support your use case just yet. However, we'd love for you to give it a try and tell us how you could imagine using Celerity for your projects in the future.

## Dependencies

  • A supported SYCL implementation, either ComputeCpp or hipSYCL
  • Boost (tested with versions 1.65 - 1.68)
  • An MPI 2 implementation (tested with OpenMPI 4.0 and MSMPI 10.0)
  • CMake
  • A C++14 compiler

## Building

Building can be as simple as calling `cmake && make`; depending on your setup, however, you may also have to provide some library paths etc.

The runtime comes with several examples that are built automatically when the `CELERITY_BUILD_EXAMPLES` CMake option is set (true by default).
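As a sketch, a typical out-of-source build might look like the following, assuming CMake can locate your SYCL and MPI installations; the install prefix is a placeholder:

```sh
# Configure and build in a separate directory; all paths are placeholders.
mkdir build && cd build

cmake .. \
    -DCMAKE_BUILD_TYPE=Release \
    -DCELERITY_BUILD_EXAMPLES=ON \
    -DCMAKE_INSTALL_PREFIX=/opt/celerity

make -j"$(nproc)"
```

If CMake cannot find a dependency automatically, pass the appropriate hint variables for your installation (e.g. `-DCMAKE_PREFIX_PATH=...`).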

## Using Celerity as a Library

Simply run `make install` (or the equivalent for your build system) to copy all relevant header files and libraries to the `CMAKE_INSTALL_PREFIX`. This includes a CMake package configuration file, which is placed inside the `lib/cmake` directory. Once the package is included in a CMake project, you can use the `add_celerity_to_target(TARGET target SOURCES source1 source2...)` function to set everything up.
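As a minimal sketch of the consuming side: the project name and source file below are placeholders, and the package name `Celerity` passed to `find_package` is an assumption based on the installed package configuration file; `add_celerity_to_target` is the function described above.

```cmake
cmake_minimum_required(VERSION 3.5)
project(my_app LANGUAGES CXX)

# Locate the installed Celerity package configuration file. You may need to
# point CMake at it, e.g. -DCMAKE_PREFIX_PATH=<install prefix>/lib/cmake.
# The package name "Celerity" is an assumption.
find_package(Celerity CONFIG REQUIRED)

add_executable(my_app main.cc)

# Sets up include paths, libraries and SYCL compilation for the target.
add_celerity_to_target(TARGET my_app SOURCES main.cc)
```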

## Running a Celerity Application

Celerity is built on top of MPI, which means a Celerity application can be executed like any other MPI application (i.e., using `mpirun` or equivalent).
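For example, launching an application on four nodes could look as follows; `my_celerity_app` is a placeholder for your binary:

```sh
# Run a Celerity application like any other MPI program.
mpirun -n 4 ./my_celerity_app
```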

### Environment Variables

  • `CELERITY_LOG_LEVEL` controls the logging output level. One of trace, debug, info, warn, err, critical, or off.
  • `CELERITY_DEVICES` can be used to assign different compute devices to Celerity nodes on a single host. The syntax is as follows: `CELERITY_DEVICES="<platform_id> <first device_id> <second device_id> ... <nth device_id>"`. Note that this should normally not be required, as Celerity will attempt to automatically assign a unique device to each node on a host.
  • `CELERITY_FORCE_WG=<work_group_size>` can be used to force a particular work group size for every kernel and every dimension.
  • `CELERITY_PROFILE_OCL` controls whether OpenCL-level profiling information should be gathered (currently not supported when using hipSYCL).
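As a sketch, these can be set like any other environment variables before launching; all values and the binary name below are placeholders, and how exported variables propagate to remote ranks depends on your MPI implementation:

```sh
export CELERITY_LOG_LEVEL=debug   # verbose logging for debugging
export CELERITY_DEVICES="0 0 1"   # platform 0; devices 0 and 1 for two local nodes

# Some MPI implementations need explicit forwarding, e.g. OpenMPI's -x flag.
mpirun -n 2 ./my_celerity_app
```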
