-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] OCCA Backend Update #1043
base: main
Are you sure you want to change the base?
Conversation
Thanks @kris-rowe! Feel free to tag in myself or @jeremylt at any time. |
Great, I see this compiling. This issue is probably an easy fix, but I get this failure. $ make test search=t001 BACKENDS=/cpu/self/occa
[...]
not ok 3 t001-ceed /cpu/self/occa stderr
# +/home/jed/src/libCEED/backends/occa/ceed-occa.cpp:336 in registerBackend():
# +---[ Error ]--------------------------------------------------------------------
# +File : /home/jed/src/occa/src/types/json.cpp
# +Line : 491
# +Function : operator[]
# +Message : Path '' is not an object
# +Stack
# +4 build/t001-ceed(+0x11a6)
# +3 /usr/lib/libc.so.6(+0x232d0)
# +2 /usr/lib/libc.so.6(__libc_start_main+0x8a)
# +1 build/t001-ceed(+0x10a5) Building with
|
I tracked down the configuration issue. Will see if the rest of the tests pass locally now. |
@@ -0,0 +1,4 @@ | |||
#ifndef _OCCA_INCLUDE_CEED_H_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to include this during JiT, see the example files like this in `/include/ceed/jit-source/cuda/cuda-jit.h'
Any files that will be used as JiT source should go in include/ceed/jit-source/occa
@@ -19,10 +19,11 @@ CEED_QFUNCTION(Mass2DBuild)(void *ctx, const CeedInt Q, | |||
// *INDENT-OFF* | |||
// in[0] is Jacobians with shape [2, nc=2, Q] | |||
// in[1] is quadrature weights, size (Q) | |||
const CeedScalar (*J)[2][CEED_Q_VLA] = (const CeedScalar(*)[2][CEED_Q_VLA])in[0], | |||
*w = in[1]; | |||
typedef CeedScalar array_t[3][CEED_Q_VLA]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that we want to force users to have to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For sure. This is a temporary hack since there is a deadline next week.
I will update the OCCA parser to handle pointers-to-arrays correctly in the near future.
Our CI is back online so I think we'll be able to test this now with CUDA and/or ROCm. We have oneAPI compilers installed, though not with Intel hardware. What would you consider a good choice for testing? |
examples/fluids/qfunctions/channel.h
Outdated
@@ -29,11 +28,12 @@ struct ChannelContext_ { | |||
CeedScalar B; // !< Body-force driving the flow | |||
struct NewtonianIdealGasContext_ newtonian_ctx; | |||
}; | |||
#define ChannelContext struct ChannelContext_*; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#define ChannelContext struct ChannelContext_*; | |
#define ChannelContext struct ChannelContext_* |
I can run this variant of the channel test (which includes output validation) with $ build/fluids-navierstokes -options_file examples/fluids/channel.yaml -compare_final_state_atol 2e-11 -compare_final_state_filename examples/fluids/tests-output/fluids-navierstokes-channel.bin -dm_plex_box_faces 5,5,1 -ts_max_steps 5 -ceed /cpu/self/occa -dm_mat_preallocate_skip 0 -snes_fd_color The trace is:
I'm just testing locally so far; does this work for you at JLSE? |
The OCCA CUDA backend should be sufficient for testing. If you really wanted to test the SYCL backend specifically, you could build the public Intel LLVM compilers with the SYCL CUDA plugin enabled. |
Correct me if I'm wrong, but I understand from the comments and commits that the core tests, t1*-t5* and ex1, ex2 all pass with these changes. In preparation for our upcoming release, I'd like to get this initial work merged. Specifically, I would like to merge the changes in I think a couple of changes need to be added
I can help with any/all of those changes |
Thanks for all the hard work here @kris-rowe! PETSc released last week, so I'd like to tidy up the big open PRs and get a libCEED release soon. I don't want to step on any toes, but I have the time to run down those small tasks I listed above so we get this PR into the release if that's ok with you. |
If you have time that would be a huge help. |
Ok, #1072 should be ready to merge now |
Shall we rebase this after #1072? And is it possible to make block diagonal assembly fall back to CPU for now? This will allow us to run ceed-fluids on PVC with a narrowly localized performance distortion. |
8801fe3
to
d15d972
Compare
#ifndef _OCCA_INCLUDE_CEED_H_ | ||
#define _OCCA_INCLUDE_CEED_H_ | ||
// Phony header to include when compiling OKL | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's really intended that JiT headers don't go here in the backend folder.
d15d972
to
1760b84
Compare
<<<<<<< HEAD | ||
const CeedScalar(*J)[CEED_Q_VLA] = (const CeedScalar(*)[CEED_Q_VLA])in[0]; | ||
const CeedScalar(*w) = in[1]; | ||
|
||
// Outputs | ||
CeedScalar(*q_data_sur)[CEED_Q_VLA] = (CeedScalar(*)[CEED_Q_VLA])out[0]; | ||
======= | ||
typedef CeedScalar vec_t[CEED_Q_VLA]; | ||
const vec_t* J = (const vec_t*) in[0]; | ||
const CeedScalar * const w = in[1]; | ||
// Outputs | ||
vec_t * q_data_sur = (vec_t*) out[0]; | ||
// *INDENT-ON* | ||
>>>>>>> f0e9d76d (Rewrites fluids example qfunctions to be compatible with OCCA.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conflict markers
Here's a suggested test. diff --git i/examples/fluids/blasius.yaml w/examples/fluids/blasius.yaml
index cf3056b1e..fd7516ee6 100644
--- i/examples/fluids/blasius.yaml
+++ w/examples/fluids/blasius.yaml
@@ -4,7 +4,7 @@ implicit: true
ts:
adapt_type: 'none'
type: 'beuler'
- dt: 0.2e-5
+ dt: 0.1e-5
max_time: 1.0e-3
output_freq: 10
@@ -51,3 +51,15 @@ stg:
use: false
inflow_path: "./STGInflow_blasius.dat"
mean_only: true
+
+pmat_pbdiagonal:
+ksp_type: bcgsl
+pc_type: vpbjacobi
+amat_type: shell
+
+# monitors
+ts_monitor:
+snes_monitor:
+ksp_converged_reason:
+
+ceed: /gpu/cuda It should produce output like build/fluids-navierstokes -options_file examples/fluids/blasius.yaml
[...]
0 TS dt 1e-06 time 0.
0 SNES Function norm 6.801978174922e-04
Linear solve converged due to CONVERGED_RTOL iterations 222
1 SNES Function norm 1.515152456321e-07
Linear solve converged due to CONVERGED_RTOL iterations 286
2 SNES Function norm 4.628934945817e-11
1 TS dt 1e-06 time 1e-06
[...] |
For performance monitoring, you can add |
When complete, this PR will update the OCCA backend to be compatible with OCCA v1.4 and add support for the OCCA OpenCL and SYCL backends.
This PR will close #816.
This PR conflicts with #1007.