ENABLE_PARALLELRESTART functionality produces failures on some UFS regression test platforms #368
Comments
I took a look at the issue and want to note the error in more detail for those who do not have access to Hera to look at the logs:
@grantfirl, has @dkokron been made aware of this issue?
Yes indeed. He tried to reproduce the error on Acorn without success.
@grantfirl (cc @dkokron) - Looking at the error message @laurenchilutti included, I am inclined to believe this could be a resource issue. A SIGSEGV at a point where we are most likely asking for more memory down in the NetCDF/HDF layer is indicative of a lack of memory resources. The fact that it works on other machines makes this even more likely, in my mind. Can you double-check that the user environment running these tests sets the shell stack limit to unlimited in the shell startup rc/profile files?
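One way to check the stack limit suggested above is to query it from inside the same environment that launches the test, since resource limits are inherited by child processes. The following is only a minimal sketch using Python's standard `resource` module; the helper name `stack_limit_report` is hypothetical and is not part of the UFS regression test tooling. It would need to run inside the batch job itself (not just on a login node) to reflect what the model executable actually sees.

```python
import resource


def stack_limit_report():
    """Return the soft and hard stack limits of the current process.

    Hypothetical helper for illustration; not part of any UFS or FV3 tool.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)

    def fmt(value):
        # RLIM_INFINITY is how the OS reports an "unlimited" limit.
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return f"{value // 1024} KiB"

    return {
        "soft": fmt(soft),
        "hard": fmt(hard),
        "soft_is_unlimited": soft == resource.RLIM_INFINITY,
    }


if __name__ == "__main__":
    report = stack_limit_report()
    print(f"stack soft limit: {report['soft']}")
    print(f"stack hard limit: {report['hard']}")
    if not report["soft_is_unlimited"]:
        print("Soft stack limit is finite; large automatic arrays may overflow it.")
```

If the soft limit is reported as finite, that would be consistent with the resource hypothesis above, though it would not by itself confirm the cause of the segmentation fault.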
Describe the bug
This is related to the NOAA-EMC fork and dev/emc branch.
During testing of ufs-community/ufs-weather-model#2529 that included atmos_cubed_sphere PR NOAA-EMC#89, the following error was noted:
The control_restart_p8_intel test is failing with a segmentation fault.
The err log shows the first non-library error as:
0x0000000002232a28 fv_io_mod_mp_fv_io_read_restart_() /scratch1/BMC/gmtb/Grant.Firl/ufs-weather-model-grantfirl/FV3/atmos_cubed_sphere/tools/fv_io.F90:495
To Reproduce
Set ENABLE_PARALLELRESTART to ON in CMakeLists.txt and run the control_restart_p8_intel UFS regression test on Hera or Hercules (the error was not reproduced on Acorn, according to @dkokron).
Expected behavior
The test completes without error.
System Environment
Hera and Hercules UFS RT environment
Additional context
See NOAA-EMC/fv3atm#896 for some related discussion.