GFDL->main PR regression and bug fixes #782

Merged

marshallward
Member

Several fixes to the candidate PR to main:

  • Fixed bugs due to introduction of MEKE biharmonic friction
  • Restored performance lost to the new visc-limit flag in MEKE FrictWork. The limit is now applied separately, and only if EY24_EBT_BS is enabled.
  • Restored performance lost to the multiple damped MEKE diagnostics. Nearly every diagnostic calculation is now conditional.
  • Bugfix to an internal tide control struct declaration

There is also a commit to restore answers related to MEKE backscatter, but this is still under some discussion. I'm leaving it in for now, but we may change or remove this one if NCAR approves.

This needs review from one of the following:

Further testing will eventually be needed from @alperaltuntas and @jiandewang, but this can wait until Wenda and/or Elizabeth have reviewed the modifications.

The MEKE GM computations of `src` and `src_GM`, a diagnostic array, were
placed in a single loop.  The similar RHS of each expression made it
unfavorable for the compiler to use FMAs on the `src` update.  Older
production runs depending on this FMA were seeing answer changes.

This patch restores the FMA loop update of `src` by splitting the `src`
and `src_GM` updates into separate loops.
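To illustrate why losing an FMA changes answers, here is a minimal Python sketch. It is not MOM6 code: the `fma` helper is a hypothetical emulation that uses exact rational arithmetic so that `a*b + c` is rounded only once, and the input values are chosen purely for demonstration.

```python
from fractions import Fraction


def fma(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add: a*b + c with one final rounding."""
    # Fraction arithmetic is exact; float() then rounds once to nearest.
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def unfused(a: float, b: float, c: float) -> float:
    """Ordinary evaluation: the product is rounded before the addition."""
    return a * b + c


# Illustrative values only: the exact product is 1 - 2**-60,
# which rounds to 1.0 in double precision.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

print(unfused(a, b, c))  # rounded product cancels against c
print(fma(a, b, c))      # the low-order bits of the product survive
```

The two results differ in the last bits, which is enough to break bit reproducibility in a long model run.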
This patch makes several adjustments to MOM_MEKE.F90 and
MOM_hor_visc.F90 to ensure that the Laplacian and biharmonic friction
coefficients are computed separately, and only if their respective terms
are enabled.

This resolves some subtle bugs where the default biharmonic value of -1
was applied to the Laplacian case, even when the biharmonic MEKE
friction was disabled.
@marshallward
Member Author

The regression is due to the visc_lim_* diagnostics, which are now registered only if CS%EY24_EBT_BS is enabled.

Although this is a regression into NOAA-GFDL:gfdl-to-main-2024-11-27, it should not cause a regression of NOAA-GFDL:gfdl-to-main-2024-11-27 into main.

@Wendazhang33
Wendazhang33 commented Dec 20, 2024 via email

@marshallward marshallward force-pushed the ebt_src_gm_split branch 4 times, most recently from 4f000cd to 47487a4 Compare December 23, 2024 14:38
The if-tests inside the FrictWork loops are likely to impede
performance.  Even if the total work is reduced, they are likely to
interrupt pipelines.  When EY24_EBT_BS is disabled, they will clearly
reduce performance.

This patch moves those tests outside of the if-block and applies them
separately.

(Calculation would be slightly improved if the meaning of the flag were
reversed, but I don't want to make additional changes.)
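As a rough sketch of the restructuring described in this commit message (not the actual MOM6 loops), the Python below contrasts a per-element flag test with a version that tests the flag once and applies the limit in a separate pass. The function names, the `coeff * strain` work expression, and the `max` clamp are all hypothetical stand-ins.

```python
def frict_work_branch_inside(coeff, strain, limit, use_limit):
    """Original pattern: the flag is tested on every loop iteration."""
    out = []
    for c, s, lim in zip(coeff, strain, limit):
        w = c * s
        if use_limit:          # loop-invariant test repeated per element
            w = max(w, lim)    # hypothetical limiter
        out.append(w)
    return out


def frict_work_hoisted(coeff, strain, limit, use_limit):
    """Restructured pattern: branch-free main loop, limiter applied separately."""
    out = [c * s for c, s in zip(coeff, strain)]
    if use_limit:              # tested once, outside the main loop
        out = [max(w, lim) for w, lim in zip(out, limit)]
    return out


coeff = [1.0, 2.0, 3.0]
strain = [-0.5, 0.25, -2.0]
limit = [-1.0, -1.0, -1.0]

for flag in (False, True):
    assert frict_work_branch_inside(coeff, strain, limit, flag) == \
        frict_work_hoisted(coeff, strain, limit, flag)
```

Both forms give identical values; the hoisted form simply keeps the main loop free of branches when the flag is disabled.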
The damping MEKE loop also included updates to multiple diagnostics,
even if they were not registered.  This would presumably have a negative
impact on performance.

This patch moves each diagnostic into a separate loop.  It also
conditionally precomputes the damping and damp_rate parameters, which
are now stored as 2d arrays rather than in-loop scalars.

As before, the MEKE calculation is left unchanged in order to preserve
bit reproducibility.
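The pattern described here can be sketched in Python as follows. This is an assumption-laden illustration, not the MOM_MEKE.F90 implementation: the function name, the `diag_ids` argument, the diagnostic names, and the damping formula are hypothetical. The point is that the main update uses one unchanged expression, while each diagnostic gets its own loop that runs only when requested.

```python
def damp_meke(meke, damp_rate, dt, diag_ids=()):
    """Damp a 2D field and fill only the diagnostics that were requested."""
    nj, ni = len(meke), len(meke[0])

    # Precompute the damping factor as a 2D array (hypothetical formula).
    damping = [[1.0 / (1.0 + dt * damp_rate[j][i]) for i in range(ni)]
               for j in range(nj)]

    # Main update: the same expression regardless of which diagnostics exist.
    meke_new = [[meke[j][i] * damping[j][i] for i in range(ni)]
                for j in range(nj)]

    diags = {}
    # Each diagnostic lives in its own loop, skipped when not registered.
    if "decay" in diag_ids:
        diags["decay"] = [[damp_rate[j][i] * meke_new[j][i] for i in range(ni)]
                          for j in range(nj)]
    if "damping" in diag_ids:
        diags["damping"] = damping
    return meke_new, diags


meke = [[2.0, 4.0]]
rate = [[1.0, 3.0]]
plain, no_diags = damp_meke(meke, rate, dt=1.0)
with_diag, diags = damp_meke(meke, rate, dt=1.0, diag_ids=("decay",))
assert plain == with_diag  # diagnostics never alter the main result
```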
Redefining the int_tide_CS control struct in set_diffusivity_init
caused errors in debug mode with Intel compilers.  The issue appears to
be an internal function that expects a pointer rather than the type.

This patch reverts it to a pointer.  We can revisit this if there
is a need to reduce reliance on pointers.
@marshallward
Member Author

I've made several adjustments based on suggestions by @Wendazhang33

  • The reverted mom_src update for MEKE_BACKSCAT_RO_C was un-reverted. Although this expression was required to produce the original bit-reproducible solutions, it was also wrong for two reasons:

    • Code following the endif should have been in an else-block, causing terms to be double-computed. (I had originally prepared a bug flag to replicate this effect, but that was also removed.)

    • The expression would have only been correct for FrictWork_bug = True.

    The new expression is both concise and correct for both values of FrictWork_bug, and NCAR supports the existing answer-changing expression, so I have removed this commit.

  • Some flow control is more accurately based on FrCoeff rather than allocation state.

  • Diagnostics which depend on FrCoeff are not registered if FrCoeff is unset.

  • Fixed two bugs caught by Wenda

I believe this is now ready for review and for merging into the PR to main.

This patch updates the expression for FrictWork_bh (biharmonic
frictional work) when the FrictWork_bug flag is enabled.  The new form
is symmetric to rotations when FMA instructions are enabled.
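For context on why FMAs can break rotational symmetry, consider a sum of two products in which a rotation swaps the roles of the two terms. The Python sketch below is illustrative only and is not the FrictWork_bh expression: `fma` is a hypothetical exact-arithmetic emulation, and the values are chosen so that the effect is visible.

```python
from fractions import Fraction


def fma(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add: a*b + c with one final rounding."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def asymmetric(p, q, r, s):
    """p*q + r*s where only the first product is fused into the add."""
    return fma(p, q, r * s)


def symmetric(p, q, r, s):
    """Both products are rounded before the add, so the terms are treated alike."""
    return (p * q) + (r * s)


# Illustrative values; swapping the argument pairs mimics a rotation
# that exchanges the two terms.
x = 1.0 + 2.0**-30
y = -(1.0 + 2.0**-29)

print(asymmetric(x, x, y, 1.0), asymmetric(y, 1.0, x, x))
print(symmetric(x, x, y, 1.0), symmetric(y, 1.0, x, x))
```

The asymmetric form returns different values depending on which term is fused, while the symmetric form is unchanged when the terms are swapped.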
Member

@Hallberg-NOAA Hallberg-NOAA left a comment

I have examined all of these changes, and I have checked out the updated code and used it to run the MOM6-examples regression suite. The change in diagnostics in the TC testing is properly explained. I am satisfied that this is correct and that it is likely to address the issues that were identified with the 2024-11-27 GFDL-to-main (mom-ocean#1647) pull request.

@marshallward
Member Author

Gaea regression: https://gitlab.gfdl.noaa.gov/ogrp/mom6ci/MOM6/-/pipelines/25932 ✔️ 🟡

This passes our own tests; I will merge this into the PR for consortium review.

@marshallward marshallward merged commit c346c73 into NOAA-GFDL:gfdl-to-main-2024-11-27 Jan 3, 2025
8 of 10 checks passed
@marshallward marshallward deleted the ebt_src_gm_split branch January 14, 2025 14:50