Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test against DM implementation of CTInterpolator #84

Merged
merged 7 commits into from
Sep 30, 2024

Conversation

arunkannawadi
Copy link
Member

@arunkannawadi arunkannawadi commented Jul 30, 2024

Since the interpolator objects can now be specified, I have added a test in this PR to compare that the CT interpolation implemented in DM behaves identical to the implementation here. The implementation differs in that i) the computationally expensive search function under the hood that finds good pixels around bad pixels is implemented in C++ instead of numba for policy reasons and ii) it uses the SpanSet data structure to find the bad pixels more efficiently than the multipass brute-force search implemented here. The test here ensures that the end results are the same and will highlight any regressions in the future.

@arunkannawadi arunkannawadi force-pushed the DM_CT_interpolator branch 4 times, most recently from 59bb716 to 93d047d Compare July 30, 2024 23:50
@arunkannawadi arunkannawadi marked this pull request as ready for review July 30, 2024 23:54
@arunkannawadi arunkannawadi requested a review from esheldon July 30, 2024 23:54
descwl_coadd/tests/test_coadd_and_psf_coadd_agree.py Outdated Show resolved Hide resolved
@@ -78,7 +78,7 @@ def test_coadd_image_correct(crazy_wcs, crazy_obj):
world_origin = galsim.CelestialCoord(0 * galsim.degrees, 0 * galsim.degrees)

aff = galsim.PixelScale(scale).affine()
aff = aff.withOrigin(
aff = aff.shiftOrigin(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is shiftOrigin more appropriate than withOrigin here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been deprecated since GalSim v2.3. I just noticed it when looking at the GitHub Actions logs when I had a failure in the beginning.

https://github.com/GalSim-developers/GalSim/blob/f6499baeadf433a34f98f5c918b2580e6d28bf7b/docs/older.rst#L92

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

withOrigin is basically a call to shiftOrigin with a deprecation warning:

https://github.com/GalSim-developers/GalSim/blob/f6499baeadf433a34f98f5c918b2580e6d28bf7b/galsim/wcs.py#L699-L702

Comment on lines 226 to 227
assert (coadd_data['coadd_exp'].image.array ==
dm_coadd_data['coadd_exp'].image.array).all()
Copy link
Collaborator

@beckermr beckermr Jul 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This level of agreement is surprising in that you'd need the set of good pixels around the bad ones to match exactly.

Let's add an assert that some data was actually interpolated in the test. Otherwise things would appear to agree but it'd be because we didn't do any interpolation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic of selecting the bad pixels is exactly the same and involves only integer tuples. The floating point operations are all done by the scipy function, so I am not surprised that the agreement is exact. But I'll add a test to see that some pixels are actually interpolated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mentioned in the pr description that the dm version was more efficient since it used some sort of spanset data structure / algorithm. I assumed this might not give the same results in all cases. Is that not true?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I said that we have a different implementation of this function in the DM codebase: https://github.com/LSSTDESC/descwl_coadd/blob/master/descwl_coadd/interp.py#L13-L14

Once we get the location of good pixels and bad pixels, we call the scipy.CloughTocher2DInterpolator the same way as done here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, here's the C++ implementation of the function that behaves identical to the function I posted above: https://github.com/lsst/meas_algorithms/blob/main/src/CloughTocher2DInterpolatorUtils.cc

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch with the pixels not being interpolated in the first place, Matt. I'm going to do some more local testing and raise it again later. Marking it as a draft for now.

@arunkannawadi arunkannawadi force-pushed the DM_CT_interpolator branch 3 times, most recently from 10b2217 to b6a6c0f Compare July 31, 2024 16:49
@arunkannawadi arunkannawadi marked this pull request as draft July 31, 2024 20:14
@arunkannawadi
Copy link
Member Author

arunkannawadi commented Sep 6, 2024

It turns out that the Delaunay triangulation happening under the hood is not invariant under x-y flip (not too surprising). The DM implementation uses (x, y) notation which is flipped relative to the [row, col] notation used with the numpy arrays here. This results in slightly different values in the output of scipy.CloughTocher2DInterpolator. I've included a PR in meas_algorithms to support either modes. Both modes are self-consistent and I don't think we can say that one is better than the other. The tests should run fine once a weekly release with that PR is available on stackvana.

@arunkannawadi arunkannawadi force-pushed the DM_CT_interpolator branch 3 times, most recently from 568ee55 to c029a46 Compare September 16, 2024 15:09
@arunkannawadi arunkannawadi marked this pull request as ready for review September 16, 2024 16:31
@arunkannawadi
Copy link
Member Author

w37 appeared on stackvana over the weekend, which means this PR is ready to be reviewed and merged.

actual=dm_coadd_data['coadd_exp'].image.array,
desired=coadd_data['coadd_exp'].image.array,
atol=0.0,
rtol=1653.16,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we allow such a large rtol?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these values are empirically set based on observed differences and are intended to act as benchmarking rather than tests. For small pixel values close to zero, this shouldn't be surprising but this is just how sensitive CTInterpolator is to the choice of coordinate frames.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reducing rtol and running the tests gives me this.

image

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test is better specified in atol anyways - the values cross zero

descwl_coadd/tests/test_coadd.py Show resolved Hide resolved
actual=dm_coadd_data['coadd_exp'].image.array,
desired=coadd_data['coadd_exp'].image.array,
atol=0.0,
rtol=1653.16,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test is better specified in atol anyways - the values cross zero

@arunkannawadi
Copy link
Member Author

Why does values crossing zero matter? rtol is the ratio between the absolute value of the difference to the absolute value of the desired value.

@beckermr
Copy link
Collaborator

Here is the formula from the numpy docs:

absolute(a - b) <= (atol + rtol * absolute(b))

If b is close to zero and atol=0, then rtol may need to be quite large to acomodate a modest change in the absolute value. For example, if a=1e-10 and b=1e-16, then you'd need rtol > |a-b|/b ~ 9.99e5 for the test to pass. On the other hand, an absolute tolerance of ~1e-7 might be more sensible if you think both numbers effectively mean zero.

In this case, the image values are of order -1 to 1 and sometimes quite small. Thus we hit the math above near zero quite a bit, causing the need for a large rtol. Using an absolute tolerance of a few tenths is more sensible and expresses better what is happening.

@arunkannawadi
Copy link
Member Author

That's all consistent with what I told Erin. There's a test with atol as well in addition. If this rtol is too high, then this is as good as not having a test with rtol anyway. Is the concern that a future version of numpy or scipy might make this test fail?

@beckermr
Copy link
Collaborator

Yes, I suspect the test based on atol will be more robust.

@arunkannawadi
Copy link
Member Author

@esheldon @beckermr am I good to merge this?

@esheldon
Copy link
Collaborator

LGTM

@arunkannawadi
Copy link
Member Author

I'll merge this despite the stale 'Changes requested' from @beckermr since i) the comments have been addressed and ii) this is largely an addition to unit tests and not a change to the behavior of the code

@arunkannawadi arunkannawadi merged commit 2e771d7 into master Sep 30, 2024
1 check passed
@arunkannawadi arunkannawadi deleted the DM_CT_interpolator branch September 30, 2024 18:58
@beckermr
Copy link
Collaborator

Thanks and sorry for missing the formal lgtm!

@arunkannawadi
Copy link
Member Author

Well, we do need your help now. GHA is failing with

InvalidSpec: The package "conda-forge/noarch::mpi==1.0=mpich" is not available for the specified platform

that was not the case about a week ago.

@beckermr
Copy link
Collaborator

Yep - will get fixed by conda-forge/admin-requests#1094

Just merged - will be maybe 30 mins

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants