Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add smoothing spline baseline #299

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

aafkevandenberg
Copy link
Collaborator

I added functions to fit a baseline with a smoothing spline.
The smoothing factor is optimized using k-fold cross validation.
Therefore I added a function to plot the mse on the test data vs the smoothing factors.

Let me know what you think.

JoepVanlier and others added 5 commits April 15, 2022 17:10
I think there is a little tricky thing here regarding the testing. Because the distance should be used as-is (with baseline) and the force should be baseline corrected, you end up with an implicit equation in the trap position:

`(trap_position - trap2_ref) = tether_length + 2 * bead_radius + 2 * (wlc_force + baseline(trap_position)) / stiffness`

Note how trap position appears on the left and right thanks to the baseline.
Copy link
Member

@JoepVanlier JoepVanlier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I've had a first look at this and come to the same conclusion as you, that the smoothing factor based optimization doesn't really work that well unless you also downsample. This suggests that we should provide a good default for this. Otherwise, looking pretty good already!

For tests, I think adding a test which tests unique_sorted and one that tests the baseline (both code paths, with fixed and non-fixed smoothing factor) should suffice.

Adding a package involves adding it to setup.py. For conda, it might be a bit trickier since I don't see this package on conda-forge. I'd have to look into it.

To muddy the waters a bit more, another class of algorithms that's probably worth exploring for this purpose is LO(W)ESS regression.

Comment on lines +19 to +24
x = trap_position.data
u, c = np.unique(x, return_counts=True)
m = np.isin(x, [u[c < 2]])
ind = np.argsort(x[m])

return x[m][ind], force.data[m][ind]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could consider simplifying this to:

def unique_sorted(trap_position, force):
    """Sort and remove duplicates trap_position data to prepare for fit smoothing spline.

    Parameters
    ----------
    trap_position : array_like
        Trap mirror position
    force : array_like
        Force data
    """

    sorted_position, idx, count = np.unique(trap_position, return_index=True, return_counts=True)
    return sorted_position[count < 2], force[idx][count < 2]

Saves you a search and a sort (note that the result from np.unique is already sorted).

Given that you return the raw numpy arrays rather than slices, I would also consider just taking raw numpy arrays as input and extracting the data where its called (rather than passing a slice).

Trap mirror position data
force : lumicks.pylake.Slice
Force data
smoothing_factor : float
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether it would make sense to merge smoothing_factor and smoothing_factors.

If the input is just a value or has only one element, you use it as fixed value and otherwise you perform the optimization. You can use np.atleast_1d to upconvert a value to a numpy array so that you can always use the len operation on it. The default can then just be a default list that generally works.

)
model = csaps(x_sorted, y_sorted, smooth=smoothing_factor)

return cls(model, trap_position, force)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One issue with calling the constructor like this is that you don't actually put the data that you used for fitting in now (you fitted to x_sorted and y_sorted but store the original raw data).

Comment on lines +49 to +50
mse_test_vals = np.zeros(len(smoothing_factors))
x_sorted, y_sorted = unique_sorted(trap_position, force)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I find a bit surprising is that you choose to pass trap position and force to this function, despite already having x_sorted and y_sorted. Is there a specific reason you prefer this?

force,
smoothing_factors,
n_repeats,
plot_smoothingfactor_mse,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering that the function has smoothing_factor in the name, I reckon plot_mse would be sufficient.

@JoepVanlier JoepVanlier changed the base branch from main to piezo_tracking May 5, 2022 13:58
@JoepVanlier JoepVanlier changed the base branch from piezo_tracking to main May 5, 2022 13:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants