Add average_log_prob args for cpo #510
Conversation
Signed-off-by: Mecoli1219 <[email protected]>
TRL is using the default as in the official repo for CPO: https://github.com/fe1ixxu/CPO_SIMPO/blob/main/scripts/cpo_trainer.py#L626
Co-authored-by: Kashif Rasul <[email protected]>
Signed-off-by: Mecoli1219 <[email protected]>
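For context on the comment above about the official repo's default: a minimal sketch of what an average_log_prob switch typically toggles when scoring a completion, i.e. summing per-token log probabilities versus length-normalizing them. This is plain PyTorch for illustration only; the helper name batch_logps and its signature are hypothetical and not the code from TRL, the CPO repo, or the Liger kernel.

```python
import torch

def batch_logps(logits, labels, average_log_prob, ignore_index=-100):
    """Score completions from (batch, seq, vocab) logits and (batch, seq) labels.

    Hypothetical helper: with average_log_prob=False the per-token log probs
    are summed; with True they are averaged over the non-masked tokens
    (length normalization). Real trainers also shift labels/logits by one
    position; that is omitted here for brevity.
    """
    mask = labels != ignore_index
    safe_labels = labels.masked_fill(~mask, 0)
    per_token = (
        torch.log_softmax(logits, dim=-1)
        .gather(-1, safe_labels.unsqueeze(-1))
        .squeeze(-1)
    )
    per_token = per_token * mask  # zero out padded positions

    if average_log_prob:
        return per_token.sum(-1) / mask.sum(-1)  # mean over real tokens
    return per_token.sum(-1)                     # plain sum
```

Whether the default sums or averages is exactly the knob the new average_log_prob argument exposes.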
@@ -139,7 +140,7 @@ def forward(self, x, y):
 @pytest.mark.parametrize(
     "scalar, dtype, atol, rtol",
     [
-        (1.0, torch.bfloat16, 5e-3, 5e-3),
+        (1.0, torch.bfloat16, 5e-2, 5e-2),
What's the reasoning behind this adjustment?
@kashif and I found that after disabling average_log_prob for CPO, the result deviates more from the HF implementation when the model is large and the dtype is bf16. Since the two methods still produce close results, we increased atol and rtol to make this test pass.
As bfloat16 is less accurate for larger numbers, this is needed to make the test pass, and it matches the tolerances used in the other bfloat16 tests.
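As a rough illustration of why the looser tolerance is reasonable (a sketch, not part of this PR): bfloat16 keeps only an 8-bit significand, so the gap between adjacent representable values grows with magnitude, and a long chain of bf16 operations can drift from an fp32 reference by noticeably more than fp32-level tolerances allow.

```python
import torch

# bfloat16 keeps an 8-bit significand, so adjacent representable values are
# roughly 0.4-0.8% of the magnitude apart.
print(torch.finfo(torch.bfloat16).eps)         # 0.0078125
print(torch.tensor(100.4).to(torch.bfloat16))  # rounds to 100.5

# A single cast already costs a few 1e-3 of relative precision; a large model
# run in bf16 chains many such roundings, so its outputs can differ from an
# fp32 reference by more than the 5e-3 tolerance used for the fp32 cases.
x = torch.randn(8, 8) * 50
torch.testing.assert_close(
    x.to(torch.bfloat16).to(torch.float32), x, atol=5e-2, rtol=5e-2
)
```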
Then adjusting tol makes sense. ❤️
Thank you both for making this PR. Hopefully, it unblocks huggingface/trl#2506.
awesome thank you! we would still need a release of liger-kernel for the CI to pass, but yes it will hopefully unblock!
Summary

The trl CPO implementation does not average the log probs, while the Liger kernel averages them when computing the loss, which causes a mismatch when integrating the two. This PR adds an average_log_prob argument for CPO so the behaviors can be matched (see the usage sketch after the checklist below).

Testing Done
- make test to ensure correctness
- make checkstyle to ensure code style
- make test-convergence to ensure convergence
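To make the integration concrete, here is a rough usage sketch. The import path and class name LigerFusedLinearCPOLoss, and the other constructor arguments shown, are assumptions about the Liger kernel API and may not match the released signature exactly; only the average_log_prob argument itself comes from this PR.

```python
# Assumed import path and class name for the Liger chunked CPO loss.
from liger_kernel.chunked_loss import LigerFusedLinearCPOLoss

# Only `average_log_prob` is the argument added in this PR; `beta` is an
# assumed pre-existing parameter. Setting average_log_prob=False mimics trl's
# CPO behavior (summed per-token log probs); True keeps the length-normalized
# variant.
cpo_loss = LigerFusedLinearCPOLoss(beta=0.1, average_log_prob=False)
```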