Update: Test for Compatibility with Transformers 4.48 #239
We recently encountered a failure due to the release of Transformers 4.48. Details of the failure can be found here: GitHub Actions Failure.
Cause
The failure occurred because the Llama model definition in Transformers was updated to use a single rotary embedding shared across the model, replacing the previous implementation, which had one rotary embedding per attention block. This change was introduced in the following commit: Transformers Commit (line 261).
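A quick way to see the structural change is to count the rotary-embedding submodules in a Llama model. The snippet below is a minimal sketch, assuming a Llama checkpoint is available; the checkpoint name and the class-name check are illustrative:

```python
from transformers import AutoModelForCausalLM

# Minimal sketch: count rotary-embedding submodules in a Llama model.
# On Transformers <= 4.47 each attention block carried its own
# LlamaRotaryEmbedding (in addition to the shared one); on >= 4.48 only
# the single model-level instance remains, so any test that hardcodes
# the old count breaks.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
rotary_modules = [
    name
    for name, module in model.named_modules()
    if module.__class__.__name__ == "LlamaRotaryEmbedding"
]
print(len(rotary_modules))  # roughly num_layers + 1 on 4.47.x vs. 1 on 4.48.0
```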
Fix and Validation
Instead of relying on hardcoded values, we now count the modules in question and check that the count stays the same after the quantization config is applied. Credits to @horheynm for the better fix #238. The test now passes with all versions of Transformers, as demonstrated below:
Transformers 4.47.1
Transformers 4.48.0
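For reference, the counting approach looks roughly like the sketch below. It is a simplified illustration, not the repository's exact test code; `count_modules` is a hypothetical helper, and the config-application step is shown as an assumed call rather than a concrete config:

```python
import torch
from transformers import AutoModelForCausalLM

def count_modules(model, module_type):
    """Count submodules of a given type instead of hardcoding a number."""
    return sum(1 for _, m in model.named_modules() if isinstance(m, module_type))

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")

# Count the targeted modules before applying the quantization config...
num_before = count_modules(model, torch.nn.Linear)

# ...apply the config (assumed entry point; building a concrete
# QuantizationConfig is omitted here for brevity):
# from compressed_tensors.quantization import apply_quantization_config
# apply_quantization_config(model, quant_config)

# ...and assert the count is unchanged, regardless of which Transformers
# version is installed or how rotary embeddings are laid out.
num_after = count_modules(model, torch.nn.Linear)
assert num_before == num_after
```

Because the expected value is derived from the model itself rather than pinned to a specific Transformers release, the test stays valid across module-layout changes like the rotary-embedding refactor.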
Summary
This update ensures compatibility with the latest Transformers release (4.48) while maintaining support for previous versions. All relevant tests now pass, confirming the fix.