support telechat2 #35415

Open · wants to merge 51 commits into base: main
Conversation

@shunxing12345 shunxing12345 commented Dec 25, 2024

What does this PR do?

Description

This PR introduces TeleChat2, a new model designed for [specific use case, e.g., multilingual conversational AI or general-purpose language modeling]. TeleChat2 supports model sizes of [e.g., 3B, 7B, and 35B] and has been pre-trained and fine-tuned on large English and Chinese datasets.

Motivation

Adding TeleChat2 to the Transformers repository allows the community to benefit from its capabilities, leveraging state-of-the-art performance on [specific benchmarks or tasks]. This inclusion aligns with the repository's goal of supporting a diverse range of models and architectures.

Key Changes

  1. Added TeleChat2Config for model configuration.
  2. Added TeleChat2Model, TeleChat2ForCausalLM, and corresponding tokenizers.
  3. Included detailed documentation and tests to ensure compatibility and coverage.
  4. Updated integration points in AutoModel and other relevant files.
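The AutoModel integration mentioned in point 4 follows a registration pattern: a config class is mapped to its model class so that an Auto class can dispatch on the config type alone. The following is a toy, self-contained sketch of that idea only; it is not the real transformers API, and the class names are hypothetical stand-ins for the classes added by this PR.

```python
# Toy illustration of Auto-class registration (NOT the real transformers
# API): a registry maps a config class to its model class, so an
# AutoModel-style lookup can dispatch on the config type alone.

class TeleChat2Config:  # hypothetical stand-in for the PR's config class
    model_type = "telechat2"

    def __init__(self, hidden_size=64, num_layers=2):
        self.hidden_size = hidden_size
        self.num_layers = num_layers

class TeleChat2Model:  # hypothetical stand-in for the PR's model class
    def __init__(self, config):
        self.config = config

MODEL_REGISTRY = {}

def register(config_cls, model_cls):
    # In transformers this role is played by the auto-class mappings.
    MODEL_REGISTRY[config_cls] = model_cls

def auto_model_from_config(config):
    # Dispatch on the config's class, as AutoModel does internally.
    return MODEL_REGISTRY[type(config)](config)

register(TeleChat2Config, TeleChat2Model)
model = auto_model_from_config(TeleChat2Config(hidden_size=128))
print(type(model).__name__, model.config.hidden_size)
```

The real integration additionally touches the name-to-module mappings in the auto files, but the dispatch-on-config idea is the same.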

Documentation and Tests

  • Model architecture, configuration, and usage examples are documented in [relevant files].
  • Added unit tests for pretraining and fine-tuning workflows.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

xiangw2 and others added 23 commits December 18, 2024 13:37
Signed-off-by: xiangw2 <[email protected]>
fix
Signed-off-by: shunxing12345 <[email protected]>
@shunxing12345
Author

Hi team, 👋

Just following up on this PR. I understand you may have a lot on your plate, but I’d love to hear your thoughts or feedback on this contribution whenever you have time. Please let me know if there’s anything I can do to assist or improve the PR!

Thanks for your time and all the great work you do maintaining this repository! 🙏

Best,

shunxing12345 and others added 5 commits January 2, 2025 15:48
Signed-off-by: shunxing12345 <[email protected]>
@LysandreJik
Member

Hey @shunxing12345 👋

Most of us are on holiday this week for the new year. The rhythm will pick up early next week, and the team will review your PR as soon as possible. Thanks a lot for your patience and your contribution 🤗

Signed-off-by: shunxing12345 <[email protected]>
@shunxing12345
Author

@Cyrilvallez Thank you for your support! I noticed that some classes or methods differ only in terms of their parameters, while the overall logic remains consistent.

To avoid duplicating code, could you suggest a best practice for handling such cases? For example:

  1. Would it be appropriate to introduce a base class or utility method to centralize the shared logic and allow parameter customization through inheritance or function arguments?
  2. Alternatively, is there a recommended pattern in the modular design to efficiently handle such cases?

I truly appreciate your insights and guidance on this matter. Thank you for your time and support!

@stevhliu (Member) left a comment

Thanks for the docs!

docs/source/en/model_doc/telechat2.md (two review comments, outdated/resolved)
@shunxing12345
Author

> Thanks for the docs!

@stevhliu Thank you for your review! I have updated the docs as requested. Please let me know if there's anything else that needs adjustment. 😊

Signed-off-by: shunxing12345 <[email protected]>
@shunxing12345
Author

> Hey! Thanks for the contribution and your patience! From what I can see, this is already in a very healthy state! 🤗 You are on par with our latest attention standards 👌 However, you are missing a modular file! We no longer use the "Copied from" syntax, in favor of a "modular file". You can get started with this link, and check other models' modular files (look for modular_xxx.py in the library, e.g. modular_mistral) for examples! We basically use inheritance in the modular file, which is then automatically unravelled in the "modeling file" based on the inherited model. So, for example, classes which are fully "Copied from" can simply inherit from the base class 😉 Let me know if you need more guidance or have any questions!

Hi @Cyrilvallez ,

Thank you so much for your feedback and detailed guidance! 😊 I’ve addressed the issue and added the modular file as suggested. Please let me know if there’s anything else I should refine or improve further.

Thanks again for your support!
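The modular pattern discussed above can be sketched with a toy example. The names here are illustrative stand-ins, not the actual transformers classes: the point is only that a TeleChat2 class which would otherwise be fully "Copied from" can simply inherit from the Llama-style base and override the one thing that differs.

```python
# Toy sketch of the "modular" idea (names are illustrative, not the real
# transformers classes): TeleChat2 code inherits from a Llama-like base
# and overrides only what actually differs, instead of copying the class.

class LlamaMLP:  # stand-in for the inherited base implementation
    def __init__(self, hidden_size, intermediate_size, bias=False):
        self.hidden_size = hidden_size
        self.intermediate_size = intermediate_size
        self.bias = bias

    def describe(self):
        return f"MLP({self.hidden_size}->{self.intermediate_size}, bias={self.bias})"

class TeleChat2MLP(LlamaMLP):
    # Identical logic; only a parameter default differs, so the subclass
    # body stays tiny -- the modular converter would unravel this into a
    # full standalone class in the generated modeling file.
    def __init__(self, hidden_size, intermediate_size, bias=True):
        super().__init__(hidden_size, intermediate_size, bias=bias)

mlp = TeleChat2MLP(64, 256)
print(mlp.describe())
```

In the real modular file, the base classes come from the library (e.g. the Llama modeling code), and the converter expands each subclass into a complete class in `modeling_telechat2.py`.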

@Cyrilvallez (Member) left a comment


Hey! Basically, the easiest approach (if you do not already have many model checkpoints on the Hub?) is to match your config argument names and model layer names to the existing ones in Llama. It is extremely important that everyone always sees the same names, so they immediately know what they are instead of having to figure them out for every model/config. For the config this is straightforward; model checkpoints, however, will require a conversion script (like e.g. https://github.com/huggingface/transformers/blob/main/src/transformers/models/glm/convert_glm_weights_to_hf.py) to change the layer names and, in your case, split the fused k/v layer in two!

Doing so will greatly simplify the modular as you will be able to inherit everything directly! (I did not review all the file, just the beginning so that you can understand my point!)
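The k/v-splitting step suggested above could look roughly like the sketch below. This is a hedged illustration, not the PR's actual conversion script: plain nested lists stand in for checkpoint tensors, the weight names are hypothetical, and it assumes the key rows come first in the fused weight (the real layout must be verified against the original checkpoint).

```python
# Hedged sketch of one conversion-script step: splitting a fused key/value
# projection weight into separate k_proj and v_proj weights, to match
# Llama-style layer names. Plain nested lists stand in for real tensors.

def split_fused_kv(fused_weight):
    """Assume the first d rows are key rows and the last d are value rows."""
    rows = len(fused_weight)
    assert rows % 2 == 0, "fused k/v weight must have an even number of rows"
    d = rows // 2
    return fused_weight[:d], fused_weight[d:]

def rename_and_split(state_dict):
    # Map old fused names to Llama-style names (names are illustrative).
    out = {}
    for name, weight in state_dict.items():
        if name.endswith("key_value.weight"):
            prefix = name[: -len("key_value.weight")]
            k, v = split_fused_kv(weight)
            out[prefix + "k_proj.weight"] = k
            out[prefix + "v_proj.weight"] = v
        else:
            out[name] = weight
    return out

old = {"layers.0.self_attention.key_value.weight": [[1, 2], [3, 4], [5, 6], [7, 8]]}
new = rename_and_split(old)
print(sorted(new))
```

A real script would do the same over checkpoint tensors (splitting along the output dimension) and then save the renamed state dict in the Hub format.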

src/transformers/models/telechat2/modular_telechat2.py (three review comments, outdated/resolved)
Signed-off-by: shunxing12345 <[email protected]>
@shunxing12345
Author

Hi @Cyrilvallez

Thanks for your guidance and suggestions! I’ve addressed the issues and completed the necessary updates as per your feedback. Please let me know if there’s anything else I should refine. 😊

Best regards!

5 participants