Additionally, we should simplify LoRA/QLoRA with 8-/4-bit loading. Ideally we get rid of the adapter: qlora option, since QLoRA is just a specific subset of LoRA: all linear layers are targeted and the base model is loaded in 4-bit. I think we can reduce this to a single lora adapter type and let the user set either 4-bit or 8-bit loading independently. If a user still selects qlora, we warn about the specific configuration it implies - 4-bit quantization and targeting all linear layers.
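To make that concrete, here is a minimal sketch of what the collapsed option could look like in plain PEFT/bitsandbytes terms. The model name and LoRA hyperparameters are placeholders, and the all-linear shortcut assumes a recent PEFT release that supports it:

```python
# Hypothetical sketch: "qlora" expressed as ordinary LoRA plus 4-bit loading.
# Model name and hyperparameters below are placeholders.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder base model
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,  # the 4-bit load is the "q" in qlora
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules="all-linear",  # qlora convention: target every linear layer
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
```

Framed this way, qlora is nothing beyond lora with 4-bit loading enabled and all linear layers targeted, so the warning only needs to check those two settings.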
I'll give this a go - just so I understand the first part correctly:
I should be able to do (full) finetuning with an existing adapter model as the base_model arg in the training config. In that case the adapter should be merged into its base model (i.e. using merge_and_unload), and the full finetune should then be run on the merged model.
Optionally, the full finetune can be swapped for a new LoRA/QLoRA training run: fresh LoRA layers/adapters are added to the merged model and only the new adapter weights are trained. A rough sketch of both cases follows below.
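Something like this, as a hedged PEFT sketch; the path and hyperparameters are placeholders, not proposed config names:

```python
# A rough sketch of both cases with PEFT; paths and hyperparameters are
# placeholders.
from peft import AutoPeftModelForCausalLM, LoraConfig, get_peft_model

# Case 1: base_model points at an adapter checkpoint. Load the adapter on
# top of its recorded base model, then fold the adapter weights in so a
# plain full finetune can run on the merged model.
peft_model = AutoPeftModelForCausalLM.from_pretrained("path/to/existing-adapter")
merged = peft_model.merge_and_unload()
# ... full finetune proceeds on `merged` as if it were an ordinary base model

# Case 2: instead of a full finetune, attach a fresh LoRA adapter to the
# merged model and train only the new adapter weights.
new_lora = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")
model = get_peft_model(merged, new_lora)
```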
So in the first case, a user could pass a lora model dir arg but leave the adapter setting empty, and we would simply merge and unload that adapter into the base model, nothing more.
I'm not clear on what you are asking for the second case.
🔖 Feature description
There are a few use cases that aren't cleanly handled at the moment.
Currently both can be worked around by manually merging the models beforehand, but it would be nice to handle these cases directly.
✔️ Solution
see above
❓ Alternatives
No response
📝 Additional Context
No response