Hi,
Thank you for sharing your repo.
I am trying to fine-tune a LM with MultiFiT on a custom dataset and then fine-tune the classifier for prediction. Unfortunately, I get an OOM after a few steps during training of the classifier.
I tried to first train the LM, then close the session to clear the GPU memory, and then train the classifier (loading the encoder weights, if my code is right), but it does not help: I cannot use the same batch size. Is that normal, or am I doing something wrong?
PS: bs = 256
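For clarity, an in-process equivalent of "closing the session to clean the GPU memory" would be something like the sketch below; the gc / torch.cuda calls are standard Python/PyTorch, nothing MultiFiT-specific, and learn_fwd / data_lm_fwd are the names from my code further down:

```python
# Sketch only: release the LM learner before building the classifier,
# instead of restarting the whole session.
import gc
import torch

learn_fwd.save_encoder("encoder_lm_fr_fwd")  # keep only the encoder weights on disk
del learn_fwd, data_lm_fwd                   # drop references to the LM learner and data
gc.collect()                                 # let Python actually free the objects
torch.cuda.empty_cache()                     # hand cached blocks back to the CUDA allocator
```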
```
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
in ()
      3 learn_cls_fwd.load_encoder("encoder_lm_fr_fwd")
      4 learn_cls_fwd.freeze()
----> 5 learn_cls_fwd.fit_one_cycle(3)
      6 learn_cls_fwd.save("multifit_cls_pretrained_fr")

9 frames
/usr/local/lib/python3.6/dist-packages/fastai/text/learner.py in <listcomp>(.0)
    253     def concat(self, arrs:Sequence[Sequence[Tensor]])->List[Tensor]:
    254         "Concatenate the `arrs` along the batch dimension."
--> 255         return [torch.cat([l[si] for l in arrs], dim=1) for si in range_of(arrs[0])]
    256
    257     def reset(self):

RuntimeError: CUDA out of memory. Tried to allocate 1.02 GiB (GPU 0; 15.90 GiB total capacity; 12.72 GiB already allocated; 599.88 MiB free; 14.61 GiB reserved in total by PyTorch)
```
My piece of code:
```python
# pretrained LM
if pretrained_lm:
    data_lm_fwd = (TextList.from_df(lm_tr.iloc[:10000], path, cols='comment_text', **fa_config)
                   .split_by_rand_pct(0.05, seed=42)
                   .label_for_lm()
                   .databunch(bs=bs, num_workers=4))
    data_lm_fwd.save("fr_data_lm_forward")

if pretrained_lm:
    learn_fwd = exp.finetune_lm.get_learner(data_lm_fwd)
    learn_fwd.model.cuda()
    learn_fwd.lr_find()
    learn_fwd.recorder.plot()

# learn is a preconfigured fastai learner with a pretrained model loaded
if pretrained_lm:
    learn_fwd.fit_one_cycle(2)
    learn_fwd.unfreeze()
    for i in range(5):
        learn_fwd.fit_one_cycle(2)
    learn_fwd.save_encoder("encoder_lm_fr_fwd")

# cls
if pretrained_cls:
    data_cls = (TextList.from_df(tr1, path, cols="comment_text", **fa_config)
                .split_from_df(col="val")
                .label_from_df(cols="toxic")
                .databunch(bs=64, num_workers=2))

if pretrained_cls:
    learn_cls_fwd = exp.classifier.get_learner(data_cls)  # , metrics=[AUROC])
    learn_cls_fwd.load_encoder("encoder_lm_fr_fwd")
    learn_cls_fwd.freeze()
    learn_cls_fwd.fit_one_cycle(3)
    learn_cls_fwd.save("multifit_cls_pretrained_fr")
```
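For completeness, the only other knobs I know of are generic fastai v1 memory savers rather than anything MultiFiT-specific; whether exp.classifier.get_learner exposes them is an assumption on my part, but it should at least return a regular fastai Learner, so to_fp16() ought to apply:

```python
# Sketch only: generic fastai v1 ways to shrink the classifier's memory footprint.
# Assumes exp.classifier.get_learner() returns a plain fastai Learner.
learn_cls_fwd = exp.classifier.get_learner(data_cls)
learn_cls_fwd.load_encoder("encoder_lm_fr_fwd")
learn_cls_fwd.to_fp16()   # mixed precision roughly halves activation memory
learn_cls_fwd.freeze()
learn_cls_fwd.fit_one_cycle(3)
# With the plain fastai text_classifier_learner one can also pass a smaller
# max_len, which caps how many tokens per document are kept for the concat
# pooling that OOMs in the traceback above; I am not sure the MultiFiT
# wrapper exposes that parameter.
```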