@lipingcoding Thanks for reporting the issue. I have also observed some problems and am still trying to work out what differs between the PyTorch version and the TF implementation. Sorry to say, I haven't figured it out yet. The only thing I can be sure of for now is that the original paper's hyperparameter settings can be used directly with this codebase, since I fixed a leaky attention issue by switching to PyTorch's MHA. The parameter initialization issue still needs to be looked into, but I haven't done that yet.
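For anyone following along, here is a minimal sketch of the kind of check I mean, assuming "leaky attention" refers to positions attending to later items in the sequence. It is not the code from this repo: the sizes and variable names are made up for illustration. It only shows that `torch.nn.MultiheadAttention` with a boolean `attn_mask` puts zero weight on future positions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes only; these are assumptions, not the repo's settings.
seq_len, embed_dim, num_heads = 5, 8, 2

mha = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
mha.eval()  # disable dropout so the check is deterministic

x = torch.randn(1, seq_len, embed_dim)

# Boolean mask: True marks pairs that must NOT attend.
# The strict upper triangle blocks every position from seeing later ones.
causal_mask = torch.triu(
    torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1
)

with torch.no_grad():
    out, weights = mha(x, x, x, attn_mask=causal_mask, need_weights=True)

# weights has shape (batch, target_len, source_len), averaged over heads.
# Any non-zero value here would mean information leaks from the future.
future_weight = weights[0][causal_mask].abs().max().item()
print("max attention weight on future positions:", future_weight)
```

If a hand-written attention layer gives a non-zero value in the same check, that would be one place to look when comparing against the TF implementation.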