Doubt about the attention_mask #6
Comments
Hi @caojiangxia, thx for your question! If you mean the negate operation in attention_mask: the ~ flips the lower-triangular ones into a boolean mask where True marks the positions that must not be attended to, which is the convention torch.nn.MultiheadAttention expects for attn_mask. For your other concern, I used a BoolTensor for timeline_mask for the same reason. Moreover, there are two ways of masking attention weights: multiplying the weights by a 0/1 float mask, or setting the masked scores to -inf before the softmax.
I do not like leaky attention weights that may introduce a bit of information from positions that should be ignored, so I went with the boolean masks. For more details related to your question, pls check #1. Stay healthy~
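For concreteness, here is a minimal sketch of the two masking styles contrasted above; the toy tensor names and sizes are illustrative, not taken from the repository:

```python
import torch

torch.manual_seed(0)
tl = 4                             # toy sequence length
scores = torch.randn(tl, tl)       # raw attention scores, e.g. Q·K^T / sqrt(d)

# Causal mask: True marks positions that must NOT be attended to
# (the convention torch.nn.MultiheadAttention uses for a bool attn_mask).
attention_mask = ~torch.tril(torch.ones((tl, tl), dtype=torch.bool))

# Way 1: hard masking -- set forbidden scores to -inf before softmax,
# so masked positions get exactly zero probability mass.
hard = torch.softmax(scores.masked_fill(attention_mask, float('-inf')), dim=-1)

# Way 2: "leaky" masking -- multiply the scores by a 0/1 float mask before softmax.
# Since exp(0) = 1, masked positions still receive non-zero weight.
float_mask = (~attention_mask).float()
leaky = torch.softmax(scores * float_mask, dim=-1)

print(hard[0])   # row 0 attends only to itself: tensor([1., 0., 0., 0.])
print(leaky[0])  # future positions still leak probability mass
```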
Thank you very much! (True means mask it; I was misunderstanding it.)
Thanks for your full explanation!
Honestly, in the original author's implementation, the evaluation process doesn't rank all remaining items for each user, which makes the metrics much higher than the model's practical performance.
@caojiangxia yes, the performance overestimation is a big problem; NDCG and HIT drop to a much lower score if you rank all items. Pls check https://www.kdd.org/kdd2020/accepted-papers/view/on-sampled-metrics-for-item-recommendation for a more detailed study of this issue.
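As a hedged illustration of how much the evaluation protocol matters, the toy comparison below (random scores, made-up item ids and helper names, not code from this repository) ranks one ground-truth item against 100 sampled negatives versus against the full catalog:

```python
import numpy as np

def hit_and_ndcg_at_k(rank, k=10):
    """rank = number of candidates scored higher than the ground-truth item."""
    if rank < k:
        return 1.0, 1.0 / np.log2(rank + 2)
    return 0.0, 0.0

rng = np.random.default_rng(0)
num_items = 10000
scores = rng.standard_normal(num_items)  # pretend model scores for one user
target = 42                              # hypothetical ground-truth next item

# Protocol A: rank the target against 100 sampled negatives (sampled evaluation).
pool = np.delete(np.arange(num_items), target)
negatives = rng.choice(pool, size=100, replace=False)
rank_sampled = int((scores[negatives] > scores[target]).sum())

# Protocol B: rank the target against every other item (full ranking).
rank_full = int((scores > scores[target]).sum())

print("sampled:", hit_and_ndcg_at_k(rank_sampled))  # usually looks much better...
print("full:   ", hit_and_ndcg_at_k(rank_full))     # ...than the full-ranking metric
```

For a purely random scorer, HIT@10 is roughly 10/101 under the sampled protocol but only 10/10000 under full ranking, which is exactly the kind of overestimation discussed above.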
Does that mean the experiment results reported in current sequential recommendation papers, such as SASRec and TiSASRec, are not reliable? This task needs convincing experiments.
It depends on your definition of reliable.
First:

```
timeline_mask = torch.BoolTensor(log_seqs == 0).to(self.dev)
-> timeline_mask = torch.FloatTensor(log_seqs > 0).to(self.dev)
```

Second:

```
attention_mask = ~torch.tril(torch.ones((tl, tl), dtype=torch.bool, device=self.dev))
-> attention_mask = torch.tril(torch.ones((tl, tl), dtype=torch.float, device=self.dev))
```

Why is there a ~ sign here?
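For reference, a small sketch (toy size tl = 4, outside the repository's code) of what the negation produces and how the boolean convention differs from the float-mask one:

```python
import torch

tl = 4  # toy sequence length

# PyTorch version: boolean mask where True means "do not attend".
attention_mask = ~torch.tril(torch.ones((tl, tl), dtype=torch.bool))
print(attention_mask)
# tensor([[False,  True,  True,  True],
#         [False, False,  True,  True],
#         [False, False, False,  True],
#         [False, False, False, False]])
# Passed as attn_mask to torch.nn.MultiheadAttention, every True entry is
# excluded, so position i can only attend to positions <= i (causal attention).

# TF-style float mask: 1.0 means "keep" and 0.0 means "drop" -- the opposite
# convention, which is why that form needs no negation (~).
float_mask = torch.tril(torch.ones((tl, tl), dtype=torch.float))
print(float_mask)
```

So the ~ is there because torch.nn.MultiheadAttention treats True as "mask this position out", the opposite of a multiplicative 1/0 keep-mask.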