
Why put test[u][0] into the item_index list ?? #11

Closed
BEbillionaireUSD opened this issue Feb 7, 2021 · 8 comments
Labels
documentation Improvements or additions to documentation

Comments

@BEbillionaireUSD

Hi, could you please answer my question?
In the evaluate function, the code builds an item_index list and puts test[u][0] into it.
My understanding is that test[u][0] is the item we want to predict, but this way the model knows it only has to choose among these candidates, which include the very item we want to predict.
Is this a kind of data leakage? Or did I misunderstand something?

```python
for u in users:
    if len(train[u]) < 1 or len(test[u]) < 1: continue

    # build the input sequence: the validation item plus the training history
    seq = np.zeros([args.maxlen], dtype=np.int32)
    idx = args.maxlen - 1
    seq[idx] = valid[u][0]
    idx -= 1
    for i in reversed(train[u]):
        seq[idx] = i
        idx -= 1
        if idx == -1: break

    rated = set(train[u])
    rated.add(0)
    item_idx = [test[u][0]]  ##### WHY???
    # sample 100 negative items the user has not interacted with
    for _ in range(100):
        t = np.random.randint(1, itemnum + 1)
        while t in rated: t = np.random.randint(1, itemnum + 1)
        item_idx.append(t)

    predictions = -model.predict(*[np.array(l) for l in [[u], [seq], item_idx]])
    predictions = predictions[0]  # negated so that argsort sorts descending

    rank = predictions.argsort().argsort()[0].item()
```

My understanding of this phase is: the model randomly chooses 100 candidates from all items (except those in the training sequence) and adds the item it wants to predict to the candidate set. Then it scores these 101 candidates. This logic seems strange to me.

@pmixer
Owner

pmixer commented Feb 7, 2021

@cherylLbt hi, item_index holds the negatively sampled items to be ranked by the model, so it must also contain the real next item.
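For context, here is a minimal, self-contained sketch (my own illustration, not code from this repo) of how the real next item's rank among the sampled candidates, at index 0, turns into HR@k and NDCG@k:

```python
import numpy as np

def sampled_rank_metrics(scores, k=10):
    # scores[0] is the real next item; scores[1:] are sampled negatives.
    # Double argsort of the negated scores yields each item's rank
    # in descending score order (0 = ranked first).
    rank = int((-np.asarray(scores)).argsort().argsort()[0])
    hit = 1.0 if rank < k else 0.0
    ndcg = 1.0 / np.log2(rank + 2) if rank < k else 0.0
    return rank, hit, ndcg

# toy scores: the real next item (3.2) beats every sampled negative
scores = [3.2] + list(np.random.uniform(-1.0, 1.0, size=100))
rank, hit, ndcg = sampled_rank_metrics(scores)
# rank == 0, hit == 1.0, ndcg == 1.0 for this toy input
```

If the real next item were not in `scores`, its rank would be undefined and every user would count as a miss, which is why it must be included.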

@BEbillionaireUSD
Author

Thanks for your quick reply. But what if I want to predict the next item when I don't know the real next one?

@pmixer
Owner

pmixer commented Feb 8, 2021

> Thanks for your quick reply. But what if I want to predict the next item when I don't know the real next one?

You are welcome :)

We only need to 'predict' when we do not know something. Please pay attention to the difference between model evaluation (testing: seeing what the model can do when we have sample data with known answers) and model inference (online serving: we do not know the right answer, i.e. the real next item, so we deploy a model to predict it).
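To make the evaluation-vs-inference distinction concrete, here is a hedged sketch of the inference side (names like `score_fn` are my own, not from this repo): at serving time there is no ground truth, so we score every item, mask what the user has already seen, and return the top-k.

```python
import numpy as np

def recommend_top_k(score_fn, seen, itemnum, k=10):
    # Inference/serving: no real next item is known, so every
    # item id in 1..itemnum gets a score from the model.
    scores = np.array([score_fn(i) for i in range(1, itemnum + 1)], dtype=float)
    for i in seen:
        scores[i - 1] = -np.inf  # never re-recommend already-seen items
    top = np.argsort(-scores)[:k] + 1  # convert back to 1-based item ids
    return top.tolist()

# toy scorer standing in for model.predict: the item id itself
recs = recommend_top_k(lambda i: float(i), seen={9, 10}, itemnum=10, k=3)
# with items 9 and 10 masked, the best remaining items are 8, 7, 6
```

Note there is no `test[u][0]` anywhere here: the candidate list is the full catalogue, not "100 negatives plus the answer".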

Moreover, for a recommender system, you have at least two options for model serving:

  1. rank all items every time, as discussed in https://www.kdd.org/kdd2020/accepted-papers/view/on-sampled-metrics-for-item-recommendation
  2. first recall a set of items (for example, 100), then rank the recalled items; this is a common approach in industrial recommender systems
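Option 2 can be sketched as follows; this is a toy illustration under my own assumptions (recall by a simple dot product against item embeddings, re-ranking by an arbitrary `rank_score` callable standing in for the heavier model):

```python
import numpy as np

def recall_then_rank(user_vec, item_vecs, rank_score, n_recall=100, k=10):
    # Stage 1 (recall): a cheap model, here a dot product against
    # item embeddings, shortlists n_recall candidate item indices.
    recall_scores = item_vecs @ user_vec
    candidates = np.argsort(-recall_scores)[:n_recall]
    # Stage 2 (rank): a heavier model re-scores only the shortlist.
    ranked = sorted(candidates.tolist(), key=lambda i: -rank_score(i))
    return ranked[:k]

# toy data: 5 one-dimensional item embeddings; recall keeps the top 3,
# then the ranker (which prefers small indices here) reorders them
user_vec = np.array([1.0])
item_vecs = np.arange(1.0, 6.0).reshape(5, 1)
out = recall_then_rank(user_vec, item_vecs, rank_score=lambda i: -i,
                       n_recall=3, k=2)
# recall keeps indices {4, 3, 2}; ranking by -i puts 2 first, so out == [2, 3]
```

The point of the two stages is cost: the expensive ranker only ever sees `n_recall` candidates instead of the whole catalogue.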

As for "But what if I want to predict the next item when I don't know the real next one?": again, we need to predict precisely because we do not know. If you mean model evaluation, either rank all items (as in https://github.com/pmixer/TiSASRec.debug), or keep the original setting without adding the real next item, in which case evaluation would definitely fail whenever the real next item is not included in the negative sample set.
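The "rank all items" evaluation can be sketched like this (my own minimal version, not the repo's code): the real next item is ranked against every item, typically excluding the user's already-seen training items.

```python
import numpy as np

def full_rank(scores_all, true_item, seen):
    # Full (unsampled) evaluation: rank the real next item against
    # every item id, where scores_all[i - 1] is the score of item i.
    scores = np.array(scores_all, dtype=float)
    for i in seen:
        if i != true_item:
            scores[i - 1] = -np.inf  # exclude already-seen items
    # double argsort gives each item's descending-order rank (0 = best)
    return int((-scores).argsort().argsort()[true_item - 1])

# toy scores for items 1..4; item 2 is seen, the true item is 3
rank = full_rank([0.1, 0.9, 0.5, 0.7], true_item=3, seen={2})
# item 4 (0.7) outranks item 3 (0.5), so the true item's rank is 1
```

Metrics computed this way are usually much lower than the 1-vs-100 sampled numbers, since the true item now competes against the entire catalogue.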

@pmixer pmixer added the documentation Improvements or additions to documentation label Feb 8, 2021
@BEbillionaireUSD
Author

Thanks a lot for your detailed explanation, I understand!!

@BEbillionaireUSD
Author

Hi, sorry to bother you. I trained the model for 800 epochs and ran predictions on the MovieLens-1M dataset (i.e. I masked the last item and fed the preceding sequence into the model; for item_index I put in all items, then computed the scores of all items).
The training Hit Rate is only 0.15 this way. Is that normal?

@pmixer
Owner

pmixer commented Feb 10, 2021

> Hi, sorry to bother you. I trained the model for 800 epochs and ran predictions on the MovieLens-1M dataset (i.e. I masked the last item and fed the preceding sequence into the model; for item_index I put in all items, then computed the scores of all items).
> The training Hit Rate is only 0.15 this way. Is that normal?

As expected. Check https://github.com/pmixer/TiSASRec.debug and #6 if you are interested. Sharing how to solve this issue would be more than welcome.

@BEbillionaireUSD
Author

BEbillionaireUSD commented Feb 10, 2021 via email

@BEbillionaireUSD
Copy link
Author

Thanks a lot. That fully answers my question. I hope this issue gets more attention.

@pmixer pmixer pinned this issue Nov 30, 2021