Same output when right padding #24

Open
federicosiciliano opened this issue Jun 28, 2022 · 3 comments
@federicosiciliano

I am experiencing an issue when I feed the network sequences in which the last item is replaced by padding (0).
In this case, the trained model always produces the same output, regardless of the other items present in the sequence.
Is this by any chance a known problem?

From what I understand, the emb_dropout in log2feats should make the model robust to this type of sequence.
Am I wrong?

Thank you in advance for your response

@pmixer
Owner

pmixer commented Jun 28, 2022

@siciliano-diag As I remember, item 0 is just a placeholder: its embedding is fixed to the all-zero vector and it is used to mask unused positions in the sequence. The model is causal, so masking the last position effectively masks the inputs of all previous steps, and putting this "nothing" item at the end of the sequence is not reasonable in practice. Thus, although I doubt the claim that the model is robust to this, what you observed is expected.
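
To illustrate, here is a minimal sketch (toy tensors and sizes, not the actual log2feats code) of why a padded last position yields the same final-step feature no matter what the preceding items are:

import torch

# Toy illustration: item 0's embedding is kept all-zero and padded positions
# are masked to zero, so the feature at the last timestep, which is the one
# used for next-item prediction, no longer depends on the real items.
emb = torch.nn.Embedding(10, 4, padding_idx=0)   # row 0 stays all-zero

seq_a = torch.tensor([[3, 5, 7, 0, 0]])          # right-padded sequence
seq_b = torch.tensor([[9, 2, 4, 0, 0]])          # different items, same padding

mask_a = (seq_a != 0).unsqueeze(-1).float()      # 1 for real items, 0 for padding
mask_b = (seq_b != 0).unsqueeze(-1).float()

h_a = emb(seq_a) * mask_a                        # embeddings with padding zeroed
h_b = emb(seq_b) * mask_b

print(h_a[0, -1], h_b[0, -1])                    # both all-zero: identical last-step features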

@federicosiciliano
Author

federicosiciliano commented Jun 28, 2022

Thank you very much for your response.

I understand why it doesn't make sense to have 0 at the last position, but if I need to get the model to work in these cases as well, without modifying the sample itself, would you happen to know which part of the code to modify?

Also, as for fixing the padding embedding to the zero vector, it wasn't actually working for me, because this part:

for name, param in model.named_parameters():
    try:
        torch.nn.init.xavier_normal_(param.data)
    except:
        pass  # just ignore those failed init layers

re-initializes the Embedding layer, including the embedding for item 0.
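
One minimal way to work around that, assuming the padding item is index 0 and the item embedding layer is exposed as model.item_emb (adjust the attribute name to your model), would be to pin the padding row back to zero right after the init loop:

# Sketch, run after the xavier init loop; assumes padding index 0 and an
# item embedding layer named `model.item_emb` (rename to match your model).
with torch.no_grad():
    model.item_emb.weight[0].fill_(0.0)  # restore the all-zero padding embedding

Note that constructing the layer with padding_idx=0 alone does not prevent this, since xavier_normal_ writes directly into param.data; padding_idx only keeps the gradient for that row at zero, so the row still has to be zeroed again after initialization.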

@pmixer
Owner

pmixer commented Jun 29, 2022

Hi @siciliano-diag, thanks. The earlier response was not only saying that item 0 should not occupy the last position; it was also suggesting that what you observed may simply be the expected behaviour of the model, since it is a causal model. Please dig into the details of causal models, and of the Transformer itself, for what you want to do. Here is a video by Prof. Li on the Transformer: https://www.youtube.com/watch?v=ugWDIIOHtPA, and you can also find lots of material online to help you understand and customize your model.
