require_grad is False when actor_grad is dynamic #12

Open
mrsamsami opened this issue Mar 14, 2023 · 5 comments

Comments

@mrsamsami

When running the code on DMC, actor_grad is dynamics, so loss_policy is -value_target. value_target does not depend on the actor's policy distribution, so no gradient flows from loss_policy to the actor's parameters.
Because loss_policy does not require gradients, the assertion evaluates to assert (False and True) or not True, which is False, so the assertion fails.
How can we fix it?
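
The boolean shape quoted above can be checked in isolation. This is only a minimal sketch: the thread confirms that the first operand corresponds to loss_policy requiring gradients, but the function name and the meaning of the other two operands are hypothetical and are not taken from the repository.

```python
def assertion_passes(policy_loss_has_grad: bool,
                     second_operand: bool,
                     third_operand: bool) -> bool:
    """Mirror the quoted shape: (A and B) or not C.

    Only A (loss_policy requires gradients) is named in the issue;
    B and C are hypothetical placeholders for the other two flags.
    """
    return (policy_loss_has_grad and second_operand) or not third_operand


# Case reported in this issue: loss_policy carries no gradient.
print(assertion_passes(False, True, True))    # False, so the assert fails

# Case the check appears to expect: loss_policy carries a gradient.
print(assertion_passes(True, True, True))     # True

# Case where nothing requires gradients, e.g. under no_grad.
print(assertion_passes(False, False, False))  # True
```

With the third operand True, the check can only pass when the first operand is also True, which is why a gradient-free loss_policy trips it.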

@bilkitty

I could be completely wrong, but is it possible that the assert is hard-coded for the case when actor_grad == "reinforce"?

@artemZholus

Same problem here!

@jurgisp
Owner

jurgisp commented Apr 19, 2023

You are right, it is very likely that the assertion was written assuming actor_grad=reinforce. If you simply remove it, does it work then?

To be honest, I did way less testing with actor_grad=dynamics. The functionality did work at one point and was tested with DMC, but something could have changed since then.

@bilkitty

Yes, ignoring it works. However, I'm still a little confused about how back-propagation through the dynamics works given the non-differentiable value target. Setting the entropy loss aside, can you clarify how the actor's parameters are updated in the code?

@aagha6

aagha6 commented Jul 22, 2024

If you use the reinforce policy gradient then you don't back-prop through the dynamics anymore.
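
The difference between the two gradient modes can be illustrated with a toy PyTorch example. This is a sketch under stated assumptions, not the repository's code: actor_mean, value_weight, and value_of are hypothetical stand-ins for the actor, the learned dynamics, and the critic. It shows that a loss of the form -value only carries a gradient to the actor when the action was drawn with a reparameterized sample and nothing on the path was detached.

```python
import torch
from torch.distributions import Normal

torch.manual_seed(0)

# Hypothetical stand-ins: a one-parameter "actor" and a fixed linear "value".
actor_mean = torch.nn.Parameter(torch.tensor([0.5]))
value_weight = torch.tensor([2.0])  # not trainable in this sketch


def value_of(action: torch.Tensor) -> torch.Tensor:
    # Differentiable with respect to the action it receives.
    return (value_weight * action).sum()


policy = Normal(actor_mean, torch.ones(1))

# Pathwise ("dynamics"-style) case: rsample() keeps the action attached
# to actor_mean, so -value is differentiable w.r.t. the actor.
action = policy.rsample()
loss_pathwise = -value_of(action)
print(loss_pathwise.requires_grad)  # True

# Detached case: sample() returns a tensor with no graph, so -value
# carries no gradient to the actor (the situation described in the issue).
action_detached = policy.sample()
loss_detached = -value_of(action_detached)
print(loss_detached.requires_grad)  # False

# Score-function (REINFORCE) case: the gradient comes from log_prob,
# and the value is treated as a constant weight.
weight = value_of(action_detached).detach()
loss_reinforce = -(policy.log_prob(action_detached) * weight).sum()
print(loss_reinforce.requires_grad)  # True
```

If a real training loop behaves like the detached case, one thing to check is whether the imagined actions or features are sampled without reparameterization or detached before the value target is computed.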
