Some problems when running tester.py #10
Comments
Thanks for finding this issue! The latest version of pantheonrl should fix this bug, which was caused by a new observation type that SB3 does not support by default. For reference, tester.py follows a syntax similar to trainer.py, except that there are no presets. For example, if we want to run Liar's Dice, we can train two agents with:
And then we can test these two agents with:
Thank you for handling the issue so quickly! Additionally, is it possible to use a LOAD LOAD setting in the trainer? Loading is already part of the LOAD PPO setting, but I think it would be useful to control the save-load process from the command line. Which part should be changed to implement a LOAD LOAD setting?
Oh yeah, reloading both should be possible for fine-tuning. We can essentially create a separate version of the gen_fixed function, but we need to wrap it in the appropriate Agent type (OnPolicyAgent or AdapAgent). I can add this feature soon-ish, but if you need this functionality in the meantime, you can also write a script that loads the policy you want. If you look at overcookedtraining.py in the examples folder, you can replace the
Thanks for your previous reply. I have one more question, about the logger output ("rollout/ep_len_mean", "time/fps", "train/loss", etc.) when running trainer.py. While trainer.py runs, the logger output is printed from ego.learn(). I assumed this method is defined in "algos/modular/learn.py", but even after I changed the code there (for example, self.logger.record("time/fps", fps)), the logger output did not change. Where can I control the contents of the logger output? Doesn't ego.learn() come from "algos/modular/learn.py" or "algos/adap/adap_learn"?
Great question! Based on the code you gave earlier, it seems like you are using the PPO policy from Stable-Baselines3 (SB3), so all of the logger logic comes from there. If you would like to change the logger interface, you would probably need to define a separate PPO implementation that logs the information you want. Alternatively, you could use CleanRL's implementation of PPO (https://github.com/vwxyzjn/cleanrl), which is easier to understand and cleanly defines the logging behavior. However, it is not a drop-in replacement for SB3's PPO, so you may need to do some extra work to integrate pantheonrl with this different interface.
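One quick way to confirm where a method such as learn() really comes from is to walk the object's class hierarchy and find the first class that defines it. The sketch below is only an illustration: defining_class is a hypothetical helper, and LibraryPPO and LocalLearner are stand-in classes, not pantheonrl or SB3 code. With a real agent you would pass ego instead.

```python
def defining_class(obj, method_name):
    """Return the first class in obj's MRO that defines method_name itself."""
    for cls in type(obj).__mro__:
        if method_name in vars(cls):
            return cls
    return None


# Hypothetical stand-ins (assumptions, not real library classes):
# LibraryPPO plays the role of an algorithm class from an installed package,
# LocalLearner the role of a project class that inherits learn() unchanged.
class LibraryPPO:
    def learn(self, total_timesteps):
        return self


class LocalLearner(LibraryPPO):
    pass


ego = LocalLearner()
owner = defining_class(ego, "learn")

# The module and class name show which code actually runs on ego.learn().
print(owner.__module__, owner.__qualname__)
```

If the reported class lives in an installed package rather than in the file you edited, changes to that file will not affect the output. For a class imported from a real module, inspect.getsourcefile(owner) from the standard library gives the path of the file that defines it.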
Hi, could you please provide an example of how to run tester.py?
The code works without a problem with trainer.py, but an error occurs with the tester. The error is as follows.
Traceback (most recent call last):
  File "tester.py", line 194, in <module>
    run_test(ego, env, args.total_episodes, args.render)
  File "tester.py", line 50, in run_test
    action = ego.get_action(obs, False)
  File "C:\Users\user\Desktop\pantheon\PantheonRL\pantheonrl\common\agents.py", line 72, in get_action
    actions, _, _ = action_from_policy(obs.obs, self.policy)
AttributeError: 'numpy.ndarray' object has no attribute 'obs'
This problem does not occur when using the FIXED agent in the trainer.
My torch version is 1.13.1 and my stable-baselines3 version is 1.6.2.
Thanks
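The traceback suggests that get_action expects an object with an .obs attribute but receives a bare array. A minimal sketch of that mismatch, and of one defensive way to accept both forms, is shown below. WrappedObs and raw_observation are hypothetical names used only for illustration, and a plain list stands in for the numpy array. This is not the fix that was applied in pantheonrl.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class WrappedObs:
    """Hypothetical wrapper type that carries the raw observation in `.obs`."""
    obs: Any


def raw_observation(observation):
    """Return observation.obs when present, otherwise the observation itself."""
    return getattr(observation, "obs", observation)


plain = [0.0, 1.0, 0.0]          # stands in for a bare numpy array
wrapped = WrappedObs(obs=plain)  # stands in for the wrapped observation type

# Accessing plain.obs directly would raise AttributeError, as in the traceback.
print(raw_observation(plain))
print(raw_observation(wrapped))
```

Both calls return the same underlying data, so code written this way works whether or not the caller wraps the observation.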