PPO-LunarLander / results.json
Improved the model by training for 200000 more steps
1c0a160
{"mean_reward": 253.97345039999996, "std_reward": 49.61074318828682, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2023-11-01T08:36:45.453108"}
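
The record above can be read programmatically. The following is a minimal sketch, not part of the repository: it parses the JSON and derives a conservative score as mean_reward minus std_reward. Treating that difference as the score is an assumption here (the file itself does not say how its fields are used), and the names `summarize` and `lower_bound` are hypothetical.

```python
import json
from datetime import datetime

# The evaluation record, copied verbatim from results.json.
RESULTS_JSON = (
    '{"mean_reward": 253.97345039999996, "std_reward": 49.61074318828682, '
    '"is_deterministic": true, "n_eval_episodes": 10, '
    '"eval_datetime": "2023-11-01T08:36:45.453108"}'
)


def summarize(raw: str) -> dict:
    """Parse the record and derive a conservative score (hypothetical helper)."""
    results = json.loads(raw)
    return {
        # Assumption: mean minus one standard deviation as a pessimistic score.
        "lower_bound": results["mean_reward"] - results["std_reward"],
        "n_eval_episodes": results["n_eval_episodes"],
        "is_deterministic": results["is_deterministic"],
        # The timestamp is ISO 8601 without a timezone, so this is a naive datetime.
        "evaluated_at": datetime.fromisoformat(results["eval_datetime"]),
    }


if __name__ == "__main__":
    summary = summarize(RESULTS_JSON)
    print(f"lower bound: {summary['lower_bound']:.2f}")
    print(f"episodes:    {summary['n_eval_episodes']}")
    print(f"evaluated:   {summary['evaluated_at'].date()}")
```

With only 10 evaluation episodes, the standard deviation of about 49.6 is large relative to the mean, so the derived lower bound (roughly 204) is a noisy estimate.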