strategy for training this model
#1, opened by gd1m3y
I'm curious what kind of data was used to train this model with PPO. Also, what strategy was used to decide the reward?