3v324v23's picture
批量生成函数注释
77408f7
raw
history blame
530 Bytes
"In practice, we found that a high-entropy initial state is more likely to increase the speed of training.
The entropy is calculated by:
$$H=-\sum_{k= 1}^{n_k} p(k) \cdot \log p(k), p(k)=\frac{|A_k|}{|\mathcal{A}|}$$
where $H$ is the entropy, $|A_k|$ is the number of agent nodes in $k$-th cluster, $|\mathcal{A}|$ is the total number of agents.
To ensure the Cooperation Graph initialization has higher entropy,
we will randomly generate multiple initial states,
rank by their entropy and then pick the one with maximum $H$."