the art of renaming?

#6
by J22 - opened

Comparing this v2 model with the older one, we find many variables have been renamed for no apparent reason, for example:

  • lm_head -> output
  • embed_tokens -> tok_embeddings
  • post_attention_layernorm -> ffn_norm
  • mlp -> feed_forward
  • self_attn -> attention
  • o_proj -> wo
  • gate_proj -> w1
  • down_proj -> w2
  • up_proj -> w3
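
For anyone who needs to load one checkpoint into code written for the other naming convention, a minimal key-remapping sketch could look like the following. This is a hypothetical illustration, not a supplied conversion script; it assumes a PyTorch-style `state_dict` whose keys are dotted paths, and it only renames keys — if the two checkpoints also differ in tensor layout (e.g. permuted attention weights), renaming alone will not be enough.

```python
# Mapping from the older names to the renamed v2 names, as listed above.
RENAMES = {
    "lm_head": "output",
    "embed_tokens": "tok_embeddings",
    "post_attention_layernorm": "ffn_norm",
    "mlp": "feed_forward",
    "self_attn": "attention",
    "o_proj": "wo",
    "gate_proj": "w1",
    "down_proj": "w2",
    "up_proj": "w3",
}

def remap_key(key: str) -> str:
    """Rename each dotted path component according to RENAMES."""
    return ".".join(RENAMES.get(part, part) for part in key.split("."))

def remap_state_dict(state_dict: dict) -> dict:
    """Return a copy of state_dict with every key renamed."""
    return {remap_key(k): v for k, v in state_dict.items()}
```

For example, `remap_key("model.layers.0.self_attn.o_proj.weight")` yields `"model.layers.0.attention.wo.weight"`.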

I fully respect the hard work and the kindness of sharing the model. But I still want to say that these modifications are truly pointless, and they may hurt the community by breaking compatibility with existing tooling.