---
license: cc
---

https://twitter.com/_lyraaaa_/status/1819145905972691227

model config is identical to [the stock stable_audio_2.0_vae](https://github.com/Stability-AI/stable-audio-tools/blob/main/stable_audio_tools/configs/model_configs/autoencoders/stable_audio_2_0_vae.json) included in the stable-audio-tools repo

finetuned stable audio open's vae for 100k steps to try and fix its habit of colorizing gritty sounds

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62fbf13821c444a56f7c0a01/jce-A9jcyQUB96ETgtd8N.png)

the blue and orange runs are near-identical (same seed etc.), except the orange one had the encoder and bottleneck frozen while blue was a full train. the orange model keeps the original vae's latent space, so it is instantly swappable into any stable audio open model. blue trades that compatibility for slightly higher fidelity and will require further training.

to use the blue vae, pass it to your train command with `--pretransform-ckpt-path`.

to use the orange vae, you'll need to load the stable audio open model (with its original vae), load the new vae, and then replace `model.pretransform.model` with it.
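
a rough sketch of the orange swap, not a tested recipe. it assumes stable-audio-tools is installed, that the checkpoint is an unwrapped autoencoder state dict whose keys match the stock vae config, and that `stabilityai/stable-audio-open-1.0` is the model you want to patch. the helper names and the file paths in the usage comment are hypothetical.

```python
import json


def swap_vae(model, new_vae):
    """Replace model.pretransform.model with new_vae and return the old vae."""
    if not hasattr(model, "pretransform") or not hasattr(model.pretransform, "model"):
        raise AttributeError("expected a model with a .pretransform.model attribute")
    old_vae = model.pretransform.model
    model.pretransform.model = new_vae
    return old_vae


def load_finetuned_vae(config_path, ckpt_path):
    """Build the autoencoder from the stock vae config and load finetuned weights.

    Assumption: ckpt_path is an unwrapped checkpoint. A raw training checkpoint
    with prefixed keys would need to be unwrapped first.
    """
    # Local imports so swap_vae stays usable without stable-audio-tools.
    from stable_audio_tools.models.factory import create_model_from_config
    from stable_audio_tools.models.utils import load_ckpt_state_dict

    with open(config_path) as f:
        vae_config = json.load(f)
    vae = create_model_from_config(vae_config)
    vae.load_state_dict(load_ckpt_state_dict(ckpt_path))
    return vae


def load_model_with_vae(config_path, ckpt_path,
                        name="stabilityai/stable-audio-open-1.0"):
    """Load stable audio open with its original vae, then swap in the new one."""
    from stable_audio_tools import get_pretrained_model

    model, model_config = get_pretrained_model(name)
    new_vae = load_finetuned_vae(config_path, ckpt_path)
    # Freeze the vae for inference, matching how a pretransform is normally used.
    new_vae = new_vae.eval().requires_grad_(False)
    swap_vae(model, new_vae)
    return model, model_config


# usage (paths are placeholders):
# model, model_config = load_model_with_vae("stable_audio_2_0_vae.json", "vae_orange.ckpt")
```

`swap_vae` returns the original vae so you can keep it around and A/B the two decoders on the same latents. move the swapped model to your device and dtype afterwards, the same way you would with the unmodified one.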

further instructions may be written at some point, but i highly recommend you play with the code and figure it out yourself!