ecker/vall-e

ecker committed 21 days ago

Commit 9ddb793 • 1 Parent(s): ce23edb

Upload fp32.sft

Received additional training on my 7900XTX (AMP bfloat16, SDPA):
* a number of unshuffled samples, 12 to 24 seconds long, to verify that my ROCm setup works.
* another number of shuffled samples, 3 to 32 seconds long, to re-"teach" the model to work at any duration rather than only the last duration it was trained against.
* another number of shuffled samples, 3 to 60 seconds long, with an RVQ level distribution favoring the higher levels, to see whether it lobotomizes the AR while cleaning up the NAR's audio (it seems fine).
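
The commit message does not say how the RVQ level distribution was skewed. As a rough sketch of what "favoring the higher levels" could look like, the snippet below samples a level index with weights that grow with the level. The function name `sample_rvq_level`, the 8-level count, and the `(k + 1) ** bias` weighting are all assumptions made for illustration, not the settings used for this checkpoint.

```python
import random
from collections import Counter


def sample_rvq_level(n_levels=8, bias=2.0, rng=random):
    """Pick an RVQ level in [0, n_levels), weighting higher levels more heavily."""
    # Hypothetical weighting: level k gets weight (k + 1) ** bias,
    # so with bias > 0 the last level is the most likely pick.
    weights = [(k + 1) ** bias for k in range(n_levels)]
    return rng.choices(range(n_levels), weights=weights, k=1)[0]


# Draw many levels with a fixed seed to see how the skew plays out.
rng = random.Random(0)
counts = Counter(sample_rvq_level(rng=rng) for _ in range(10_000))

for level in range(8):
    print(level, counts[level])
```

Setting `bias=0.0` gives a uniform distribution over levels, which makes it easy to compare a skewed run against an unskewed one.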

Files changed (1)

models/ckpt/ar+nar-llama-8/fp32.sft +2 -2

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2690728fd3cd1fd4ac396b1d9f30034c9713230f2ff70c6284cfca69d3df3d6d
-size 455745634
+oid sha256:2bbafd8afb5403c206c28f51ea3e872769dab8de99b5f441825ff31c893b0911
+size 455745602
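
The diff only touches a Git LFS pointer, which records the SHA-256 digest and byte size of the real `fp32.sft`. A minimal sketch for checking a downloaded copy against the new pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local path in the usage note is an assumption about where the file was saved.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    # Cheap check first: a size mismatch means there is no need to hash.
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Hash in chunks so a ~450 MB checkpoint is never fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents after this commit, copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:2bbafd8afb5403c206c28f51ea3e872769dab8de99b5f441825ff31c893b0911
size 455745602
"""
pointer = parse_lfs_pointer(POINTER)
```

Calling `matches_pointer("models/ckpt/ar+nar-llama-8/fp32.sft", pointer)` on a local download should return `True` only if the file is the version uploaded in this commit.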