Nanobit committed on
Commit
b432889
1 Parent(s): 54fe07a

feat: enable trl's autounwrap (#1060)


* feat: test trl's autounwrap

* fix: add check for adapter

* feat: add config to disable autounwrap

* chore: fix lint

.vscode/launch.json CHANGED
@@ -11,7 +11,7 @@
       "request": "launch",
       "args": [
         "-m", "axolotl.cli.train", "dev_sharegpt.yml",
-        // The flags below simplify debugging by overriding the axolotl config
+        // The flags below simplify debugging by overriding the axolotl config
         // with the debugging tips above. Modify as needed.
         "--dataset_processes=1", // limits data preprocessing to one process
         "--max_steps=1", // limits training to just one step
devtools/README.md CHANGED
@@ -1 +1 @@
- This directory contains example config files that might be useful for debugging. Please see [docs/debugging.md](../docs/debugging.md) for more information.
+ This directory contains example config files that might be useful for debugging. Please see [docs/debugging.md](../docs/debugging.md) for more information.
docs/debugging.md CHANGED
@@ -30,13 +30,13 @@ While debugging it's helpful to simplify your test scenario as much as possible.
 3. **Use a small model**: A good example of a small model is [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0).
 4. **Minimize iteration time**: Make sure the training loop finishes as fast as possible, with these settings.
     - `micro_batch_size: 1`
-    - `max_steps: 1`
+    - `max_steps: 1`
     - `val_set_size: 0`
 5. **Clear Caches:** Axolotl caches certain steps and so does the underlying HuggingFace trainer. You may want to clear some of these caches when debugging.
     - Data preprocessing: When debugging data preprocessing, which includes prompt template formation, you may want to delete the directory set in `dataset_prepared_path:` in your axolotl config. If you didn't set this value, the default is `last_run_prepared`.
     - HF Hub: If you are debugging data preprocessing, you should clear the relevant HF cache [HuggingFace cache](https://huggingface.co/docs/datasets/cache), by deleting the appropriate `~/.cache/huggingface/datasets/...` folder(s).
     - **The recommended approach is to redirect all outputs and caches to a temporary folder and delete selected subfolders before each run. This is demonstrated in the example configuration below.**
-
+

 ## Debugging with VSCode

@@ -74,7 +74,7 @@ For example, to mimic the command `cd devtools && CUDA_VISIBLE_DEVICES=0 acceler
       "request": "launch",
       "args": [
         "-m", "axolotl.cli.train", "dev_sharegpt.yml",
-        // The flags below simplify debugging by overriding the axolotl config
+        // The flags below simplify debugging by overriding the axolotl config
         // with the debugging tips above. Modify as needed.
         "--dataset_processes=1", // limits data preprocessing to one process
         "--max_steps=1", // limits training to just one step
@@ -101,7 +101,7 @@ For example, to mimic the command `cd devtools && CUDA_VISIBLE_DEVICES=0 acceler

 - The argument `justMyCode` is set to `true` such that you step through only the axolotl code. If you want to step into dependencies, set this to `false`.
 - The `preLaunchTask`: `cleanup-for-dataprep` is defined in [.vscode/tasks.json](../.vscode/tasks.json) and is used to delete the following folders before debugging, which is essential to ensure that the data pre-processing code is run from scratch:
-    - `./devtools/temp_debug/axolotl_outputs`
+    - `./devtools/temp_debug/axolotl_outputs`
     - `./devtools/temp_debug/.hf-cache/datasets`

 >[!Tip]
docs/rlhf.md CHANGED
@@ -33,3 +33,12 @@ datasets:
 ```yaml
 rl: ipo
 ```
+
+#### TRL autounwrap for PEFT
+
+TRL supports auto-unwrapping PEFT models, so a reference model does not need to be loaded separately, reducing VRAM usage. This is enabled by default. To turn it off, pass the following config:
+
+```yaml
+# load ref model when adapter training.
+rl_adapter_ref_model: true
+```
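For context on what "autounwrap" means here: when TRL's DPO trainer receives `ref_model=None` and the policy is a PEFT model, it computes reference log-probabilities by temporarily disabling the adapter instead of keeping a second full copy of the weights in memory. Below is a minimal, hedged sketch of that usage; constructor argument names (`beta`, `tokenizer`, etc.) vary between trl releases, so treat them as assumptions rather than a fixed API.

```python
# Sketch only: TRL's built-in "autounwrap" when no ref model is passed.
# Assumes trl, peft, and datasets are installed; argument names differ across trl versions.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # the small model suggested in docs/debugging.md
tokenizer = AutoTokenizer.from_pretrained(model_name)
policy = get_peft_model(
    AutoModelForCausalLM.from_pretrained(model_name),
    LoraConfig(task_type="CAUSAL_LM"),
)

pairs = Dataset.from_dict(
    {"prompt": ["Hi"], "chosen": ["Hello!"], "rejected": ["Go away."]}
)

trainer = DPOTrainer(
    model=policy,    # PEFT-wrapped policy model
    ref_model=None,  # None + PEFT model -> TRL disables the adapter for the reference pass
    beta=0.1,
    args=TrainingArguments(output_dir="tmp_dpo", per_device_train_batch_size=1, max_steps=1),
    train_dataset=pairs,
    tokenizer=tokenizer,
)
trainer.train()
```

With `rl_adapter_ref_model: true`, axolotl instead loads the base model a second time and passes it as the reference model, trading extra VRAM for an independently held reference.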
src/axolotl/train.py CHANGED
@@ -63,10 +63,15 @@ def train(
     model, peft_config = load_model(cfg, tokenizer, inference=cli_args.inference)
     model_ref = None
     if cfg.rl:
-        # load the model again for model_ref/baseline
-        model_ref, _ = load_model(
-            cfg, tokenizer, inference=cli_args.inference, reference_model=True
-        )
+        if cfg.adapter and not cfg.rl_adapter_ref_model:
+            # use built-in trl autounwrap
+            LOG.debug("Passing model_ref: None to RL trainer")
+            model_ref = None  # explicit setting to None
+        else:
+            # load the model again for model_ref/baseline
+            model_ref, _ = load_model(
+                cfg, tokenizer, inference=cli_args.inference, reference_model=True
+            )

     safe_serialization = cfg.save_safetensors is True
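To make the double negation in the new guard easier to read, here is an illustrative-only reduction of the branch; the `needs_ref_model` helper is hypothetical and not part of axolotl, it just enumerates which config combinations cause a second model to be loaded.

```python
# Hypothetical helper mirroring the guard added in train() above:
# a second reference model is loaded unless we are adapter training
# and rl_adapter_ref_model is left unset/false.
from typing import Optional


def needs_ref_model(adapter: Optional[str], rl_adapter_ref_model: bool) -> bool:
    return not (adapter and not rl_adapter_ref_model)


assert needs_ref_model("lora", False) is False  # autounwrap: model_ref stays None
assert needs_ref_model("lora", True) is True    # opt-out: load an explicit ref model
assert needs_ref_model(None, False) is True     # full finetune: always load a ref model
```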