Adding convertation to ONNX by Optimum

#9
by VKlarkV - opened

Hi. Our team is using a finetuned musicgen-sterio-small model for music generation. We are trying to convert the model to ONNX format for Triton Inference server and were using the Oprimum exporter and got these errors

(.venv) root@s8:/workspace# optimum-cli export onnx --model facebook/musicgen-stereo-small --task text-to-audio --framework pt --pad_token_id=2048 --optimize=O2 ./onnx_v2
/workspace/.venv/lib/python3.11/site-packages/torch/nn/utils/weight_norm.py:28: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
/workspace/.venv/lib/python3.11/site-packages/torch/utils/_device.py:78: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  return func(*args, **kwargs)
Using the export variant text-conditional-with-past. Available variants are:
    - text-conditional-with-past: Exports Musicgen to ONNX to generate audio samples conditioned on a text prompt (Reference: https://huggingface.co/docs/transformers/model_doc/musicgen#text-conditional-generation). This uses the decoder KV cache. The following subcomponents are exported:
                * text_encoder.onnx: corresponds to the text encoder part in https://github.com/huggingface/transformers/blob/v4.39.1/src/transformers/models/musicgen/modeling_musicgen.py#L1457.
                * encodec_decode.onnx: corresponds to the Encodec audio encoder part in https://github.com/huggingface/transformers/blob/v4.39.1/src/transformers/models/musicgen/modeling_musicgen.py#L2472-L2480.
                * decoder_model.onnx: The Musicgen decoder, without past key values input, and computing cross attention. Not required at inference (use decoder_model_merged.onnx instead).
                * decoder_with_past_model.onnx: The Musicgen decoder, with past_key_values input (KV cache filled), not computing cross attention. Not required at inference (use decoder_model_merged.onnx instead).
                * decoder_model_merged.onnx: The two previous models fused in one, to avoid duplicating weights. A boolean input `use_cache_branch` allows to select the branch to use. In the first forward pass where the KV cache is empty, dummy past key values inputs need to be passed and are ignored with use_cache_branch=False.
                * build_delay_pattern_mask.onnx: A model taking as input `input_ids`, `pad_token_id`, `max_length`, and building a delayed pattern mask to the input_ids. Implements https://github.com/huggingface/transformers/blob/v4.39.3/src/transformers/models/musicgen/modeling_musicgen.py#L1054.

***** Exporting submodel 1/5: T5EncoderModel *****
Using framework PyTorch: 2.3.1+cu121

***** Exporting submodel 2/5: EncodecModel *****
Using framework PyTorch: 2.3.1+cu121
/workspace/.venv/lib/python3.11/site-packages/optimum/exporters/onnx/model_patcher.py:934: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
  if len(audio_codes) != 1:
/workspace/.venv/lib/python3.11/site-packages/transformers/models/encodec/modeling_encodec.py:436: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  quantized_out = torch.tensor(0.0, device=codes.device)
/workspace/.venv/lib/python3.11/site-packages/transformers/models/encodec/modeling_encodec.py:437: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
  for i, indices in enumerate(codes):
Traceback (most recent call last):
  File "/workspace/.venv/bin/optimum-cli", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/optimum/commands/optimum_cli.py", line 208, in main
    service.run()
  File "/workspace/.venv/lib/python3.11/site-packages/optimum/commands/export/onnx.py", line 265, in run
    main_export(
  File "/workspace/.venv/lib/python3.11/site-packages/optimum/exporters/onnx/__main__.py", line 365, in main_export
    onnx_export_from_model(
  File "/workspace/.venv/lib/python3.11/site-packages/optimum/exporters/onnx/convert.py", line 1170, in onnx_export_from_model
    _, onnx_outputs = export_models(
                      ^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/optimum/exporters/onnx/convert.py", line 776, in export_models
    export(
  File "/workspace/.venv/lib/python3.11/site-packages/optimum/exporters/onnx/convert.py", line 881, in export
    export_output = export_pytorch(
                    ^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/optimum/exporters/onnx/convert.py", line 577, in export_pytorch
    onnx_export(
  File "/workspace/.venv/lib/python3.11/site-packages/torch/onnx/utils.py", line 516, in export
    _export(
  File "/workspace/.venv/lib/python3.11/site-packages/torch/onnx/utils.py", line 1612, in _export
    graph, params_dict, torch_out = _model_to_graph(
                                    ^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/torch/onnx/utils.py", line 1134, in _model_to_graph
    graph, params, torch_out, module = _create_jit_graph(model, args)
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/torch/onnx/utils.py", line 1010, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/torch/onnx/utils.py", line 914, in _trace_and_get_graph_from_model
    trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/torch/jit/_trace.py", line 1310, in _get_trace_graph
    outs = ONNXTracedModule(
           ^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/torch/jit/_trace.py", line 138, in forward
    graph, out = torch._C._create_graph_by_tracing(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/torch/jit/_trace.py", line 129, in wrapper
    outs.append(self.inner(*trace_inputs))
                ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1522, in _slow_forward
    result = self.forward(*input, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/optimum/exporters/onnx/model_patcher.py", line 936, in patched_forward
    audio_values = self._model._decode_frame(audio_codes[0], audio_scales)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/transformers/models/encodec/modeling_encodec.py", line 705, in _decode_frame
    embeddings = self.quantizer.decode(codes)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/transformers/models/encodec/modeling_encodec.py", line 438, in decode
    layer = self.layers[i]
            ~~~~~~~~~~~^^^
  File "/workspace/.venv/lib/python3.11/site-packages/torch/nn/modules/container.py", line 295, in __getitem__
    return self._modules[self._get_abs_string_index(idx)]
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/torch/nn/modules/container.py", line 285, in _get_abs_string_index
    raise IndexError(f'index {idx} is out of range')
IndexError: index 4 is out of range

We have checked that the original Hugging Face model is also doesn't convert by Optimum. Is there any chance that this problem will be solved?

Versions:
transformers 4.40.0
torch 2.3.1
torchaudio 2.0.2
torchvision 0.18.1
optimum 1.21.2
onnx 1.16.1
onnxruntime 1.18.1

Sign up or log in to comment