|
--- |
|
license: mit |
|
language: |
|
- ja |
|
tags: |
|
- music |
|
- audio |
|
- audio-to-audio |
|
- SFI |
|
datasets: |
|
- MUSDB18-HQ |
|
metrics: |
|
- SDR |
|
--- |
|
|
|
# Sampling-frequency-independent (SFI) Conv-TasNet trained with the MUSDB18-HQ dataset for music source separation. |
|
This model was proposed in [our IEEE/ACM Trans. ASLP paper](https://doi.org/10.1109/TASLP.2022.3203907) and works well with untrained sampling frequencies by using sampling-frequency-independent convolutional layers with the frequency domain filter design. |
|
The latent analog filter is a modulated Gaussian filter. |
|
It was trained by Tomohiko Nakamura using [the codebase](https://github.com/TomohikoNakamura/sfi_convtasnet)). |
|
This model was trained with 32 kHz-sampled data but works well with untrained sampling frequencies (e.g., 8, 16 kHz). |
|
|
|
# License |
|
MIT |
|
|
|
# Citation |
|
Please cite the following paper. |
|
``` |
|
@article{KSaito2022IEEEACMTASLP, |
|
author={Saito, Koichi and Nakamura, Tomohiko and Yatabe, Kohei and Saruwatari, Hiroshi}, |
|
journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing}, |
|
title = {Sampling-frequency-independent convolutional layer and its application to audio source separation}, |
|
year=2022, |
|
month=sep, |
|
volume=30, |
|
pages={2928--2943}, |
|
doi={10.1109/TASLP.2022.3203907}, |
|
} |
|
``` |
|
|
|
# Contents |
|
- Four trained models (seed=40,42,44,47) |
|
- Evaluation results (json files obtained with the museval library) |
|
|