---
license: mit
language:
- ja
tags:
- music
- audio
- audio-to-audio
- SFI
datasets:
- MUSDB18-HQ
metrics:
- SDR
---

# Sampling-frequency-independent (SFI) Conv-TasNet trained on the MUSDB18-HQ dataset for music source separation
This model was proposed in [our IEEE/ACM Trans. ASLP paper](https://doi.org/10.1109/TASLP.2022.3203907).
It uses sampling-frequency-independent convolutional layers based on frequency-domain filter design, which lets it handle sampling frequencies that were not seen during training.
The latent analog filter is a modulated Gaussian filter.
The model was trained by Tomohiko Nakamura using [the codebase](https://github.com/TomohikoNakamura/sfi_convtasnet).
It was trained on data sampled at 32 kHz but works well at untrained sampling frequencies (e.g., 8 and 16 kHz).
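
To illustrate the idea of frequency-domain filter design, the following is a minimal NumPy sketch, not the implementation in the codebase. It assumes a Gaussian-shaped frequency response centered at `center_hz` (a hypothetical parameterization of a modulated Gaussian filter) and fits FIR taps to it by least squares for a given sampling frequency. The function names, parameters, and the least-squares fit are assumptions made for illustration; see the paper and the codebase for the actual method.

```python
import numpy as np


def modulated_gaussian_response(freqs_hz, center_hz, bandwidth_hz):
    """Analog frequency response: Gaussian bumps at +/- center_hz (assumed form)."""
    pos = np.exp(-0.5 * ((freqs_hz - center_hz) / bandwidth_hz) ** 2)
    neg = np.exp(-0.5 * ((freqs_hz + center_hz) / bandwidth_hz) ** 2)
    return 0.5 * (pos + neg)


def design_fir(sample_rate, num_taps, center_hz, bandwidth_hz, num_freqs=256):
    """Fit FIR taps to the analog response sampled from 0 Hz to Nyquist."""
    freqs = np.linspace(0.0, sample_rate / 2, num_freqs)
    desired = modulated_gaussian_response(freqs, center_hz, bandwidth_hz)

    # Tap indices centered on zero, so a real desired response gives symmetric taps.
    n = np.arange(num_taps) - (num_taps - 1) / 2
    omega = 2 * np.pi * freqs / sample_rate  # normalized angular frequency
    dtft = np.exp(-1j * np.outer(omega, n))  # maps taps to frequency response

    # Real-valued least squares: match the real part, drive the imaginary part to zero.
    a = np.concatenate([dtft.real, dtft.imag], axis=0)
    b = np.concatenate([desired, np.zeros(num_freqs)])
    taps, *_ = np.linalg.lstsq(a, b, rcond=None)
    return taps


# The same analog filter yields different discrete-time taps per sampling frequency.
taps_by_rate = {sr: design_fir(sr, 65, 2000.0, 500.0) for sr in (8000, 16000, 32000)}
```

Because the taps are regenerated from the same analog response for each sampling frequency, the passband stays at the same position in Hz regardless of the rate of the input signal.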

# License
MIT

# Citation
Please cite the following paper.
```
@article{KSaito2022IEEEACMTASLP,
 author={Saito, Koichi and Nakamura, Tomohiko and Yatabe, Kohei and Saruwatari, Hiroshi},
 journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
 title = {Sampling-frequency-independent convolutional layer and its application to audio source separation},
 year=2022,
 month=sep,
 volume=30,
 pages={2928--2943},
 doi={10.1109/TASLP.2022.3203907},
}
```

# Contents
- Four trained models (seeds 40, 42, 44, and 47)
- Evaluation results (json files obtained with the museval library)
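
The evaluation results can be summarized with a few lines of Python. The sketch below assumes the JSON files follow the usual per-track layout written by museval (a `targets` list, each entry with a `name` and a list of `frames` holding a `metrics` dictionary); the function names and the example path are hypothetical, so check them against the actual files in this repository.

```python
import json
import math
from statistics import median


def median_sdr_per_target(result):
    """Return {target name: median SDR over frames} for one parsed museval-style result."""
    scores = {}
    for target in result["targets"]:
        sdrs = [frame["metrics"]["SDR"] for frame in target["frames"]]
        # Frames without a valid score (e.g., silent segments) may be null or NaN.
        sdrs = [v for v in sdrs if v is not None and not math.isnan(v)]
        if sdrs:
            scores[target["name"]] = median(sdrs)
    return scores


def load_result(json_path):
    """Read one result file; json_path is a placeholder for a file in this repository."""
    with open(json_path) as f:
        return json.load(f)
```

For example, `median_sdr_per_target(load_result("path/to/track.json"))` would give one SDR value per separated source for that track, under the layout assumption above.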