Demonstration of Differentiable Digital Signal Processing Mixture Model for Synthesis Parameter Extraction from Mixture of Harmonic Sounds [1]

Masaya Kawamura1, Tomohiko Nakamura1, Daichi Kitamura2, Hiroshi Saruwatari1, Yu Takahashi3, Kazunobu Kondo3

1The University of Tokyo, Tokyo, Japan
2National Institute of Technology, Kagawa College, Kagawa, Japan
3Yamaha Corporation, Shizuoka, Japan

This page accompanies the paper [1] and presents examples of signals synthesized with the proposed and conventional methods. The mixture and ground-truth signals are taken from the University of Rochester Multimodal Music Performance (URMP) dataset [2] and the PHENICX-Anechoic dataset [3, 4]. None of these audio signals were included in the training data.


[Audio examples: for each mixture, the page provides the mixture signal and, per instrument, the ground-truth signal and the signals synthesized by SISS+DDSP, SISS+Proposed, and SI-Proposed.]

- URMP dataset
  - Viola/Flute mixture: Viola, Flute
  - Flute 1/Flute 2 mixture: Flute 1, Flute 2
- PHENICX-Anechoic dataset
  - Cello/Double bass mixture: Cello, Double bass
  - Flute 1/Flute 2 mixture: Flute 1, Flute 2

References

[1] M. Kawamura, T. Nakamura, D. Kitamura, H. Saruwatari, Y. Takahashi, and K. Kondo, "Differentiable digital signal processing mixture model for synthesis parameter extraction from mixture of harmonic sounds," in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2022, to appear. arXiv preprint: http://arxiv.org/abs/2202.00200
[2] B. Li, X. Liu, K. Dinesh, Z. Duan, and G. Sharma, "Creating a multitrack classical music performance dataset for multimodal music analysis: Challenges, insights, and applications," IEEE Transactions on Multimedia, vol. 21, no. 2, pp. 522–535, 2019.
[3] M. Miron, J. J. Carabias-Orti, J. J. Bosch, E. Gómez, and J. Janer, "Score-informed source separation for multichannel orchestral recordings," Journal of Electrical and Computer Engineering, vol. 2016, 2016.
[4] J. Pätynen, V. Pulkki, and T. Lokki, "Anechoic recording system for symphony orchestra," Acta Acustica united with Acustica, vol. 94, no. 6, pp. 856–865, 2008.