The Singing Voice Conversion Challenge 2025: From Singer Identity Conversion To Singing Style Conversion

Lester Phillip Violeta¹, Xueyao Zhang², Jiatong Shi³, Yusuke Yasuda⁴, Wen-Chin Huang¹, Zhizheng Wu², Tomoki Toda¹

¹Nagoya University, Japan ²The Chinese University of Hong Kong, Shenzhen, China
³Carnegie Mellon University, USA ⁴National Institute of Informatics, Japan

Audio Samples

This page presents converted singing samples from a set of baseline and proposed systems across two singing style conversion tasks.

For more details, refer to the paper: https://arxiv.org/abs/2509.15629

System abbreviations

B1, B2, B3 — Baseline systems 1–3
S1B–S7B — Proposed systems 1–7 (best configuration)
S1A, S3A, S4A, S6A — Ablation variants of the corresponding proposed systems

Style pairs

Each cell corresponds to a source → target style conversion. Source styles: Breathy, Control, Falsetto, Mixed. Target styles: Glissando, Pharyngeal, Vibrato.
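As a quick illustration of how many cells this implies, the sketch below enumerates every source → target combination from the style names listed above. The variable names and the source-major ordering are assumptions made for illustration only; they do not describe how the page or the challenge data is actually organized.

```python
from itertools import product

# Style names as listed on this page.
SOURCE_STYLES = ["Breathy", "Control", "Falsetto", "Mixed"]
TARGET_STYLES = ["Glissando", "Pharyngeal", "Vibrato"]

# One (source, target) pair per cell; source-major ordering is an assumption.
style_pairs = list(product(SOURCE_STYLES, TARGET_STYLES))

for source, target in style_pairs:
    print(f"{source} -> {target}")
```

With four source styles and three target styles, this yields 12 style pairs.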

Source style: Breathy (singerA, utterance 0000)