Thai DiffSinger dataset for multi-speaker training.
The Printto TH dataset is specifically curated for training DiffSinger to sing in the Thai language. It is sourced from Printto Magicbeat and comprises 59 minutes and 6 seconds of Thai singing data and 6 minutes and 18 seconds of additional foreign language singing data. The dataset is fully labeled using PRINTmov's Thai phoneme system, ensuring precise and accurate phonetic representation for effective training.
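As a quick check of the figures above, the short sketch below adds the two stated durations and works out how much of the audio is Thai. The function and variable names are illustrative only and are not part of the dataset or of DiffSinger.

```python
# Durations quoted in the dataset description, as (minutes, seconds).
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds pair to whole seconds."""
    return minutes * 60 + seconds

THAI_SECONDS = to_seconds(59, 6)      # Thai singing data
FOREIGN_SECONDS = to_seconds(6, 18)   # additional foreign-language singing data

TOTAL_SECONDS = THAI_SECONDS + FOREIGN_SECONDS
total_min, total_sec = divmod(TOTAL_SECONDS, 60)
thai_share = THAI_SECONDS / TOTAL_SECONDS

print(f"Total: {total_min} min {total_sec} s")   # Total: 65 min 24 s
print(f"Thai share: {thai_share:.1%}")           # Thai share: 90.4%
```

In other words, the dataset holds a little over 65 minutes of singing in total, of which roughly nine tenths is Thai.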
Below is PRINTmov's Thai phoneme system, as used in this dataset.
The spreadsheet can also be accessed at https://www.printmov.com/thai-diffsinger.html
This section provides access to the sample dataset, primarily designed for multi-speaker training purposes.
Please strictly follow the usage guidelines and rules provided in readme.txt and usage_guidelines.txt within the dataset. Using this dataset with any other person's voice without their explicit consent and permission is strictly prohibited.
The Printto TH dataset contains fully labeled Thai singing data in NNSVS format, organized by speaker and language.
Please note that the dataset may contain some copyrighted lyrics. The primary intended use of this dataset is for educational purposes.
To make the usage policy easier to manage, the download link will be sent to the email address you provide below.
The demo below is the result of training DiffSinger with only samples from the Printto TH dataset (without parallel/multi-speaker training).
The Thai dictionary file dsdict-th.yaml is required to use the Thai DiffSinger Phonemizer in OpenUTAU. The dictionary uses the same phonemes for both long and short vowels, so users must adjust the timing manually in OpenUTAU.
You can download dsdict-th.yaml taken from Printto Magicbeat's voice library here.
A USTx file is a musical sequence created by OpenUTAU, an open-source vocal synthesizer designed as a modern successor to UTAU. These files store musical notes, dynamics, vibrato, pitch, and lyrics in the OpenUTAU Sequence Text (USTx) format.
You can find Thai USTx files on printmov.com in the Assets section. The USTx files marked with a Thai flag are specifically designed for use in OpenUTAU with the Thai phonemizer. Older Thai UST files are also available, but they were created for UTAU's older Thai VCCV system, meaning the phonemes are not the same.