The FinINTAS Corpus of Spontaneous and Read-aloud Finnish Speech
The FinINTAS corpus consists of two subcorpora:
- FinDialogue: ten spontaneous dialogues between friends, each 45-55 minutes long.
- FinRead: read-aloud speech from the same speakers.

The speakers were native Finns from the capital city region of Finland. Ten speakers were 20 to 30 years of age; the rest were 45 to 65 years of age.
The corpus includes audio files (WAV), phonetic annotation files (Praat TextGrid) and text files.
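Since each recording comes with up to three file types, it can help to check which files belong together before doing any analysis. The following is a minimal Python sketch of such a check. It assumes that the audio, annotation and text files of one recording share a base name and use the extensions .wav, .TextGrid and .txt; the corpus description does not state the naming scheme, so both the extensions and the function name group_corpus_files are hypothetical.

```python
from collections import defaultdict
from pathlib import Path

# Assumed mapping from file extension to file type. The corpus description
# only names the types (WAV, Praat TextGrid, text), not the extensions.
KINDS = {".wav": "audio", ".textgrid": "annotation", ".txt": "text"}


def group_corpus_files(root):
    """Group the files under `root` by base name (hypothetical helper).

    Returns a dict that maps each base name to a dict of
    {"audio" | "annotation" | "text": Path}. Files with other
    extensions are ignored.
    """
    groups = defaultdict(dict)
    for path in sorted(Path(root).rglob("*")):
        # Compare extensions case-insensitively (.TextGrid vs .textgrid).
        kind = KINDS.get(path.suffix.lower())
        if kind is not None and path.is_file():
            groups[path.stem][kind] = path
    return dict(groups)
```

Calling group_corpus_files on a local copy of the corpus returns one entry per base name, so a recording whose entry lacks the "annotation" key has no TextGrid file under this naming assumption.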
NB: A Russian INTAS corpus, collected in a partly similar fashion to the Finnish corpus, is freely available at http://www.speech.pu.ru/results.php. The Russian and Finnish corpora, together with the previously collected IFA corpus for Dutch, were used to compare the phonetic properties of spontaneous and read-aloud speech in the three languages during INTAS project number 00-915 (01.07.2001 – 30.06.2004, funded partly by INTAS and partly by the Academy of Finland).
The FinINTAS corpus of spontaneous and read-aloud Finnish speech will be made available at http://lat.csc.fi.