I have worked on two speech corpora: AusKidTalk and the Future Proofing Study corpus.
AusKidTalk aims to create an annotated acoustic corpus of Australian children’s speech that is large enough for developing automatic speech recognition systems for children. I work on developing semi-automatic workflows for annotating the acoustic data. For more details, see the AusKidTalk website.
The Future Proofing Study aims to prevent depression and anxiety in Australian adolescents using big data, collecting detailed demographic and mental health surveys as well as speech samples. I work on phonetic analysis of the resulting corpus. For more details, see the Future Proofing Study website.
Publications related to these projects:
Szalay, T., Stasak, B., Maston, K., Werner-Seidler, A., & Larsen, M. (2024). Exploring fundamental frequency characteristics of Australian adolescents with and without depression in The Future Proofing Study Corpus. In Proceedings of the 19th Australasian International Conference on Speech Science and Technology (pp. 262-266).
Szalay, T., Ratko, L., Shahin, M., Sirojan, T., Ballard, K., Cox, F., & Ahmed, B. (2022). A semi-automatic workflow for orthographic transcription of a novel speech corpus: A case study of AusKidTalk. In Rosey Billington (Ed.), Proceedings of the 18th Australasian International Conference on Speech Science and Technology (pp. 126-130).
Szalay, T., Shahin, M., Ahmed, B., & Ballard, K. (2022). Training forced aligners on (mis)matched data: The effect of dialect and age. In Rosey Billington (Ed.), Proceedings of the 18th Australasian International Conference on Speech Science and Technology (pp. 36-40).
Szalay, T., Shahin, M., Ahmed, B., & Ballard, K. (2022). Knowledge of accent differences can be used to predict speech recognition. In Proceedings of the 23rd INTERSPEECH Conference.