Fine-tuning an Automatic Speech Recognition Model for a Canadian Indigenous Counselling Program

Authors

  • Emmanuel Olaniyanu, University of Manitoba

Keywords:

Speech Recognition, Data Science, Data Selection, Machine Learning

Abstract

Automatic Speech Recognition (ASR) systems are programs designed to transcribe or identify spoken language. Most modern ASR systems are built on end-to-end neural networks and depend heavily on the quantity and quality of available speech training data. A lack of accented speech data can lead to poor ASR performance on niche accents and voice types. The ASR model presented in this paper is designed to work within interactive VR counselling software for Canadian Indigenous youth, with an elder. This paper outlines the use of fine-tuning and other data processing techniques to minimize the Word Error Rate of our ASR model. These techniques provide valuable insight into data selection and processing.
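
For readers unfamiliar with the metric named above, Word Error Rate is conventionally defined as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the ASR output, divided by the number of words in the reference. The sketch below is illustrative only and is not the paper's evaluation code; the function name `word_error_rate` and the lowercase, whitespace-split normalization are assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    # Assumed normalization for this sketch: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (standard Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from output
            insertion = curr[j - 1] + 1   # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))
```

Because insertions count as errors, the value can exceed 1.0 when the output contains many more words than the reference.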

Published

2024-06-26

How to Cite

[1]
E. Olaniyanu, “Fine-tuning an Automatic Speech Recognition Model for a Canadian Indigenous Counselling Program”, CMBES Proc., vol. 46, Jun. 2024.

Section

Academic