Automatic Speech Feature Extraction for Cognitive Load Classification
Abstract
Examining the simultaneous performance of motor and cognitive tasks, or dual-tasking, has received increasing interest as a means to probe attentional impairments related to neurological disease (e.g., dementia, parkinsonism). Current dual-task methodologies are limited in their ability to measure the level of attentional load associated with the cognitive task, which restricts the sensitivity of these paradigms. The purpose of the current study is to develop and validate a tool to measure cognitive load from features of speech (i.e., speech rate, pause rate, pause percentage). Specifically, two goals are addressed: 1) develop a method of automatically extracting speech rate, and 2) demonstrate that a combination of automated speech features accurately classifies cognitive load.
Speech samples were collected from 10 undergraduate students counting by 1's, 3's and 7's to simulate easy, intermediate, and difficult cognitive tasks, respectively. Speech was transcribed using an open-source speech recognition engine with a restricted grammar set (i.e., numbers 1 to 100). Speech rate (syllables/min), pause rate (number of pauses/min), and pause percentage (duration of pauses/sample duration) were calculated from the speech recognition output. Compared with speech rate calculated from a manually transcribed count, the automated algorithm produced a mean absolute error rate of 13%. When used to classify cognitive load (i.e., easy, intermediate, difficult), the automatically extracted features yielded a classification accuracy of 82.2%, compared with 85.5% using the manual count. We conclude that automated extraction of speech rate, pause rate, and pause percentage can efficiently indicate levels of cognitive load.
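As an illustration of the three features defined above, the following is a minimal sketch and not the study's implementation. It assumes the recognizer output is available as a list of (word, start, end) tuples with times in seconds; that input format, the function names, the 0.25 s minimum pause duration, and the per-sample percentage form of the error measure are all assumptions made for the example and are not taken from the abstract.

```python
from typing import Dict, List, Tuple

# Hypothetical recognizer output: (word, start_sec, end_sec).
Word = Tuple[str, float, float]

# Syllable counts for the English number words needed to count from 1 to 100.
SYLLABLES: Dict[str, int] = {
    "one": 1, "two": 1, "three": 1, "four": 1, "five": 1, "six": 1,
    "seven": 2, "eight": 1, "nine": 1, "ten": 1, "eleven": 3, "twelve": 1,
    "thirteen": 2, "fourteen": 2, "fifteen": 2, "sixteen": 2,
    "seventeen": 3, "eighteen": 2, "nineteen": 2, "twenty": 2,
    "thirty": 2, "forty": 2, "fifty": 2, "sixty": 2, "seventy": 3,
    "eighty": 2, "ninety": 2, "hundred": 2,
}


def speech_features(words: List[Word], sample_duration_s: float,
                    min_pause_s: float = 0.25) -> Dict[str, float]:
    """Compute speech rate, pause rate, and pause percentage for one sample.

    min_pause_s is an assumed threshold: shorter gaps between consecutive
    words are not counted as pauses.
    """
    syllables = sum(SYLLABLES[w.lower()] for w, _, _ in words)
    # Silent gap between the end of each word and the start of the next.
    gaps = [nxt[1] - prev[2] for prev, nxt in zip(words, words[1:])]
    pauses = [g for g in gaps if g >= min_pause_s]
    minutes = sample_duration_s / 60.0
    return {
        "speech_rate": syllables / minutes,      # syllables/min
        "pause_rate": len(pauses) / minutes,     # pauses/min
        "pause_percentage": 100.0 * sum(pauses) / sample_duration_s,
    }


def mean_absolute_percent_error(auto: List[float],
                                manual: List[float]) -> float:
    """Assumed error measure: mean of |auto - manual| / manual, in percent."""
    errors = [abs(a - m) / m for a, m in zip(auto, manual)]
    return 100.0 * sum(errors) / len(errors)


# Toy sample, invented for illustration only.
sample = [("one", 0.0, 0.4), ("two", 0.5, 0.9),
          ("three", 1.9, 2.3), ("seven", 3.0, 3.5)]
features = speech_features(sample, sample_duration_s=6.0)
```

The three values in `features` could then serve as the input vector to a three-class classifier (easy, intermediate, difficult); the abstract does not state which classifier was used, so none is sketched here.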