Temporal constraints on human and artificial multi-sensory speech recognition

Perlette, Christopher S.; University of Lethbridge. Faculty of Arts and Science

Files

PERLETTE_CHRISTOPHER_MSC_2023.pdf(775.61 KB)

Date

2023

Authors

Perlette, Christopher S.

University of Lethbridge. Faculty of Arts and Science

Publisher

Lethbridge, Alta. : University of Lethbridge, Dept. of Neuroscience

Abstract

Audio-visual speech recognition (AVSR) is the process of perceiving and understanding speech using both audio and visual information. Combining visual information with auditory stimuli has been shown to improve performance over purely auditory speech recognition when the task is performed in adverse conditions with large amounts of distracting noise. This work examines the relationship between auditory and visual speech information and the effect that audio-visual temporal desynchronization has on AVSR performance. Using a whole-report task, we show that (1) consistent with prior similar work, performance declines asymmetrically depending on the direction and magnitude of the temporal lag, and (2) a common, modern architecture for computational AVSR does not show this asymmetry, indicating a fundamental difference between biological and computational AVSR methods.

Keywords

machine learning, speech recognition, behavioural, neuroscience, Speech processing systems--Research, Speech perception--Research, Visual perception--Research, Lipreading, Lipreading--Computer simulation, Machine learning, Neurosciences, Dissertations, Academic

URI

https://hdl.handle.net/10133/6465

Collections

Arts and Science, Faculty of
University of Lethbridge Theses
