Temporal constraints on human and artificial multi-sensory speech recognition

Publisher

Lethbridge, Alta. : University of Lethbridge, Dept. of Neuroscience

Abstract

Audio-visual speech recognition (AVSR) is the process of perceiving and understanding speech using both audio and visual information. Adding visual information to the auditory signal has been shown to improve performance over purely auditory speech recognition when the task is performed in adverse conditions with large amounts of distracting noise. This work examines the relationship between auditory and visual speech information and the effect that audio-visual temporal desynchronization has on AVSR performance. Using a whole report task, we show that (1) consistent with prior similar work, performance declines asymmetrically depending on the direction and magnitude of the temporal lag, and (2) a common, modern architecture for computational AVSR does not show this asymmetry, indicating a fundamental difference between biological and computational AVSR methods.
