Biologically-inspired auditory artificial intelligence for speech recognition in multi-talker environments
dc.contributor.author | Grasse, Lukas Walter Neufeld | |
dc.contributor.supervisor | Tata, Matthew S. | |
dc.contributor.supervisor | Luczak, Artur | |
dc.date.accessioned | 2020-12-23T22:23:28Z | |
dc.date.available | 2020-12-23T22:23:28Z | |
dc.date.issued | 2020 | |
dc.degree.level | Masters | en_US |
dc.description.abstract | Understanding speech in the presence of distracting talkers is a difficult computational problem known as the cocktail party problem. Motivated by auditory processing in the human brain, this thesis developed a neural network that isolates the speech of a single talker from binaural input containing a target talker and multiple distractors. The network is called the Binaural Speaker Isolation FFTNet, or BSINet for short. To compare BSINet with human participants on recognizing the target talker's speech amid a varying number of distractors, a "cocktail party" dataset was designed and made available online. Evaluated with the word error rate (WER) metric, BSINet performs comparably to the human participants. BSINet is thus a significant advance toward solving the challenging cocktail party problem. | en_US |
dc.description.sponsorship | The research was funded by an NSERC Canada Discovery Grant, a Government of Alberta Centre for Autonomous Systems in Strengthening Future Communities grant, a MITACS Globalink Award, an NSERC CGS-M Award, and an AITF Graduate Student Scholarship. | en_US |
dc.identifier.uri | https://hdl.handle.net/10133/5815 | |
dc.language.iso | en_US | en_US |
dc.proquest.subject | 0317 | en_US |
dc.proquest.subject | 0800 | en_US |
dc.proquest.subject | 0984 | en_US |
dc.proquestyes | Yes | en_US |
dc.publisher | Lethbridge, Alta. : University of Lethbridge, Dept. of Neuroscience | en_US |
dc.publisher.department | Department of Neuroscience | en_US |
dc.publisher.faculty | Arts and Science | en_US |
dc.relation.ispartofseries | Thesis (University of Lethbridge. Faculty of Arts and Science) | en_US |
dc.subject | Speech Recognition | en_US |
dc.subject | Denoising | en_US |
dc.subject | Speaker Isolation | en_US |
dc.subject | Cocktail Party Problem | en_US |
dc.subject | Auditory selective attention | en_US |
dc.subject | Neural networks (Computer science) | en_US |
dc.subject | Speech perception | en_US |
dc.subject | Automatic speech recognition | en_US |
dc.subject | Directional hearing | en_US |
dc.subject | Auditory perception | en_US |
dc.subject | Dissertations, Academic | en_US |
dc.title | Biologically-inspired auditory artificial intelligence for speech recognition in multi-talker environments | en_US |
dc.type | Thesis | en_US |