Reviews

The literature-review database. Every paper Bob has reviewed (he has read many more), with a short summary, key findings, and tags. Browse, filter, search.

Search results

Deaf, Hard of Hearing, and Hearing Perspectives on Using Automatic Speech Recognition in Conversation
Abraham Glasser, Kesavan Kushalnagar, Raja Kushalnagar · 2017 · Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS)
This experience report describes the real-world accessibility challenges encountered by five participants — two deaf, one hard of hearing, and two hearing — including the authors, when using the top seven most popular ASR applications (DEAFCOM, Dragon Dictation, Siri, Virtual…
automatic speech recognition · deaf and hard of hearing · speech recognition · communication accessibility · voice interface
VocalIDE: An IDE for Programming via Speech Recognition
Lucas Rosenblatt · 2017 · Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '17)
This student research paper addresses the underrepresentation of people with upper-limb physical impairments in the developer community — while 6.7% of Americans have upper-limb impairments, less than 4% of developers report any physical disability. The author argues that…
speech recognition · motor disability · programming accessibility · voice interface · upper-limb impairment
Leveraging Complementary Contributions of Different Workers for Efficient Crowdsourcing of Video Captions
Yun Huang, Yifeng Huang, Na Xue, Jeffrey P. Bigham · 2017 · CHI Conference on Human Factors in Computing Systems
This paper presents BandCaption, a crowdsourcing system that combines automatic speech recognition (ASR) with input from diverse crowd workers to efficiently correct video captions. The key insight is that different groups of people — hearing-impaired users, second-language…
captioning · crowdsourcing · video accessibility · speech recognition · deaf and hard of hearing
Scribe: Deep Integration of Human and Machine Intelligence to Caption Speech in Real Time
Walter S. Lasecki, Christopher D. Miller, Iftekhar Naim, Raja Kushalnagar, Adam Sadilek, Daniel Gildea, Jeffrey P. Bigham · 2017 · Communications of the ACM
Scribe is a system that provides on-demand, real-time captioning of live speech for deaf and hard of hearing (DHH) people by combining groups of non-expert human captionists with machine intelligence. The system addresses a critical accessibility gap: professional CART…
real-time captioning · deaf and hard of hearing · crowdsourcing · human computation · speech recognition
WebReader: a screen reader for everyone, everywhere
Aurelio De Rosa, Donovan Justice · 2016 · Proceedings of the 13th International Web for All Conference (W4A)
This extended abstract presents WebReader, a free and open source JavaScript library that implements a subset of screen reader features directly within web pages, requiring no software installation beyond a web browser. The project addresses two key limitations of traditional…
screen readers · web accessibility · JavaScript · Web Speech API · open source
The Effects of Automatic Speech Recognition Quality on Human Transcription Latency
Yashesh Gaur, Walter S. Lasecki, Florian Metze, Jeffrey P. Bigham · 2016 · Proceedings of the 13th International Web for All Conference (W4A)
This paper from Carnegie Mellon University and the University of Michigan empirically investigates when automatic speech recognition (ASR) output helps or hinders human transcriptionists producing captions for deaf and hard of hearing people. Manual transcription remains…
speech recognition · captioning · deaf and hard of hearing · crowdsourcing · human computation
Evaluation of Real-time Captioning by Machine Recognition with Human Support
Hironobu Takagi, Takashi Itoh, Kaoru Shinkawa · 2015 · Proceedings of the 12th International Web for All Conference (W4A)
This paper from IBM Research Tokyo investigates a hybrid approach to real-time captioning that combines Automated Speech Recognition (ASR) with human correction to make workplace meetings accessible for deaf and hard of hearing (DHH) employees. Professional stenography services…
real-time captioning · deaf and hard of hearing · automated speech recognition · workplace accessibility · Japanese
Capti-Speak: A Speech-Enabled Web Screen Reader
Vikas Ashok, Yevgen Borodin, Yury Puzis, I. V. Ramakrishnan · 2015 · Proceedings of the 12th International Web for All Conference (W4A)
This paper presents Capti-Speak, a speech-augmented screen reader for web browsing that allows blind users to combine natural language voice commands with traditional keyboard shortcuts. Built as an extension to the Capti Narrator screen reader, Capti-Speak addresses a…
screen readers · speech recognition · voice interface · web accessibility · blind users
The Implementation of a Vocabulary and Grammar for an Open-Source Speech-Recognition Programming Platform
Jean K. Rodriguez-Cartagena, Andrea C. Claudio-Palacios, Natalia Pacheco-Tallaj, Valerie Santiago González, Patricia Ordonez-Franco · 2015 · ASSETS '15: Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility
This paper presents the development of a standardized vocabulary and grammar for voice-based programming, designed to make coding accessible to people with limited hand mobility or visual impairments. The work is part of the larger Kavita Project, which aims to create an…
speech recognition · voice input · programming · motor impairment · repetitive strain injury
Speech Interaction with Personal Assistive Robots Supporting Aging at Home for Individuals with Alzheimer's Disease
Frank Rudzicz, Rosalie Wang, Momotaz Begum, Alex Mihailidis · 2015 · ACM Transactions on Accessible Computing (TACCESS)
This study examines speech-based interaction between older adults with Alzheimer's disease (AD) and a mobile assistive robot called ED, designed to help with activities of daily living. The research addresses a critical healthcare challenge: many nations lack capacity to support…
Alzheimer's disease · dementia · assistive robotics · speech recognition · aging in place
Perspectives on Speech and Language Interaction for Daily Assistive Technology: Introduction to Part 1 of the Special Issue
Heidi Christensen, Frank Rudzicz, François Portet, Jan Alexandersson · 2015 · ACM Transactions on Accessible Computing (TACCESS)
This editorial introduces the first part of a TACCESS special issue on speech and language interaction for daily assistive technology, emerging from the 2013 SLPAT (Speech and Language Processing for Assistive Technologies) workshop. The editors frame speech and natural language…
speech recognition · disordered speech · dysarthria · speech intelligibility · assistive technology
Evaluation of a Context-Aware Voice Interface for Ambient Assisted Living: Qualitative User Study vs. Quantitative System Evaluation
Michel Vacher, Sybille Caffiau, François Portet, Brigitte Meillon, Camille Roux, Elena Eluj, Benjamin Lecouteux, Pedro Chahuara · 2015 · ACM Transactions on Accessible Computing (TACCESS)
This study evaluates Sweet-Home, a voice-controlled smart home system designed to help older adults and people with visual impairments maintain independence at home. The research was conducted in a realistic 30-square-meter smart apartment (Domus) equipped with 150 sensors, 7…
ambient assisted living · voice interface · smart home · aging in place · visual impairment
Automatic Detection of Phone-Based Anomalies in Dysarthric Speech
Imed Laaridh, Corinne Fredouille, Christine Meunier · 2015 · ACM Transactions on Accessible Computing (TACCESS)
This research develops automatic methods to detect and localize acoustic anomalies in speech produced by people with dysarthria, a motor speech disorder caused by neurological damage affecting the respiratory, phonatory, resonatory, articulatory, or prosodic components of…
dysarthria · speech recognition · automatic speech processing · motor speech disorders · clinical assessment
Intelligibility Assessment and Speech Recognizer Word Accuracy Rate Prediction for Dysarthric Speakers in a Factor Analysis Subspace
David Martínez, Phil Green, Heidi Christensen · 2015 · ACM Transactions on Accessible Computing (TACCESS)
This paper presents a novel approach to assessing speech intelligibility and predicting automatic speech recognition (ASR) accuracy for speakers with dysarthria using iVectors, a technique from speaker verification research. The authors address a critical challenge in assistive…
dysarthric speech · speech recognition · intelligibility assessment · iVectors · factor analysis
JustSpeak: enabling universal voice control on Android
Yu Zhong, T. V. Raman, Casey Burkhardt, Fadi Biadsy, Jeffrey P. Bigham · 2014 · Proceedings of the 11th Web for All Conference (W4A)
This paper introduces JustSpeak, a universal voice control system for Android that works across all applications without requiring any developer intervention. Unlike Google Now or Siri, which only support pre-defined commands for specific apps, JustSpeak dynamically constructs…
voice interface · mobile accessibility · blindness · motor accessibility · Android
Automatically Identifying Trouble-Indicating Speech Behaviors in Alzheimer's Disease
Frank Rudzicz, Leila Chan Currie, Andrew Danks, Tejas Mehta, Shunan Zhao · 2014 · ASSETS '14: Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility
This paper addresses the challenge of automatically detecting communication breakdowns in conversations with people who have Alzheimer's disease (AD). AD is a progressive neurodegenerative disease that deteriorates memory, executive capacity, visual-spatial reasoning, and…
Alzheimer's disease · dementia · speech recognition · natural language processing · machine learning
Capti-Speak: A Speech-Enabled Accessible Web Interface
Vikas Ashok · 2014 · Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility (ASSETS)
This paper presents Capti-Speak, a speech-augmented screen reader interface for the web that allows blind users to issue voice commands alongside traditional keyboard shortcuts. Built on top of the Capti web browsing application (which provides a JAWS-like screen reader…
screen readers · speech recognition · voice interface · web accessibility · blindness
Improving Programming Interfaces for People with Limited Mobility Using Voice Recognition
Xiomara Figueroa Fontánez, Patricia Ordóñez · 2014 · Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility (ASSETS)
This paper describes an effort to make programming more accessible to people with motor impairments by integrating voice recognition into an Integrated Development Environment (IDE). The work is motivated by the specific case of a computer scientist with spinal muscular atrophy…
programming accessibility · voice interface · speech recognition · motor disability · spinal muscular atrophy
Speech Dasher: A Demonstration of Text Input using Speech and Approximate Pointing
Keith Vertanen, David J.C. MacKay · 2014 · Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility (ASSETS 2014)
This paper demonstrates Speech Dasher, a multimodal text entry system that combines speech recognition with the Dasher zooming interface to enable fast, corrected text input using only voice and gaze direction. The core problem addressed is that while speech dictation is fast,…
speech recognition · eye tracking · gaze input · text entry · error correction
Exploring the Use of Speech Input by Blind People on Mobile Devices
Shiri Azenkot, Nicole B. Lee · 2013 · Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS)
This paper investigates how blind people use speech input on mobile devices through two studies: a survey of 169 participants (64 blind/low-vision, 105 sighted) and a laboratory study with 8 blind participants composing paragraphs on an iPod Touch using speech dictation versus…
visual impairment · blindness · speech input · speech recognition · mobile accessibility
Architecture of an Automated Therapy Tool for Childhood Apraxia of Speech
Avinash Parnandi, Virendra Karappa, Youngpyo Son, Mostafa Shahin, Jacqueline McKechnie, Kirrie Ballard, Beena Ahmed, Ricardo Gutierrez-Osuna · 2013 · Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS)
This paper presents a multi-tier client-server system for remotely administering speech therapy to children with childhood apraxia of speech (CAS), a neurological speech sound disorder that impairs the precision and consistency of oro-motor planning and execution needed for…
childhood apraxia of speech · speech therapy · speech sound disorder · automated speech analysis · tele-rehabilitation
Enhancing Learning Accessibility through Fully Automatic Captioning
Maria Federico, Marco Furini · 2012 · Proceedings of the International Cross-Disciplinary Conference on Web Accessibility (W4A)
This paper proposes an architecture for automatically generating synchronized captions for video lectures using off-the-shelf automatic speech recognition (ASR) software, aimed at making educational content accessible to hearing impaired students, dyslexic students, ESL (English…
captioning · speech recognition · education accessibility · deaf and hard of hearing · automatic speech recognition
A Readability Evaluation of Real-Time Crowd Captions in the Classroom
Raja S. Kushalnagar, Walter S. Lasecki, Jeffrey P. Bigham · 2012 · Proceedings of the 14th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2012)
This paper evaluates the readability of real-time captions produced by three different approaches in a higher education classroom setting: professional CART (Communication Access Realtime Translation) captionists, automatic speech recognition (ASR), and a novel crowd captioning…
real-time captioning · deaf and hard of hearing · crowdsourcing · classroom accessibility · higher education
Real-Time Captioning by Groups of Non-Experts
Walter Lasecki, Christopher Miller, Adam Sadilek, Andrew Abumoussa, Donato Borrello, Raja Kushalnagar, Jeffrey Bigham · 2012 · UIST '12: Proceedings of the 25th Annual ACM Symposium on User Interface Software and Technology
This paper presents Legion:Scribe, an end-to-end system that enables groups of non-expert typists to collectively produce real-time captions for deaf and hard of hearing (DHH) people, offering a cheaper and more available alternative to professional stenographers (CART, costing…
real-time captioning · crowdsourcing · deaf and hard of hearing · human computation · text alignment
Crowdsourcing Correction of Speech Recognition Captioning Errors
M. Wald · 2011 · Proceedings of the International Cross-Disciplinary Conference on Web Accessibility (W4A)
This paper describes tools built around Synote, an award-winning web-based application from the University of Southampton, that enable crowdsourced correction of automatic speech recognition (ASR) captioning errors to make video content accessible at scale. The author frames the…
captioning · speech recognition · crowdsourcing · deaf and hard of hearing · video accessibility
