Reviews

The literature-review database. Every paper Bob has reviewed (he has read many more), with a short summary, key findings, and tags. Browse, filter, search.

Search results

  • Deaf, Hard of Hearing, and Hearing Perspectives on Using Automatic Speech Recognition in Conversation

    Abraham Glasser, Kesavan Kushalnagar, Raja Kushalnagar · 2017 · Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS)

    This experience report describes the real-world accessibility challenges encountered by five participants (two deaf, one hard of hearing, and two hearing), including the authors, when using the seven most popular ASR applications (DEAFCOM, Dragon Dictation, Siri, Virtual…

    automatic speech recognition · deaf and hard of hearing · speech recognition · communication accessibility · voice interface

  • VocalIDE: An IDE for Programming via Speech Recognition

    Lucas Rosenblatt · 2017 · Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '17)

    This student research paper addresses the underrepresentation of people with upper-limb physical impairments in the developer community: while 6.7% of Americans have upper-limb impairments, fewer than 4% of developers report any physical disability. The author argues that…

    speech recognition · motor disability · programming accessibility · voice interface · upper-limb impairment

  • Leveraging Complementary Contributions of Different Workers for Efficient Crowdsourcing of Video Captions

    Yun Huang, Yifeng Huang, Na Xue, Jeffrey P. Bigham · 2017 · CHI Conference on Human Factors in Computing Systems

    This paper presents BandCaption, a crowdsourcing system that combines automatic speech recognition (ASR) with input from diverse crowd workers to efficiently correct video captions. The key insight is that different groups of people — hearing-impaired users, second-language…

    captioning · crowdsourcing · video accessibility · speech recognition · deaf and hard of hearing

  • Scribe: Deep Integration of Human and Machine Intelligence to Caption Speech in Real Time

    Walter S. Lasecki, Christopher D. Miller, Iftekhar Naim, Raja Kushalnagar, Adam Sadilek, Daniel Gildea, Jeffrey P. Bigham · 2017 · Communications of the ACM

    Scribe is a system that provides on-demand, real-time captioning of live speech for deaf and hard of hearing (DHH) people by combining groups of non-expert human captionists with machine intelligence. The system addresses a critical accessibility gap: professional CART…

    real-time captioning · deaf and hard of hearing · crowdsourcing · human computation · speech recognition

  • WebReader: a screen reader for everyone, everywhere

    Aurelio De Rosa, Donovan Justice · 2016 · Proceedings of the 13th International Web for All Conference (W4A)

    This extended abstract presents WebReader, a free and open source JavaScript library that implements a subset of screen reader features directly within web pages, requiring no software installation beyond a web browser. The project addresses two key limitations of traditional…

    screen readers · web accessibility · JavaScript · Web Speech API · open source

  • The Effects of Automatic Speech Recognition Quality on Human Transcription Latency

    Yashesh Gaur, Walter S. Lasecki, Florian Metze, Jeffrey P. Bigham · 2016 · Proceedings of the 13th International Web for All Conference (W4A)

    This paper from Carnegie Mellon University and the University of Michigan empirically investigates when automatic speech recognition (ASR) output helps or hinders human transcriptionists producing captions for deaf and hard of hearing people. Manual transcription remains…

    speech recognition · captioning · deaf and hard of hearing · crowdsourcing · human computation

  • Evaluation of Real-time Captioning by Machine Recognition with Human Support

    Hironobu Takagi, Takashi Itoh, Kaoru Shinkawa · 2015 · Proceedings of the 12th International Web for All Conference (W4A)

    This paper from IBM Research Tokyo investigates a hybrid approach to real-time captioning that combines Automated Speech Recognition (ASR) with human correction to make workplace meetings accessible for deaf and hard of hearing (DHH) employees. Professional stenography services…

    real-time captioning · deaf and hard of hearing · automated speech recognition · workplace accessibility · Japanese

  • Capti-Speak: A Speech-Enabled Web Screen Reader

    Vikas Ashok, Yevgen Borodin, Yury Puzis, I. V. Ramakrishnan · 2015 · Proceedings of the 12th International Web for All Conference (W4A)

    This paper presents Capti-Speak, a speech-augmented screen reader for web browsing that allows blind users to combine natural language voice commands with traditional keyboard shortcuts. Built as an extension to the Capti Narrator screen reader, Capti-Speak addresses a…

    screen readers · speech recognition · voice interface · web accessibility · blind users

  • The Implementation of a Vocabulary and Grammar for an Open-Source Speech-Recognition Programming Platform

    Jean K. Rodriguez-Cartagena, Andrea C. Claudio-Palacios, Natalia Pacheco-Tallaj, Valerie Santiago González, Patricia Ordonez-Franco · 2015 · ASSETS '15: Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility

    This paper presents the development of a standardized vocabulary and grammar for voice-based programming, designed to make coding accessible to people with limited hand mobility or visual impairments. The work is part of the larger Kavita Project, which aims to create an…

    speech recognition · voice input · programming · motor impairment · repetitive strain injury

  • Speech Interaction with Personal Assistive Robots Supporting Aging at Home for Individuals with Alzheimer's Disease

    Frank Rudzicz, Rosalie Wang, Momotaz Begum, Alex Mihailidis · 2015 · ACM Transactions on Accessible Computing (TACCESS)

    This study examines speech-based interaction between older adults with Alzheimer's disease (AD) and a mobile assistive robot called ED, designed to help with activities of daily living. The research addresses a critical healthcare challenge: many nations lack capacity to support…

    Alzheimer's disease · dementia · assistive robotics · speech recognition · aging in place

  • Perspectives on Speech and Language Interaction for Daily Assistive Technology: Introduction to Part 1 of the Special Issue

    Heidi Christensen, Frank Rudzicz, François Portet, Jan Alexandersson · 2015 · ACM Transactions on Accessible Computing (TACCESS)

    This editorial introduces the first part of a TACCESS special issue on speech and language interaction for daily assistive technology, emerging from the 2013 SLPAT (Speech and Language Processing for Assistive Technologies) workshop. The editors frame speech and natural language…

    speech recognition · disordered speech · dysarthria · speech intelligibility · assistive technology

  • Evaluation of a Context-Aware Voice Interface for Ambient Assisted Living: Qualitative User Study vs. Quantitative System Evaluation

    Michel Vacher, Sybille Caffiau, François Portet, Brigitte Meillon, Camille Roux, Elena Eluj, Benjamin Lecouteux, Pedro Chahuara · 2015 · ACM Transactions on Accessible Computing (TACCESS)

    This study evaluates Sweet-Home, a voice-controlled smart home system designed to help older adults and people with visual impairments maintain independence at home. The research was conducted in a realistic 30-square-meter smart apartment (Domus) equipped with 150 sensors, 7…

    ambient assisted living · voice interface · smart home · aging in place · visual impairment

  • Automatic Detection of Phone-Based Anomalies in Dysarthric Speech

    Imed Laaridh, Corinne Fredouille, Christine Meunier · 2015 · ACM Transactions on Accessible Computing (TACCESS)

    This research develops automatic methods to detect and localize acoustic anomalies in speech produced by people with dysarthria, a motor speech disorder caused by neurological damage affecting the respiratory, phonatory, resonatory, articulatory, or prosodic components of…

    dysarthria · speech recognition · automatic speech processing · motor speech disorders · clinical assessment

  • Intelligibility Assessment and Speech Recognizer Word Accuracy Rate Prediction for Dysarthric Speakers in a Factor Analysis Subspace

    David Martínez, Phil Green, Heidi Christensen · 2015 · ACM Transactions on Accessible Computing

    This paper presents a novel approach to assessing speech intelligibility and predicting automatic speech recognition (ASR) accuracy for speakers with dysarthria using iVectors, a technique from speaker verification research. The authors address a critical challenge in assistive…

    dysarthric speech · speech recognition · intelligibility assessment · iVectors · factor analysis

  • JustSpeak: enabling universal voice control on Android

    Yu Zhong, T. V. Raman, Casey Burkhardt, Fadi Biadsy, Jeffrey P. Bigham · 2014 · Proceedings of the 11th Web for All Conference (W4A)

    This paper introduces JustSpeak, a universal voice control system for Android that works across all applications without requiring any developer intervention. Unlike Google Now or Siri, which only support pre-defined commands for specific apps, JustSpeak dynamically constructs…

    voice interface · mobile accessibility · blindness · motor accessibility · Android

  • Automatically Identifying Trouble-Indicating Speech Behaviors in Alzheimer's Disease

    Frank Rudzicz, Leila Chan Currie, Andrew Danks, Tejas Mehta, Shunan Zhao · 2014 · ASSETS '14: Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility

    This paper addresses the challenge of automatically detecting communication breakdowns in conversations with people who have Alzheimer's disease (AD). AD is a progressive neurodegenerative disease that deteriorates memory, executive capacity, visual-spatial reasoning, and…

    Alzheimer's disease · dementia · speech recognition · natural language processing · machine learning

  • Capti-Speak: A Speech-Enabled Accessible Web Interface

    Vikas Ashok · 2014 · Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility (ASSETS)

    This paper presents Capti-Speak, a speech-augmented screen reader interface for the web that allows blind users to issue voice commands alongside traditional keyboard shortcuts. Built on top of the Capti web browsing application (which provides a JAWS-like screen reader…

    screen readers · speech recognition · voice interface · web accessibility · blindness

  • Improving Programming Interfaces for People with Limited Mobility Using Voice Recognition

    Xiomara Figueroa Fontánez, Patricia Ordóñez · 2014 · Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility (ASSETS)

    This paper describes an effort to make programming more accessible to people with motor impairments by integrating voice recognition into an Integrated Development Environment (IDE). The work is motivated by the specific case of a computer scientist with spinal muscular atrophy…

    programming accessibility · voice interface · speech recognition · motor disability · spinal muscular atrophy

  • Speech Dasher: A Demonstration of Text Input using Speech and Approximate Pointing

    Keith Vertanen, David J.C. MacKay · 2014 · Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility (ASSETS 2014)

    This paper demonstrates Speech Dasher, a multimodal text entry system that combines speech recognition with the Dasher zooming interface to enable fast, corrected text input using only voice and gaze direction. The core problem addressed is that while speech dictation is fast,…

    speech recognition · eye tracking · gaze input · text entry · error correction

  • Exploring the Use of Speech Input by Blind People on Mobile Devices

    Shiri Azenkot, Nicole B. Lee · 2013 · Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS)

    This paper investigates how blind people use speech input on mobile devices through two studies: a survey of 169 participants (64 blind/low-vision, 105 sighted) and a laboratory study with 8 blind participants composing paragraphs on an iPod Touch using speech dictation versus…

    visual impairment · blindness · speech input · speech recognition · mobile accessibility

  • Architecture of an Automated Therapy Tool for Childhood Apraxia of Speech

    Avinash Parnandi, Virendra Karappa, Youngpyo Son, Mostafa Shahin, Jacqueline McKechnie, Kirrie Ballard, Beena Ahmed, Ricardo Gutierrez-Osuna · 2013 · Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS)

    This paper presents a multi-tier client-server system for remotely administering speech therapy to children with childhood apraxia of speech (CAS), a neurological speech sound disorder that impairs the precision and consistency of oro-motor planning and execution needed for…

    childhood apraxia of speech · speech therapy · speech sound disorder · automated speech analysis · tele-rehabilitation

  • Enhancing Learning Accessibility through Fully Automatic Captioning

    Maria Federico, Marco Furini · 2012 · Proceedings of the International Cross-Disciplinary Conference on Web Accessibility (W4A)

    This paper proposes an architecture for automatically generating synchronized captions for video lectures using off-the-shelf automatic speech recognition (ASR) software, aimed at making educational content accessible to hearing impaired students, dyslexic students, ESL (English…

    captioning · speech recognition · education accessibility · deaf and hard of hearing · automatic speech recognition

  • A Readability Evaluation of Real-Time Crowd Captions in the Classroom

    Raja S. Kushalnagar, Walter S. Lasecki, Jeffrey P. Bigham · 2012 · Proceedings of the 14th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2012)

    This paper evaluates the readability of real-time captions produced by three different approaches in a higher education classroom setting: professional CART (Communication Access Realtime Translation) captionists, automatic speech recognition (ASR), and a novel crowd captioning…

    real-time captioning · deaf and hard of hearing · crowdsourcing · classroom accessibility · higher education

  • Real-Time Captioning by Groups of Non-Experts

    Walter Lasecki, Christopher Miller, Adam Sadilek, Andrew Abumoussa, Donato Borrello, Raja Kushalnagar, Jeffrey Bigham · 2012 · UIST '12: Proceedings of the 25th Annual ACM Symposium on User Interface Software and Technology

    This paper presents Legion:Scribe, an end-to-end system that enables groups of non-expert typists to collectively produce real-time captions for deaf and hard of hearing (DHH) people, offering a cheaper and more available alternative to professional stenographers (CART, costing…

    real-time captioning · crowdsourcing · deaf and hard of hearing · human computation · text alignment

  • Crowdsourcing Correction of Speech Recognition Captioning Errors

    M. Wald · 2011 · Proceedings of the International Cross-Disciplinary Conference on Web Accessibility (W4A)

    This paper describes tools built around Synote, an award-winning web-based application from the University of Southampton, that enable crowdsourced correction of automatic speech recognition (ASR) captioning errors to make video content accessible at scale. The author frames the…

    captioning · speech recognition · crowdsourcing · deaf and hard of hearing · video accessibility