Reviews

The literature-review database. Every paper Bob has reviewed (he has read many more), with a short summary, key findings, and tags. Browse, filter, search.

Search results

  • ADCanvas: Accessible and Conversational Audio Description Authoring for Blind and Low Vision Creators

    Franklin Mingzhe Li, Michael Xieyang Liu, Cynthia L Bennett, Shaun K. Kane · 2026 · Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI '26)

    Li and colleagues tackle a rarely examined corner of accessibility: the fact that the tools used to produce Audio Description (AD) are themselves largely inaccessible to the blind and low-vision (BLV) creators who are often its most skilled practitioners. Professional AD…

    audio description · blind and low vision · conversational agent · multimodal LLM · visual question answering

  • How Multimodal Large Language Models Support Access to Visual Information: A Diary Study With Blind and Low Vision People

    Ricardo E. Gonzalez Penuela, Crescentia Jung, Sharon Lin, Ruiying Hu, Shiri Azenkot · 2026 · Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI '26)

    This CHI 2026 paper reports a two-week diary study with 20 blind and low vision (BLV) participants (ages 19–75, 11 female/9 male, 13 blind/7 low vision) investigating how multimodal large language models (MLLMs) support real-world access to visual information. The authors built…

    AI · accessibility · multimodal large language models · MLLM · visual question answering

  • Say It My Way: Exploring Control in Conversational Visual Question Answering with Blind Users

    Farnaz Zamiri Zeraati, Yang Cao, Yuehan Qiao, Hal Daumé III, Hernisa Kacorri · 2026 · Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI '26)

    This CHI 2026 paper investigates how blind users can exert control over responses generated by conversational visual question answering (VQA) systems built on vision-language models. While prompting and steering techniques are well established in general-purpose generative AI,…

    blind users · generative AI · visual question answering · VQA · personalization

  • ViDscribe: Multimodal AI for Customizing Audio Description and Question Answering in Online Videos

    Maryam S Cheema, Sina Elahimanesh, Pooyan Fazli, Hasti Seifi · 2026 · Extended Abstracts of the 2026 CHI Conference on Human Factors in Computing Systems (CHI EA '26)

    Cheema and colleagues (Arizona State University and Saarland University) present ViDscribe, a web platform that layers AI-generated audio description (AD) and conversational visual question answering (VQA) on top of arbitrary YouTube videos for blind and low vision (BLV)…

    video accessibility · audio description · blind and low vision · multimodal large language models · visual question answering

  • Probing the Gaps in ChatGPT's Live Video Chat for Real-World Assistance for People who are Blind or Visually Impaired

    Ruei-Che Chang, Rosiana Natalie, Wenqian Xu, Jovan Zheng Feng Yap, Anhong Guo · 2025 · ASSETS 2025: 27th International ACM SIGACCESS Conference on Computers and Accessibility

    This paper evaluates ChatGPT's Advanced Voice with Video feature — OpenAI's state-of-the-art live video AI released in December 2024 — as a real-world assistive tool for blind and visually impaired (BVI) individuals. The researchers conducted an in-person exploratory study with…

    blind · visually impaired · large multimodal models · live video · ChatGPT

  • The Potential of a Visual Dialogue Agent In a Tandem Automated Audio Description System for Videos

    Abigale Stangl, Shasta Ihorn, Yue-Ting Siu, Aditya Bodi, Mar Castanon, Lothar D Narins, Ilmi Yoon · 2023 · Proceedings of the 25th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2023)

    This paper presents and evaluates a tandem AI-based audio description (AD) system for videos that combines two complementary tools: NarrationBot, which delivers automated minimum viable descriptions (MVD) of video content, and InfoBot, a visual dialogue agent that allows users…

    audio description · blind and low vision · visual question answering · visual dialogue · AI

  • Investigating the Appropriateness of Social Network Question Asking as a Resource for Blind Users

    Erin L. Brady, Yu Zhong, Meredith Ringel Morris, Jeffrey P. Bigham · 2013 · Proceedings of the 2013 Conference on Computer Supported Cooperative Work (CSCW 2013)

    This paper explores whether social networking sites (SNSs) are a viable alternative to paid crowdsourcing for answering blind users' visual questions. The research combines three methods: a survey of 191 blind adults about their social networking habits and attitudes toward…

    social media accessibility · blind users · crowdsourcing · friendsourcing · visual question answering

  • Visual Challenges in the Everyday Lives of Blind People

    Erin Brady, Meredith Ringel Morris, Yu Zhong, Samuel White, Jeffrey P. Bigham · 2013 · CHI '13: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems

    This paper presents the findings of a year-long large-scale study of VizWiz Social, an iPhone application that allows blind users to take a photograph, record a spoken question about it, and receive answers from crowd workers or social contacts within about a minute. Between May…

    blind users · crowdsourcing · mobile accessibility · VizWiz · visual question answering

  • Using Real-time Feedback to Improve Visual Question Answering

    Yu Zhong, Phyo Thiha, Grant He, Walter Lasecki, Jeffrey Bigham · 2012 · CHI EA '12: CHI '12 Extended Abstracts on Human Factors in Computing Systems

    This work-in-progress paper introduces Legion:View, a system that extends the VizWiz model of crowd-powered visual question answering by adding a real-time feedback loop between blind users and crowd workers. The original VizWiz allowed blind users to take a still photograph,…

    visual question answering · crowdsourcing · blind users · real-time systems · assistive technology

  • What the Disability Community Can Teach Us About Interactive Crowdsourcing

    Jeffrey P. Bigham, Richard E. Ladner · 2011 · Interactions

    This short forum article argues that the disability community has been practicing interactive crowdsourcing long before the term became mainstream in computing, and that mainstream crowdsourcing systems have much to learn from these experiences. The authors trace how people with…

    crowdsourcing · assistive technology · disability community · visual question answering · sign language interpreting

  • Analyzing Visual Questions from Visually Impaired Users

    Erin L. Brady · 2011 · The Proceedings of the 13th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS)

    This doctoral consortium paper presents an analysis of the types of visual questions that visually impaired users ask through VizWiz, a mobile phone application that provides near-realtime answers to visual questions. VizWiz allows users to take a photo with their phone, speak a…

    blindness and low vision · crowdsourcing · computer vision · mobile accessibility · visual question answering

  • VizWiz: Nearly Real-Time Answers to Visual Questions

    Jeffrey P. Bigham, Chandrika Jayant, Hanjie Ji, Greg Little, Andrew Miller, Robert C. Miller, Aubrey Tatarowicz, Brandyn White, Samuel White, Tom Yeh · 2010 · Proceedings of the 2010 International Cross Disciplinary Conference on Web Accessibility (W4A)

    This paper introduces VizWiz, a pioneering mobile application that enables blind and low-vision people to get nearly real-time answers to visual questions by connecting their smartphone cameras to remote paid workers on Amazon Mechanical Turk. Users take a photo with their…

    blind and low vision · crowdsourcing · assistive technology · mobile accessibility · human computation

12 results.