
Speech-Based Cursor Control: A Study of Grid-Based Solutions

Liwei Dai, Rich Goldman, Andrew Sears, Jeremy Lozier · 2004 · Proceedings of the 6th International ACM SIGACCESS Conference on Computers and Accessibility (Assets 04) · doi:10.1145/1028630.1028648

Summary

This paper investigates grid-based approaches to speech-controlled cursor positioning, addressing a well-documented weakness of speech recognition technology: while dictation-based text entry has improved dramatically, using speech for precise cursor control remains slow and error-prone. The authors developed two variations of a 3x3 grid overlay system that recursively subdivides the screen to allow progressively finer target selection. The first variation (nine-cursor) places a cursor at the center of each grid cell, letting users select a target by naming the cell number containing it; the grid then zooms into that cell and the process repeats. The second variation (one-cursor) places a single cursor in the center cell, requiring users to navigate to the correct cell before zooming. Both solutions use IBM ViaVoice for speech recognition and support "Move" commands for fine adjustments and a "Back" command to recover from errors. The research builds on the authors' earlier work with spinal cord injury users, which confirmed that existing speech-based cursor control solutions were inadequate for practical use. The grid-based approach is theoretically appealing because it uses discrete commands rather than continuous movement, reducing the vocabulary needed and limiting the impact of recognition errors.
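The recursive subdivision described above can be illustrated with a minimal sketch. It is not the authors' implementation. The names `Rect`, `cursor_after`, and `commands_for` are hypothetical, and the row-major cell numbering (1 at top left, 9 at bottom right) is an assumption, since the summary does not state how cells are numbered. The sketch models only the nine-cursor variation: each spoken cell number zooms the grid into that cell, and the cursor of interest sits at the center of the selected cell.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned screen region (hypothetical helper, not from the paper)."""
    x: float
    y: float
    w: float
    h: float

    def cell(self, n):
        """Return sub-cell n of a 3x3 grid; 1-9 row-major numbering is assumed."""
        if not 1 <= n <= 9:
            raise ValueError("cell number must be between 1 and 9")
        row, col = divmod(n - 1, 3)
        return Rect(self.x + col * self.w / 3,
                    self.y + row * self.h / 3,
                    self.w / 3,
                    self.h / 3)

    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


def cursor_after(screen, spoken):
    """Zoom into each spoken cell in turn; return the final cell's center."""
    region = screen
    for n in spoken:
        region = region.cell(n)
    return region.center()


def commands_for(screen, tx, ty, half_size, max_depth=20):
    """Cell numbers to speak until a cell center falls inside a square target.

    (tx, ty) is the target center and half_size is half its side length.
    """
    region = screen
    spoken = []
    for _ in range(max_depth):
        # Pick the cell that contains the target point (clamped to the grid).
        col = min(int((tx - region.x) / (region.w / 3)), 2)
        row = min(int((ty - region.y) / (region.h / 3)), 2)
        n = row * 3 + col + 1
        spoken.append(n)
        region = region.cell(n)
        cx, cy = region.center()
        if abs(cx - tx) <= half_size and abs(cy - ty) <= half_size:
            return spoken
    raise RuntimeError("target not reached within max_depth zoom levels")


screen = Rect(0, 0, 900, 900)
print(cursor_after(screen, [1, 9]))
print(commands_for(screen, 250, 250, 10))
```

In this model the number of commands depends on how many zoom levels are needed before a cell center lands inside the target, which is governed by the target's size and its alignment with the grid rather than by how far it is from the cursor's starting point. That is consistent with the distance result reported under Key findings.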

Key findings

In an experiment with 24 university students divided equally between the two grid conditions, the nine-cursor solution produced significantly faster target selection times (M=8.38 seconds) than the one-cursor solution (M=9.91 seconds), confirming the advantage of allowing users to select any cell directly. Target size had a significant effect on both completion time and accuracy, with smaller targets requiring more time and producing more errors. Notably, the grid-based approach eliminated the effect of target distance on selection time, a meaningful improvement over direction-based solutions, where distance significantly affects performance. Compared to earlier speech-based cursor control studies, the grid solutions reduced selection times for large targets by at least 33%, and for small targets the improvement averaged 55%. Error rates for large targets decreased by nearly 70%, and small targets saw 85% fewer errors. The nine-cursor solution did produce significantly more errors than the one-cursor solution, likely because the larger vocabulary increased recognition errors. Subjective ratings showed the nine-cursor solution was rated significantly more comfortable, though other satisfaction measures did not differ significantly.

Relevance

This research is relevant to accessibility practitioners working on hands-free computing solutions for people with motor disabilities, particularly spinal cord injuries and other conditions that prevent mouse use. The grid-based approach offers a practical alternative to direction-based speech cursor control, with the key advantage that distance to the target no longer affects task completion time. For practitioners designing voice-controlled interfaces, the finding that the nine-cursor solution trades slightly higher error rates for substantially faster performance illustrates a common accessibility design tension between speed and accuracy. A limitation is that the study used non-disabled university students rather than the target population of people with motor impairments, so the results represent a performance ceiling rather than real-world usage. The authors acknowledge this and reference their prior work with spinal cord injury users as motivation, but the transfer of these findings to users with disabilities remains to be validated.

Tags: speech recognition · cursor control · motor accessibility · voice interaction · input methods · grid-based navigation · hands-free computing