Real-Time Crowd Labeling for Deployable Activity Recognition

Walter S. Lasecki, Young Chol Song, Henry Kautz, Jeffrey P. Bigham · 2013 · Proceedings of the 2013 Conference on Computer Supported Cooperative Work (CSCW 2013) · doi:10.1145/2441776.2441912

Summary

Legion:AR is a system that provides deployable activity recognition by combining real-time crowd labeling with automatic recognition using Hidden Markov Models (HMMs). The system addresses a critical limitation of current activity recognition: automated systems must be trained in advance on known activities and cannot handle novel situations, which makes them difficult to deploy in real-world settings such as the homes of older adults or people with cognitive disabilities. Legion:AR uses an active learning approach in which the HMM attempts to classify activities from sensor data (RFID tags or Microsoft Kinect video) and, when its confidence is low, requests labels from crowd workers in real time. Workers watch live or near-live video streams and enter open-ended text labels describing the activities they observe. Labels from multiple workers are merged using a directed acyclic graph that matches equivalent labels (using Damerau-Levenshtein distance) and identifies the most likely sequence via greedy traversal. These crowd-generated labels then train the HMM online, enabling the system to recognize the same activities automatically when they occur again. The system includes privacy protection features: automatic face/body veiling using colored silhouettes detected via Kinect, low video resolution options, opt-in/opt-out controls via mobile alerts, and time-limited worker sessions.
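
To make the label-matching step concrete, here is a minimal sketch of how free-text labels from several workers might be grouped by edit distance. It is an illustration only, not the Legion:AR implementation: it uses the optimal string alignment variant of Damerau-Levenshtein distance, replaces the paper's directed acyclic graph and greedy traversal with a simple majority vote within one time window, and the names `osa_distance`, `merge_labels`, and the `max_distance` threshold are all assumptions.

```python
from collections import Counter


def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "gn" vs "ng"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def merge_labels(labels, max_distance=2):
    """Group near-identical labels and return the most common spelling
    from the largest group (hypothetical stand-in for the graph merge)."""
    groups = []  # each group is a list of labels; group[0] is its representative
    for raw in labels:
        label = raw.strip().lower()
        for group in groups:
            if osa_distance(label, group[0]) <= max_distance:
                group.append(label)
                break
        else:
            groups.append([label])
    if not groups:
        return None
    largest = max(groups, key=len)
    return Counter(largest).most_common(1)[0][0]


if __name__ == "__main__":
    window = ["making tea", "makign tea", "Making tea", "reading"]
    print(merge_labels(window))
```

In this sketch a typo such as "makign tea" falls into the same group as "making tea" because a single adjacent transposition counts as one edit. A real system would also need to align labels in time across workers, which is the role the graph plays in the approach described above.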

Key findings

The HMM trained on crowd-generated labels achieved 90.2% average precision and recalled all conducted activities in the home monitoring experiments, significantly outperforming labels from a single worker (66.7% precision, 78% recall). Privacy-preserving veils (colored silhouettes covering faces or entire bodies) did not significantly reduce workers' ability to label activities accurately: only one label was missed due to obfuscation. In multi-actor surveillance scenes, workers achieved 85% accuracy when labeling the activities of specific veiled individuals. For complex labeling tasks, groups of 5 workers collectively identified 66% of actions and 90% of objects used in activities, whereas individual workers averaged only 32% of actions and 48% of objects. Crowd workers also discovered 24 unique actions and 12 unique objects that an expert offline labeler had missed, demonstrating that crowds can produce richer and more complete annotations than individual experts. Workers converged on consistent labels quickly, generating only 8 unique labels for the same activities across sessions, and 80% reported that automatic HMM suggestions made agreement easier.

Relevance

This research has direct implications for assistive technology, particularly for supporting aging in place and independent living for people with cognitive disabilities. Activity recognition systems can power prompting tools that help people with dementia or cognitive impairments stay on track with daily tasks like taking medicine, preparing meals, and maintaining hygiene. The key accessibility insight is that fully automated systems are too brittle for real-world deployment — they cannot handle the natural variability of human behavior — but crowd-augmented systems can bridge this gap by labeling novel activities in real time and progressively training the automation. The privacy protection mechanisms (veiling, opt-in/opt-out, resolution reduction) are essential for ethical deployment of monitoring technologies with vulnerable populations, and the finding that privacy veils do not significantly reduce labeling accuracy is encouraging. The system also demonstrates a general framework for human-AI collaboration applicable to many accessibility challenges: use human intelligence when the machine is uncertain, train the machine from human input, and gradually shift toward automation while maintaining human oversight.
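
The three-step framework in the last sentence (ask humans when the machine is uncertain, train on their answers, rely on the machine afterwards) can be sketched as a confidence-gated loop. This is a toy illustration under stated assumptions, not the system's code: `MemorizingModel` is a hypothetical stand-in for the HMM, `ask_crowd` stands in for the real-time crowd request, and the 0.7 threshold is arbitrary.

```python
from typing import Callable, Dict, Hashable, Tuple


class MemorizingModel:
    """Toy recognizer (hypothetical): it only knows patterns it was trained on."""

    def __init__(self) -> None:
        self.known: Dict[Hashable, str] = {}

    def predict(self, features: Hashable) -> Tuple[str, float]:
        # Full confidence for patterns seen before, none otherwise.
        if features in self.known:
            return self.known[features], 1.0
        return "unknown", 0.0

    def train(self, features: Hashable, label: str) -> None:
        self.known[features] = label


def label_segment(
    model: MemorizingModel,
    features: Hashable,
    ask_crowd: Callable[[Hashable], str],
    threshold: float = 0.7,
) -> Tuple[str, str]:
    """Return (label, source), where source is 'model' or 'crowd'."""
    label, confidence = model.predict(features)
    if confidence >= threshold:
        return label, "model"
    # Low confidence: fall back to people, then learn from their answer.
    crowd_label = ask_crowd(features)
    model.train(features, crowd_label)
    return crowd_label, "crowd"


if __name__ == "__main__":
    model = MemorizingModel()
    segment = ("kettle", "cup")
    print(label_segment(model, segment, lambda f: "making tea"))
    print(label_segment(model, segment, lambda f: "making tea"))
```

The first call is answered by the crowd and the second by the model, which mirrors the gradual shift toward automation that the review describes. Human oversight would correspond to keeping the crowd path available whenever confidence drops again.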

Tags: activity recognition · crowdsourcing · human computation · machine learning · aging in place · cognitive accessibility · smart environments · privacy · assistive technology