Reviews

The literature-review database. Every paper Bob has reviewed (he has read many more), with a short summary, key findings, and tags. Browse, filter, search.

Search results

  • Extracting content from accessible web pages

    Suhit Gupta, Gail Kaiser · 2005 · Proceedings of the 2005 International Cross-Disciplinary Workshop on Web Accessibility (W4A)

    This paper from Columbia University presents Crunch, a web proxy tool that applies heuristic-based filters to extract core content from web pages by removing clutter such as advertisements, navigation menus, spacer elements, and extraneous links. Crunch works by parsing HTML…

    content extraction · screen readers · web clutter · DOM · web proxy

1 result.
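
To make the idea in the summary above concrete, here is a minimal sketch of heuristic clutter filtering: parse the HTML and drop the contents of elements that usually hold navigation, scripts, or ads. It is not Crunch's actual implementation. The tag list `CLUTTER_TAGS`, the class `ClutterFilter`, and the function `extract_text` are hypothetical names chosen for illustration, and the only library used is Python's standard `html.parser`.

```python
from html.parser import HTMLParser

# Assumption: a hand-picked set of tags treated as clutter. Real filters
# would use richer heuristics (link density, element size, and so on).
CLUTTER_TAGS = {"script", "style", "nav", "aside", "footer", "iframe"}


class ClutterFilter(HTMLParser):
    """Collects text that is not nested inside a clutter element."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside at least one clutter element
        self.chunks = []     # text fragments kept as core content

    def handle_starttag(self, tag, attrs):
        if tag in CLUTTER_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in CLUTTER_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Return the page text with clutter elements removed."""
    parser = ClutterFilter()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


if __name__ == "__main__":
    page = (
        "<html><body>"
        '<nav><a href="/">Home</a></nav>'
        "<p>Main story.</p>"
        "<script>var x = 1;</script>"
        "<footer>Ads</footer>"
        "</body></html>"
    )
    print(extract_text(page))
```

Filtering by tag name is the simplest possible heuristic. A proxy like the one described would sit between the browser and the server and apply such filters to each response before passing it on.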