With Few Eyes, All Hoaxes Are Deep

Published in CSCW 2018, 2018

Recommended citation: Sumit Asthana and Aaron Halfaker. 2018. With Few Eyes, All Hoaxes are Deep. In Proceedings of the ACM on Human-Computer Interaction, Vol. 2, CSCW, Article 21 (November 2018). ACM, New York, NY. 18 pages. https://doi.org/10.1145/3274290

Quality control is critical to open production communities like Wikipedia. Editors enact quality control on the borders of Wikipedia to review edits (counter-vandalism) and new article creations (new page patrolling) shortly after they are saved. In this paper, we describe a long-standing set of inefficiencies that have plagued new page patrolling by drawing a contrast to the more efficient, distributed processes for counter-vandalism. To effect better page review distribution, we develop an effective automated topic model based on a labeling strategy that leverages a folksonomy developed by subject-specific working groups in Wikipedia (WikiProject tags) and a flexible ontology (WikiProjects Directory) to arrive at a hierarchical and uniform label set. We are able to attain very high fitness measures (macro ROC-AUC: 95.2%, macro PR-AUC: 74.5%) and real-time performance using word2vec-based features on the initial draft versions of articles. Finally, we present a proposal for how incorporating this model into current tools will shift the dynamics of new article review positively.
