Research Article
DOI: 10.1145/3581641.3584039

Supporting Requesters in Writing Clear Crowdsourcing Task Descriptions Through Computational Flaw Assessment

Published: 27 March 2023

ABSTRACT

Quality control is an essential, if not the essential, challenge in crowdsourcing. Unsatisfactory responses from crowd workers have been found to result particularly from ambiguous and incomplete task descriptions, often written by inexperienced task requesters. However, creating clear task descriptions with sufficient information is a complex process for requesters in crowdsourcing marketplaces. In this paper, we investigate the extent to which requesters can be supported effectively in this process through computational techniques. To this end, we developed a tool that enables requesters to iteratively identify and correct eight common clarity flaws in their task descriptions before deployment on the platform. The tool can be used to write task descriptions from scratch or to assess and improve the clarity of prepared descriptions. It employs machine learning-based natural language processing models, trained on real-world task descriptions, that score a given task description for the eight clarity flaws. On this basis, the requester can iteratively revise and reassess the task description until it reaches a sufficient level of clarity. In a first user study, we had requesters create task descriptions using the tool and then rate different aspects of the tool's helpfulness. We then carried out a second user study in which crowd workers, who are confronted with such descriptions in practice, rated the clarity of the created task descriptions. According to our results, 65% of the requesters rated the helpfulness of the information provided by the tool as high or very high (only 12% as low or very low). The requesters saw some room for improvement, though, for example concerning the display of bad examples. Nevertheless, 76% of the crowd workers judged the overall clarity of the task descriptions created by the requesters using the tool to be improved over the initial versions. In line with this, the automatically computed clarity scores of the edited task descriptions were generally higher than those of the initial descriptions, indicating that the tool reliably predicts the overall clarity of task descriptions.
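The iterative workflow described above can be pictured with a short sketch. The snippet below is a minimal, hypothetical illustration, not the authors' implementation: the flaw labels, the heuristic scorer, and the function names are all assumptions standing in for the paper's trained NLP models. Only the overall flow matches the abstract, namely scoring a description for each clarity flaw, letting the requester revise it, and reassessing until every score is acceptable.

```python
"""Minimal sketch of an iterative clarity-assessment loop.

Hypothetical illustration only: the flaw labels, the scoring heuristic, and
the function names are assumptions. In the paper's tool, each score comes
from a machine learning-based NLP model trained on real-world task
descriptions.
"""

from typing import Callable, Dict

# Placeholder labels standing in for the eight clarity flaws assessed by the tool.
FLAWS = [
    "incomplete_instructions", "ambiguous_wording", "missing_examples",
    "unclear_input", "unclear_output", "unclear_purpose",
    "inconsistent_terminology", "poor_structure",
]


def score_flaws(description: str) -> Dict[str, float]:
    """Return a severity score in [0, 1] per flaw (1 = severe flaw).

    Stand-in heuristic: very short descriptions score worse on every flaw.
    """
    length_penalty = 1.0 / (1.0 + len(description.split()) / 50.0)
    return {flaw: round(length_penalty, 2) for flaw in FLAWS}


def revise_until_clear(
    description: str,
    revise: Callable[[str, Dict[str, float]], str],
    threshold: float = 0.5,
    max_rounds: int = 5,
) -> str:
    """Score the description, hand flagged flaws to the requester's revision
    step, and reassess until every flaw score drops below the threshold."""
    for _ in range(max_rounds):
        scores = score_flaws(description)
        flagged = {f: s for f, s in scores.items() if s >= threshold}
        if not flagged:
            break  # description is considered sufficiently clear
        description = revise(description, flagged)
    return description


if __name__ == "__main__":
    draft = "Label the images."
    # A trivial revision step that appends more detail each round; in the
    # actual tool, the requester edits the text in the interface.
    final = revise_until_clear(
        draft,
        revise=lambda text, flags: text + " Each image shows a street scene; "
        "mark every visible traffic sign with a bounding box.",
    )
    print(final)
```

In the real system, `score_flaws` would presumably wrap one trained classifier per flaw rather than a length heuristic, and the revision step is the requester rewriting the description based on the reported flaws before rescoring.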


Published in
IUI '23: Proceedings of the 28th International Conference on Intelligent User Interfaces
March 2023, 972 pages
ISBN: 9798400701061
DOI: 10.1145/3581641
              Copyright © 2023 ACM


Publisher
Association for Computing Machinery, New York, NY, United States
