Publications of Matthew Lease

Other Author Pages: ACL · ACM · Amazon Science · arXiv · DBLP · Google Scholar · Semantic Scholar · OpenReview · ORCID · SSRN · UT Database

By Year: 2024 · 2023 · 2022 · 2021 · 2020 · 2019 · 2018 · 2017 · 2016 · 2015 · 2014 · 2013 · 2012 · 2011 · 2010 · 2009 · 2008 · 2007 · 2006 · 2005 · 2004 · 2003 · 2002 · 2001 · 2000 · 1999 · 1998

2024

Soumyajit Gupta, Venelin Kovatchev, Maria De-Arteaga, and Matthew Lease. Fairly Accurate: Optimizing Accuracy Parity in Fair Target-Group Detection. Technical Report arXiv:2407.11933, The University of Texas at Austin, July 16 2024. [ bib | tech-report ]

Gauri Kambhatla, Matthew Lease, and Ashwin Rajadesingan. Promoting Constructive Deliberation: Reframing for Receptiveness. Technical Report arXiv:2405.15067, The University of Texas at Austin, May 23 2024. [ bib | tech-report ]

Venelin Kovatchev and Matthew Lease. Benchmark Transparency: Measuring the Impact of Data on Evaluation. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pages 1536--1551, 2024. [ bib | pdf | sourcecode | tech-report ]

Houjiang Liu, Anubrata Das, Alexander Boltz, Didi Zhou, Daisy Pinaroc, Matthew Lease, and Min Kyung Lee. Human-centered NLP Fact-checking: Co-Designing with Fact-checkers using Matchmaking for AI. Proceedings of the ACM on Human-Computer Interaction, 2024. To be presented at the 28th ACM Conference on Computer Supported Cooperative Work (CSCW). [ bib | tech-report ]

Utkarsh Mujumdar. Designing a Multi-Perspective Search System Using Large Language Models and Retrieval Augmented Generation. Master's thesis, School of Information, The University of Texas at Austin, 2024. Winner: Dean's Choice Award. [ bib | pdf | news ]

Terrence Neumann, Sooyong Lee, Maria De-Arteaga, Sina Fazelpour, and Matthew Lease. Diverse, but Divisive: LLMs Can Exaggerate Gender Differences in Opinion Related to Harms of Misinformation. Technical Report arXiv:2401.16558, University of Texas at Austin, January 29 2024. [ bib | tech-report ]

Luis Oala, Manil Maskey, Lilith Bat-Leah, Alicia Parrish, Nezihe Merve Gürel, Tzu-Sheng Kuo, Yang Liu, Rotem Dror, Danilo Brajovic, Xiaozhe Yao, Max Bartolo, William A Gaviria Rojas, Ryan Hileman, Rainier Aliment, Michael W. Mahoney, Meg Risdal, Matthew Lease, Wojciech Samek, Debojyoti Dutta, Curtis G Northcutt, Cody Coleman, Braden Hancock, Bernard Koch, Girmaw Abebe Tadesse, Bojan Karlaš, Ahmed Alaa, Adji Bousso Dieng, Natasha Noy, Vijay Janapa Reddi, James Zou, Praveen Paritosh, Mihaela van der Schaar, Kurt Bollacker, Lora Aroyo, Ce Zhang, Joaquin Vanschoren, Isabelle Guyon, and Peter Mattson. DMLR: Data-centric Machine Learning Research -- Past, Present and Future. Journal of Data-centric Machine Learning Research (DMLR), 2024. 27 pages. See also https://data.mlr.press/. [ bib | pdf | conference-website | tech-report ]

Alex C Williams, Min Bai, Jonathan Buck, Tristan J McKinney, Amy Rechkemmer, Koushik Kalyanaraman, Matthew Lease, Patrick Haffner, Xiong Zhou, Kumar Chellapilla, and Li Erran Li. Snapper: Accelerating Bounding Box Annotation in Object Detection Tasks with Find-and-Snap Tooling. In The 29th ACM Intl. Conference on Intelligent User Interfaces (IUI), pages 471--488, 2024. [ bib | DOI | pdf ]

Yian Wong. Exploring Multiple Perspectives to Mitigate Cognitive Biases through an Integrated Interface to Language Models. Master's thesis, School of Information, The University of Texas at Austin, 2024. [ bib | pdf ]

2023

Alexander Braylan, Madalyn Marabella, Omar Alonso, and Matthew Lease. A General Model for Aggregating Annotations Across Simple, Complex, and Multi-Object Annotation Tasks. Journal of Artificial Intelligence Research (JAIR), 78:901--973, December 2023. Presented at the 2024 Annual AAAI Conference on Artificial Intelligence. [ bib | pdf | data | sourcecode ]

Anubrata Das, Houjiang Liu, Venelin Kovatchev, and Matthew Lease. The State of Human-centered NLP Technology for Fact-checking. Information Processing & Management, 60(2), 2023. [ bib | DOI | pdf | tech-report ]

Caifan Du and Matthew Lease. Voices of Workers: Why a Worker-Centered Approach to Crowd Work Is Challenging. Technical report, University of Texas at Austin, 2023. January 6. arXiv:2212.14471. [ bib | pdf ]

Ruijiang Gao, Maytal Saar-Tsechansky, Maria De-Arteaga, Ligong Han, Wei Sun, Min Kyung Lee, and Matthew Lease. Learning Complementary Policies for Human-AI Teams. Technical report, University of Texas at Austin, 2023. February 6. arXiv:2302.02944. [ bib | pdf ]

Soumyajit Gupta, Sooyong Lee, Maria De-Arteaga, and Matthew Lease. Same Same, But Different: Conditional Multi-Task Learning for Demographic-Specific Toxicity Detection. In Proceedings of the Web Conference, pages 3689--3700, 2023. Additional, shorter video. [ bib | pdf | sourcecode | video | slides | tech-report ]

Danula Hettiachchi, Indigo Holcombe-James, Stephanie Livingstone, Anjalee de Silva, Matthew Lease, Flora D. Salim, and Mark Sanderson. How Crowd Worker Factors Influence Subjective Annotations: A Study of Tagging Misogynistic Hate Speech in Tweets. In Proceedings of the 11th AAAI Conference on Human Computation and Crowdsourcing (HCOMP), pages 38--50, 2023. [ bib | pdf | tech-report ]

Vijay Keswani, Elisa Celis, Krishnaram Kenthapadi, and Matthew Lease. Designing Closed-Loop Models for Task Allocation. In 2nd International Conference on Hybrid Human-Artificial Intelligence (HHAI), pages 17--32. IOS Press, 2023. [ bib | pdf | conference-website | slides | tech-report ]

Sooyong Lee. Multi-Task Learning for Hate Speech Detection. Master's thesis, Department of Computer Science, University of Texas at Austin, 2023. [ bib | pdf ]

Amy Rechkemmer, Alex C. Williams, Matthew Lease, and Li Erran Li. Characterizing Time Spent in Video Object Tracking Annotation Tasks: A Study of Task Complexity in Vehicle Tracking. In Proceedings of the 11th AAAI Conference on Human Computation and Crowdsourcing (HCOMP), pages 140--151, 2023. [ bib | pdf ]

Yiheng Su. Wrapper Boxes for Increasing Model Interpretability via Example-based Explanations. Master's thesis, Department of Computer Science, University of Texas at Austin, 2023. [ bib | pdf ]

Yiheng Su, Junyi Jessy Li, and Matthew Lease. Interpretable by Design: Wrapper Boxes Combine Neural Performance with Faithful Explanations. Technical Report arXiv:2311.08644, University of Texas at Austin, November 15 2023. [ bib | tech-report ]

Sukanya Thapa. Enhancing Worker Management and Supporting External Tasks in Crowdsourced Data Labeling. Master's thesis, School of Information, University of Texas at Austin, 2023. [ bib | pdf ]

Mehmet Deniz Türkmen, Matthew Lease, and Mucahid Kutlu. New Metrics to Encourage Innovation and Diversity in Information Retrieval Approaches. In Proceedings of the 45th European Conference on Information Retrieval (ECIR), pages 239--254, 2023. [ bib | pdf | slides | tech-report ]

Jie Yang, Alessandro Bozzon, Ujwal Gadiraju, and Matthew Lease. Editorial: “Human-Centered AI: Crowd Computing”. In Frontiers in Artificial Intelligence, Special Topic on Human-Centered AI: Crowd Computing. Frontiers, 2023. DOI: 10.3389/frai.2023.1161006. [ bib | pdf ]

2022

Lora Aroyo, Matthew Lease, Praveen Paritosh, and Mike Schaekermann. Data Excellence for AI: Why Should You Care? ACM Interactions, 29(2):66--69, 2022. March-April. [ bib | pdf | tech-report ]

Prateek Chaudhry and Matthew Lease. You Are What You Tweet: Profiling Users by Past Tweets to Improve Hate Speech Detection. In Proceedings of the 17th Annual iConference, pages 195--203, 2022. [ bib | pdf | video | conference-website | slides | tech-report ]

Anubrata Das, Chitrank Gupta, Venelin Kovatchev, Matthew Lease, and Junyi Jessy Li. ProtoTEx: Explaining Model Decisions with Prototype Tensors. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), pages 2986--2997, 2022. [ bib | pdf | sourcecode | video | poster | slides | tech-report ]

Anubrata Das, Houjiang Liu, Venelin Kovatchev, and Matthew Lease. The Need for Human-centered Design in Fact-checking Research. In Proceedings of the 1st Information Processing & Management (IP&M) Conference, 2022. [ bib | pdf | conference-website ]

Ruijiang Gao, Maytal Saar-Tsechansky, Maria De-Arteaga, Ligong Han, Min Kyung Lee, Wei Sun, and Matthew Lease. Robust Human-AI Collaboration with Bandit Feedback. In Conference on Information Systems and Technology (CIST), 2022. Best Student Paper Award. Further revised & retitled as https://arxiv.org/abs/2302.02944. [ bib ]

Soumyajit Gupta, Gurpreet Singh, Raghu Bollapragada, and Matthew Lease. Learning a Neural Pareto Manifold Extractor with Constraints. In Proceedings of the 38th International Conference on Uncertainty in Artificial Intelligence (UAI), pages 749--758, 2022. [ bib | pdf | sourcecode | video | poster | tech-report ]

Venelin Kovatchev, Trina Chatterjee, Venkata S Govindarajan, Jifan Chen, Eunsol Choi, Gabriella Chronis, Anubrata Das, Katrin Erk, Matthew Lease, Junyi Jessy Li, et al. Longhorns at DADC 2022: How many linguists does it take to fool a Question Answering model? A systematic approach to adversarial attacks. In Proceedings of the First Workshop on Dynamic Adversarial Data Collection (DADC) at the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), pages 41--52, 2022. [ bib | pdf | poster | slides | tech-report ]

Venelin Kovatchev, Soumyajit Gupta, Anubrata Das, and Matthew Lease. Fairly Accurate: Learning Optimal Accuracy vs. Fairness Tradeoffs for Hate Speech Detection. Technical report, University of Texas at Austin, 2022. April 15. arXiv:2204.07661. [ bib | pdf ]

Matthew Lease. A Better Way to Measure Annotator Agreement for Complex Tasks. In CloudResearch Conference on Innovations in Online Research, 2022. Presentation. [ bib | video | conference-website | slides ]

Data annotation (aka labeling or coding) is fundamental and ubiquitous in both machine learning (ML) and behavioral sciences. With ML, annotated data enables training supervised learning models and evaluating accuracy. In behavioral science, content analysis codes participant responses for study. Across both, annotator agreement measures (AAMs) assess the extent of agreement between human annotators (e.g., professional labelers, crowdsourcing contributors, or researchers) in labeling data consistently. Establishing such consistency is often a precursor to any subsequent use of annotated data. One of the best known measures, Krippendorff's alpha (KA), usefully supports measurement across any number of annotators. In its most general form, it also supports any annotation task in which distance between annotations can be quantified. However, this form of KA cannot distinguish two distributions of annotation distances having the same mean, rendering it quite brittle in practice. My lab's contribution is three-fold. First, we show any annotation task's evaluation metric can be repurposed as a distance function to facilitate use of KA (achieving generality). Second, we propose a simple change in how distance distributions are compared (using the Kolmogorov-Smirnov test instead of comparing only the means) to boost measure sensitivity (improving robustness). Third, we perform the first benchmarking of KA's general form across a wide range of simulated and real annotation tasks, demonstrating its inaccuracies and our improvement to it. Our approach is trivial to implement, and we also provide it as open source.
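
The contrast described in this abstract, comparing only the means of two distance distributions versus comparing the distributions themselves, can be illustrated with a minimal Python sketch. This is not the lab's released code. The function names, the way within-item and across-item distance samples are formed, and the simplified "1 - mean ratio" agreement score are all assumptions made for illustration; the Kolmogorov-Smirnov statistic is computed by hand to keep the example dependency-free.

```python
from bisect import bisect_right
from itertools import combinations
from statistics import mean


def pairwise_distances(annotations, distance):
    """Split pairwise annotation distances into two samples (hypothetical helper).

    annotations: dict mapping an item id to the list of annotations it received.
    Returns (within, across): distances between annotations of the same item,
    and distances between annotations of different items.
    """
    within, across = [], []
    items = list(annotations.values())
    for labels in items:
        within += [distance(a, b) for a, b in combinations(labels, 2)]
    for labels_i, labels_j in combinations(items, 2):
        across += [distance(a, b) for a in labels_i for b in labels_j]
    return within, across


def mean_ratio_agreement(within, across):
    """Simplified alpha-style score: 1 - (mean within distance / mean across distance)."""
    return 1.0 - mean(within) / mean(across)


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


# Two within-item samples with the same mean but different shapes.
across = [2, 2, 2, 2]
tight = [2, 2, 2, 2]
spread = [0, 0, 4, 4]

# The mean-based score cannot tell them apart: both give 0.0.
print(mean_ratio_agreement(tight, across), mean_ratio_agreement(spread, across))
# The KS statistic can: 0.0 for the identical sample, 0.5 for the spread one.
print(ks_statistic(tight, across), ks_statistic(spread, across))
```

Any task-specific evaluation metric can be passed as `distance`, for example `lambda a, b: abs(a - b)` for numeric ratings. A full significance test would also need a p-value, which this sketch leaves out.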

Vivek Krishna Pradhan, Mike Schaekermann, and Matthew Lease. In Search of Ambiguity: A Three-Stage Workflow Design to Clarify Annotation Guidelines for Crowd Workers. Frontiers in Artificial Intelligence, 2022. ISSN:2624-8212. [ bib | pdf | tech-report ]

Li Shi, Nilavra Bhattacharya, Anubrata Das, Matthew Lease, and Jacek Gwizdka. The Effects of Interactive AI Design on User Behavior: An Eye-tracking Study of Fact-checking COVID-19 Claims. In Proceedings of the 7th ACM SIGIR Conference on Human Information, Interaction and Retrieval (CHIIR), pages 315--320, 2022. [ bib | pdf | demo | sourcecode | video | poster | tech-report ]

Prakhar Singh, Anubrata Das, Junyi Jessy Li, and Matthew Lease. The Case for Claim Difficulty Assessment in Automatic Fact Checking. Technical Report arXiv:2109.09689, University of Texas at Austin, February 4 2022. [ bib | pdf ]

Mehmet Deniz Türkmen, Matthew Lease, and Mucahid Kutlu. A New Evaluation Metric Rewarding Information Retrieval of Hard Documents. Technical Report TR-22-01, University of Texas at Austin, Department of Computer Science, 2022. January 14. [ bib | pdf ]

Didi Zhou. Leveraging Annotator Rationales for Active Learning with Transformers. Bachelor's thesis, Department of Computer Science, University of Texas at Austin, 2022. [ bib | pdf ]

2021

Alexander Braylan and Matthew Lease. Detecting Bias in Complex Annotations. In Third symposium on Biases in Human Computation and Crowdsourcing (BHCC), 2021. Presentation. [ bib | pdf | conference-website | slides ]

Anubrata Das, Sooyong Lee, An Thanh Nguyen, Aditya Kharosekar, Saumyaa Krishnan, Siddhesh Krishnan, Elizabeth Tate, Byron C. Wallace, and Matthew Lease. ExFacto: An Explainable Fact-Checking Tool. In Knight Research Network Tool Demonstration Day, 2021. Presentation. [ bib | video | conference-website | slides ]

Ruijiang Gao, Maytal Saar-Tsechansky, Maria De-Arteaga, Ligong Han, Min Kyung Lee, and Matthew Lease. Human-AI Collaboration with Bandit Feedback. In Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI), pages 1722--1728, 2021. [ bib | pdf | sourcecode | tech-report ]

Soumyajit Gupta, Gurpreet Singh, Anubrata Das, and Matthew Lease. Pareto Solutions vs Dataset Optima: Concepts and Methods for Optimizing Competing Objectives with Constraints in Retrieval. In Proceedings of The 7th ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR), pages 43--52, 2021. [ bib | pdf | video | conference-website | poster | slides ]

Danula Hettiachchi, Mark Sanderson, Jorge Goncalves, Simo Hosio, Gabriella Kazai, Matthew Lease, Mike Schaekermann, and Emine Yilmaz. Investigating and Mitigating Biases in Crowdsourced Data. In Proceedings of the 24th ACM Conference on Computer Supported Cooperative Work (CSCW), pages 331--334, 2021. Accepted Workshop. [ bib | pdf | conference-website ]

Danula Hettiachchi, Mark Sanderson, Jorge Goncalves, Simo Hosio, Gabriella Kazai, Matthew Lease, Mike Schaekermann, and Emine Yilmaz, editors. Investigating and Mitigating Biases in Crowdsourced Data: Workshop Proceedings. The 24th ACM Conference on Computer Supported Cooperative Work (CSCW), 2021. 32 pages. [ bib | pdf | conference-website ]

Danula Hettiachchi, Mike Schaekermann, Tristan J. McKinney, and Matthew Lease. The Challenge of Variable Effort Crowdsourcing and How Visible Gold Can Help. Proceedings of the ACM on Human-Computer Interaction, 5, 2021. Article number 332, 26 pages. Presented at the 24th ACM Conference on Computer Supported Cooperative Work (CSCW). [ bib | DOI | pdf | data | video | tech-report ]

Vijay Keswani, Matthew Lease, and Krishnaram Kenthapadi. Designing human-in-the-loop approaches for closed deferral pipelines. In Third symposium on Biases in Human Computation and Crowdsourcing (BHCC), 2021. 10 pages. [ bib | pdf | conference-website ]

Matthew Lease. Automated Models for Quantifying Centrality of Survey Responses. In CloudResearch Conference on Innovations in Online Research, 2021. Presentation. [ bib | video | conference-website | slides ]

When collecting data online, an automated method to quantify relative centrality of participant responses can provide insights for quality assurance plus assessing responses and participants. For example, given a set of textual responses to a survey question, which responses are most normative vs. others? Which represent the greatest outliers? Over a set of questions, which participants provide the most normative or outlying responses overall? How might such automated quantitative assessment inform analysis of responses and participants? My lab has released an open source library (free for commercial use) that enables such centrality measures to be computed in a general way across arbitrary question types. We have already published several research articles on this work, which I will synthesize for the talk. For any question type (e.g., textual response), we quantify distance between two participant responses via a user-selected distance function. This can be a built-in distance function we provide (e.g., for text responses: Levenshtein, BLEU, or GLEU distance) or any arbitrary distance function provided by the user (e.g., computing semantic distance based on a BERT transformer). Using this specified distance function, we provide a suite of aggregation models of varying complexity (e.g., multi-dimensional scaling with Bayesian priors and hyperparameters) that score responses for each question and aggregate statistics across questions. We report findings across a range of question types, distance functions, and aggregation models.
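
As a rough illustration of distance-based centrality scoring, the Python sketch below scores each text response by its mean Levenshtein distance to every other response to the same question. It does not use the library mentioned in the abstract. The function names are hypothetical, and mean distance is assumed here as the simplest possible aggregation; the abstract's multi-dimensional scaling and Bayesian models are not reproduced.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                       # delete char_a
                curr[j - 1] + 1,                   # insert char_b
                prev[j - 1] + (char_a != char_b),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def mean_distance_scores(responses, distance=levenshtein):
    """Score each response by its mean distance to all other responses.

    Lower scores are more central (normative); higher scores are more outlying.
    Requires at least two responses. Any distance function can be swapped in.
    """
    n = len(responses)
    return [
        sum(distance(r, other) for j, other in enumerate(responses) if j != i) / (n - 1)
        for i, r in enumerate(responses)
    ]


responses = ["yes", "yes", "yep", "no way"]
scores = mean_distance_scores(responses)
outlier = responses[scores.index(max(scores))]
print(scores)
print(outlier)  # prints "no way"
```

Summing or averaging these per-question scores for each participant would give a participant-level measure across a set of questions.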

Matthew Lease. Designing Human-AI Partnerships for Annotation, Moderation, and Fact-Checking. In Workshop on Human-Machine Partnerships in the Future of Work at the 24th ACM Conference on Computer Supported Cooperative Work (CSCW), 2021. [ bib | pdf | conference-website ]

Matthew Lease, Mohammad Hossein Jarrahi, and Saiph Savage. Data Labeling Work in the AI Ecosystem and Opportunities for Improvement. In Rabb Symposium on Embedding AI in Society, NC State University, February 2021. Presentation. [ bib | video | conference-website | slides ]

Matthew Lease, Mohammad Hossein Jarrahi, and Saiph Savage. Designing for the Global Workers During the Pandemic. In Workshop on The Global Labours of AI and Data Intensive Systems at the 24th ACM Conference on Computer Supported Cooperative Work (CSCW), 2021. Presentation. [ bib | video | conference-website ]

Christian Staal Bruun Overgaard, Anthony Dudo, Matthew Lease, Gina M. Masullo, Natalie Jomini Stroud, Scott R. Stroud, and Samuel C. Woolley. Building connective democracy: Interdisciplinary solutions to the problem of polarisation. In Howard Tumber and Silvio Waisbord, editors, The Routledge Companion to Media Disinformation and Populism, volume 1, chapter 53, pages 569--578. Routledge, 2021. ISBN 9780367435769, SSRN 3831634. [ bib | conference-website | tech-report ]

Md Mustafizur Rahman. Reliable and low-cost test collections construction using machine learning. PhD thesis, School of Information, University of Texas at Austin, August 2021. [ bib | pdf ]

Md Mustafizur Rahman, Dinesh Balakrishnan, Dhiraj Murthy, Mucahid Kutlu, and Matthew Lease. Addressing Content Selection Bias in Creating Datasets for Hate Speech Detection. In Proceedings of the Workshop on Data-Centric AI at the 35th Conference on Neural Information Processing Systems (NeurIPS), 2021. 4 pages. [ bib | pdf | sourcecode | video | conference-website | slides ]

Md Mustafizur Rahman, Dinesh Balakrishnan, Dhiraj Murthy, Mucahid Kutlu, and Matthew Lease. An Information Retrieval Approach to Building Datasets for Hate Speech Detection. In Proceedings of the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS): Datasets and Benchmarks Track, 2021. 15 pages. [ bib | pdf | supplementary-materials | sourcecode | video | poster | slides | tech-report ]

Gurpreet Singh, Soumyajit Gupta, Matthew Lease, and Clint Dawson. A Hybrid 2-stage Neural Optimization for Pareto Front Extraction. Technical Report arXiv:2101.11684, University of Texas at Austin, 2021. January 27. [ bib | pdf ]

Miriah Steiger, Timir J. Bharucha, Sukrit Venkatagiri, Martin J. Riedl, and Matthew Lease. The Psychological Well-Being of Content Moderators: The Emotional Labor of Commercial Moderation and Avenues for Improving Support. In Proceedings of the ACM CHI Conference on Human Factors in Computing Systems, pages 1--14, 2021. [short 30-second preview video]. [ bib | pdf | video | tech-report ]

2020

Anubrata Das, Brandon Dang, and Matthew Lease. Fast, Accurate, and Healthier: Interactive Blurring Helps Moderators Reduce Exposure to Harmful Content. In Proceedings of the 8th AAAI Conference on Human Computation and Crowdsourcing (HCOMP), pages 33--42, 2020. [ bib | pdf | demo | blog-post | sourcecode | video | slides ]

Mucahid Kutlu, Tyler McDonnell, Tamer Elsayed, and Matthew Lease. Annotator Rationales for Labeling Tasks in Crowdsourcing. Journal of Artificial Intelligence Research (JAIR), 69:143--189, 2020. Award Winning Papers Track. [ bib | pdf | blog-post | data | conference-website ]

Matthew Lease. Designing Assistive AI Technologies to Support Human Judging of Information Reliability. In Virtual Conference on Social-Cybersecurity in Times of Crisis and Change, Center for Informed Democracy & Social-Cybersecurity (IDeaS), Carnegie Mellon University, 2020. Presentation abstract. [ bib | pdf | conference-website | slides ]

Matthew Lease, Miriah Steiger, Timir J. Bharucha, Martin J. Riedl, and Sukrit Venkatagiri. Promoting Psychological Wellness of Content Moderators. Technical Report TR-20-02, University of Texas at Austin, Department of Computer Science, 2020. June 1. [ bib | pdf ]

An Thanh Nguyen. Probabilistic modeling with human factors in machine learning. PhD thesis, Computer Science, University of Texas at Austin, May 2020. [ bib | pdf ]

Md Mustafizur Rahman, Mucahid Kutlu, Tamer Elsayed, and Matthew Lease. Efficient Test Collection Construction via Active Learning. In Proceedings of The 6th ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR), pages 177--184, 2020. [ bib | pdf | tech-report ]

Gurpreet Singh, Soumyajit Gupta, and Matthew Lease. Extracting Optimal Solution Manifolds using Constrained Neural Optimization. Technical Report arXiv:2009.06024, University of Texas at Austin, 2020. September 13. [ bib | pdf ]

2019

Alexander Braylan and Matthew Lease. Distance-based Consensus Modeling for Complex Annotations. In Workshop on Aggregating and Analysing Crowdsourced Annotations for NLP (AnnoNLP) at the EMNLP-IJCNLP Conference, 2019. 6 pages. [ bib | pdf | sourcecode | conference-website | slides ]

Alexander Braylan and Matthew Lease. Modeling Complex Annotations. In Doctoral Consortium at the AAAI Human Computation and Crowdsourcing (HCOMP) Conference, 2019. 4 pages. [ bib | pdf | sourcecode | conference-website | slides ]

Anubrata Das and Matthew Lease. A Conceptual Framework for Evaluating Fairness in Search. Technical report, University of Texas at Austin, July 2019. arXiv:1907.09328. [ bib | pdf ]

Anubrata Das, Kunjan Mehta, and Matthew Lease. CobWeb: A Research Prototype for Exploring User Bias in Political Fact-Checking. In ACM SIGIR Workshop on Fairness, Accountability, Confidentiality, Transparency, and Safety in Information Retrieval (FACTS-IR), 2019. 8 pages. [ bib | pdf | conference-website ]

Soumyajit Gupta, Mucahid Kutlu, Vivek Khetan, and Matthew Lease. Correlation, Prediction and Ranking of Evaluation Metrics in Information Retrieval. In Proceedings of the 41st European Conference on Information Retrieval (ECIR), pages 636--651, 2019. Best Student Paper award. [ bib | pdf | news | data | sourcecode | slides | tech-report ]

Qiwei Li. Clickbait and Emotional Language in Fake News. Bachelor's thesis, University of Texas at Austin, Department of Computer Science, 2019. [ bib | pdf ]

An Thanh Nguyen, Matthew Lease, and Byron C. Wallace. Explainable Modeling of Annotations in Crowdsourcing. In Proceedings of the 24th Annual ACM Intelligent User Interfaces (IUI) conference, pages 575--579, 2019. [ bib | pdf | data ]

An Thanh Nguyen, Matthew Lease, and Byron C. Wallace. Mash: software tools for developing interactive and transparent machine learning systems. In Proceedings of ACM IUI Workshop on Explainable Smart Systems (ExSS), 2019. 6 pages. [ bib | pdf ]

Md Mustafizur Rahman, Mucahid Kutlu, and Matthew Lease. Constructing Test Collections using Multi-armed Bandits and Active Learning. In Proceedings of the Web Conference, pages 3158--3164, 2019. [ bib | pdf | sourcecode ]

Adam Roegiest, Aldo Lipani, Alex Beutel, Alexandra Olteanu, Ana Lucic, Ana-Andreea Stoica, Anubrata Das, Asia Biega, Bart Voorn, Claudia Hauff, Damiano Spina, David Lewis, Douglas W. Oard, Emine Yilmaz, Faegheh Hasibi, Gabriella Kazai, Graham McDonald, Hinda Haned, Iadh Ounis, Ilse van der Linden, Jean Garcia-Gathright, Joris Baan, Kamuela N. Lau, Krisztian Balog, Maarten de Rijke, Mahmoud Sayed, Maria Panteli, Mark Sanderson, Matthew Lease, Michael D. Ekstrand, Preethi Lahoti, and Toshihiro Kamishima. FACTS-IR: Fairness, Accountability, Confidentiality, Transparency, and Safety in Information Retrieval. SIGIR Forum, 53(2):20--43, December 2019. Alexandra Olteanu, Jean Garcia-Gathright, Maarten de Rijke, and Michael D. Ekstrand, editors. [ bib | pdf ]

Ye Zhang. Neural NLP Models Under Low-supervision Scenarios. PhD thesis, Computer Science, University of Texas at Austin, May 2019. [ bib | pdf ]

2018

Brandon Dang, Martin J. Riedl, and Matthew Lease. But Who Protects the Moderators? The Case of Crowdsourced Image Moderation. In 6th AAAI Conference on Human Computation and Crowdsourcing (HCOMP): Works-in-Progress Track, 2018. 5 pages, peer-reviewed, non-archival. Demo URL updated since publication. [ bib | pdf | demo | blog-post | sourcecode | conference-website | slides ]

Brandon Dang, Martin J. Riedl, and Matthew Lease. Toward Safer Crowdsourced Content Moderation. In 6th ACM Collective Intelligence Conference, 2018. 5 pages. Peer-reviewed, non-archival. Extended version at AAAI HCOMP 2018. Demo URL updated since publication. [ bib | pdf | demo | blog-post | sourcecode | conference-website | slides ]

Tanya Goyal, Tyler McDonnell, Mucahid Kutlu, Tamer Elsayed, and Matthew Lease. Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to Ensure Quality Relevance Annotations. In Proceedings of the 6th AAAI Conference on Human Computation and Crowdsourcing (HCOMP), pages 41--49, 2018. Online version here includes corrections to official version from proceedings. [ bib | pdf | data | sourcecode | slides ]

Mucahid Kutlu, Tamer Elsayed, Maram Hasanain, and Matthew Lease. When Rank Order isn't Enough: New Statistical-Significance-Aware Correlation Measures. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management (CIKM), pages 397--406, 2018. [ bib | pdf ]

Mucahid Kutlu, Tamer Elsayed, and Matthew Lease. Learning to Effectively Select Topics For Information Retrieval Test Collections. Information Processing and Management (IPM), 54(1):37--59, 2018. [ bib | DOI | pdf | tech-report ]

Mucahid Kutlu, Tyler McDonnell, Yassmine Barkallah, Tamer Elsayed, and Matthew Lease. Crowd vs. Expert: What Can Rationales behind Relevance Judgments Tell Us About Assessor Disagreement? In Proceedings of the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 805--814, 2018. [ bib | pdf | data ]

Mucahid Kutlu, Tyler McDonnell, Aashish Sheshadri, Tamer Elsayed, and Matthew Lease. Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collections Accurately and Affordably. In Proceedings of the 1st Biannual Conference on the Design of Experimental Search & Information REtrieval Systems (DESIRES), pages 42--46, 2018. CEUR Workshop Proceedings Vol-2167, http://ceur-ws.org/Vol-2167. [ bib | pdf | conference-website | slides | tech-report ]

Matthew Lease. Fact Checking and Information Retrieval. In Proceedings of the 1st Biannual Conference on the Design of Experimental Search & Information REtrieval Systems (DESIRES), pages 97--98, 2018. CEUR Workshop Proceedings Vol-2167, http://ceur-ws.org/Vol-2167. [ bib | pdf | conference-website | slides ]

Matthew Lease and Omar Alonso. Crowdsourcing and Human Computation: Introduction. In Reda Alhajj and Jon Rokne, editors, Encyclopedia of Social Network Analysis and Mining, pages 499--510, New York, NY, 2018. Springer New York. [ bib | DOI | pdf ]

An Thanh Nguyen, Aditya Kharosekar, Saumyaa Krishnan, Siddhesh Krishnan, Elizabeth Tate, Byron C. Wallace, and Matthew Lease. Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking. In Proceedings of the 31st ACM User Interface Software and Technology Symposium (UIST), pages 189--199, 2018. Interface sourcecode: https://github.com/utir/fcweb2-py3. [ bib | pdf | demo | sourcecode | video | slides ]

An Thanh Nguyen, Aditya Kharosekar, Matthew Lease, and Byron C. Wallace. An Interpretable Joint Graphical Model for Fact-Checking from Crowds. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), pages 1511--1518, 2018. [ bib | pdf | demo | sourcecode | video | slides ]

Jeffrey V. Nickerson and Matthew Lease. Work in the Age of Intelligent Machines: Report of the Workshop Held at the Sixth AAAI Conference on Human Computation and Crowdsourcing. AI Magazine, 39(4):60, 2018. Workshop organized by Jeffrey V. Nickerson, Matthew Lease, Kevin Crowston, and Ingrid Erickson. [ bib | pdf | conference-website ]

Kezban Dilek Onal, Ye Zhang, Ismail Sengor Altingovde, Md Mustafizur Rahman, Pinar Karagoz, Alexander Braylan, Brandon Dang, Heng-Lu Chang, Henna Kim, Quinten McNamara, Aaron Angert, Edward Banner, Vivek Khetan, Tyler McDonnell, An Thanh Nguyen, Dan Xu, Byron C. Wallace, Maarten de Rijke, and Matthew Lease. Neural Information Retrieval: At the End of the Early Years. Information Retrieval, 21(2-3):111--182, 2018. [ bib | DOI | pdf | slides | tech-report ]

2017

Akash Mankar, Riddhi J. Shah, and Matthew Lease. Design Activism for Minimum Wage Crowd Work. In 5th AAAI Conference on Human Computation and Crowdsourcing (HCOMP): Works-in-Progress Track, 2017. See extended technical report: arXiv:1706.10097. [ bib | pdf | sourcecode | poster | tech-report ]

Tyler McDonnell, Mucahid Kutlu, Tamer Elsayed, and Matthew Lease. The Many Benefits of Annotator Rationales for Relevance Judgments. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI): Sister Conference Best Paper Track, pages 4909--4913, 2017. [ bib | pdf | blog-post | data | conference-website | slides ]

An Thanh Nguyen, Junyi Jessy Li, Ani Nenkova, Byron C. Wallace, and Matthew Lease. Aggregating and Predicting Sequence Labels from Crowd Annotations. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL), pages 299--309, 2017. [ bib | pdf | data | sourcecode ]

Ye Zhang, Matthew Lease, and Byron C. Wallace. Active Discriminative Text Representation Learning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI), pages 3386--3392, 2017. Also accepted for encore presentation at the 2nd Workshop on Representation Learning for NLP (RepL4NLP) at the 55th Annual Meeting of the Association for Computational Linguistics (ACL). [ bib | pdf | conference-website ]

Ye Zhang, Matthew Lease, and Byron C. Wallace. Exploiting Domain Knowledge via Grouped Weight Sharing with Application to Text Categorization. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL), pages 155--160, 2017. [ bib | pdf | tech-report ]

Xi Zheng, Akanksha Bansal, and Matthew Lease. Bullseye: Structured Passage Retrieval and Document Highlighting for Scholarly Search. In The Thirteenth Asia-Pacific Conference on Conceptual Modelling (APCCM), held as part of the Australasian Computer Science Week (ACSW) Multiconference, 2017. 4 pages. [ bib | DOI | pdf | conference-website | tech-report ]

2016

Brandon Dang, Miles Hutson, and Matthew Lease. MmmTurkey: A Crowdsourcing Framework for Deploying Tasks and Recording Worker Behavior on Amazon Mechanical Turk. In Proceedings of the 4th AAAI Conference on Human Computation and Crowdsourcing (HCOMP): Works-in-Progress Track, 2016. 3 pages. arXiv:1609.00945. [ bib | pdf | sourcecode ]

Arpita Ghosh and Matthew Lease, editors. Proceedings of the 4th AAAI Conference on Human Computation and Crowdsourcing (HCOMP). Association for the Advancement of Artificial Intelligence (AAAI), 2016. ISBN 978-1-57735-774-2, 290 pages. [ bib | pdf | conference-website ]

Matthew Lease. Crowdsourcing for Success: Motivations, Design, & Ethics. In Workshop on Novel Incentives and Engineering Unique Workflows (NIEUW), organized by the Linguistic Data Consortium (LDC), 2016. 2 pages. [ bib | pdf | conference-website ]

Matthew Lease, Gordon V. Cormack, An Thanh Nguyen, Thomas A. Trikalinos, and Byron C. Wallace. Systematic Review is e-Discovery in Doctor's Clothing. In Proceedings of the Medical Information Retrieval (MedIR) Workshop at the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2016. 2 pages. [ bib | pdf | slides ]

Tyler McDonnell, Matthew Lease, Mucahid Kutlu, and Tamer Elsayed. Why Is That Relevant? Collecting Annotator Rationales for Relevance Judgments. In Proceedings of the 4th AAAI Conference on Human Computation and Crowdsourcing (HCOMP), pages 139--148, 2016. Best Paper Award. [ bib | pdf | news | blog-post | data | slides ]

An Thanh Nguyen, Matthew Halpern, Byron C. Wallace, and Matthew Lease. Probabilistic Modeling for Crowdsourcing Partially-Subjective Ratings. In Proceedings of the 4th AAAI Conference on Human Computation and Crowdsourcing (HCOMP), pages 149--158, 2016. [ bib | pdf | blog-post | data | sourcecode ]

An Thanh Nguyen, Byron C. Wallace, and Matthew Lease. A Correlated Worker Model for Grouped, Imbalanced and Multitask Data. In Proceedings of the 32nd International Conference on Uncertainty in Artificial Intelligence (UAI), pages 537--546, 2016. [ bib | pdf | sourcecode ]

Yalin Sun, Pengxiang Cheng, Shengwei Wang, Hao Lyu, Matthew Lease, Iain Marshall, and Byron C. Wallace. Crowdsourcing Information Extraction for Biomedical Systematic Reviews. In 4th AAAI Conference on Human Computation and Crowdsourcing (HCOMP): Works-in-Progress Track, 2016. 3 pages. arXiv:1609.01017. [ bib | pdf ]

Reem Suwaileh, Mucahid Kutlu, Nihal Fathima, Tamer Elsayed, and Matthew Lease. ArabicWeb16: A New Crawl for Today's Arabic Web. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 673--676, 2016. [ bib | pdf | data ]

Ye Zhang, Md Mustafizur Rahman, Alex Braylan, Brandon Dang, Heng-Lu Chang, Henna Kim, Quinten McNamara, Aaron Angert, Edward Banner, Vivek Khetan, Tyler McDonnell, An Thanh Nguyen, Dan Xu, Byron C. Wallace, and Matthew Lease. Neural Information Retrieval: A Literature Review. Technical report, University of Texas at Austin, November 2016. arXiv:1611.06792. [ bib | pdf | slides ]

2015

James Cheng, Monisha Manoharan, Matthew Lease, and Yan Zhang. Is there a Doctor in the Crowd? Diagnosis Needed! (for less than $5). In Proceedings of the iConference, 2015. 16 pages. [ bib | pdf ]

Hyun Joon Jung. Temporal Modeling Crowd Work for Quality Assurance in Crowdsourcing. PhD thesis, School of Information, University of Texas at Austin, December 2015. [ bib | pdf ]

Hyun Joon Jung and Matthew Lease. A Discriminative Approach to Predicting Assessor Accuracy. In Proceedings of the 37th European Conference on Information Retrieval (ECIR), pages 159--171, 2015. Samsung Human-Tech Paper Award: Silver Prize in Computer Science. [ bib | pdf | news ]

Hyun Joon Jung and Matthew Lease. Forecasting Crowd Work Quality via Multi-dimensional Features of Workers. In ICML Workshop on Crowdsourcing and Machine Learning (CrowdML), 2015. 10 pages. [ bib | pdf ]

Hyun Joon Jung and Matthew Lease. Modeling Temporal Crowd Work Quality with Limited Supervision. In Proceedings of the 3rd AAAI Conference on Human Computation (HCOMP), pages 83--91, 2015. [ bib | pdf ]

An Thanh Nguyen, Byron C. Wallace, and Matthew Lease. Combining Crowd and Expert Labels using Decision Theoretic Active Learning. In Proceedings of the 3rd AAAI Conference on Human Computation (HCOMP), pages 120--129, 2015. [ bib | pdf ]

Donna Vakharia and Matthew Lease. Beyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms. In Proceedings of the iConference, 2015. 17 pages. [ bib | pdf | tech-report ]

Haofeng Zhou. Crowdsourcing Construction of Information Retrieval Test Collections for Conversational Speech. Master's thesis, School of Information, University of Texas at Austin, May 2015. Reader: Byron C. Wallace. [ bib | pdf ]

2014

Tatiana Josephy, Matthew Lease, and Praveen Paritosh. Crowdsourcing at Scale: Workshop Held at the First AAAI Conference on Human Computation and Crowdsourcing. AI Magazine, 35(2):76--77, 2014. [ bib | pdf | conference-website ]

Hyun Joon Jung. Quality Assurance in Crowdsourcing via Matrix Factorization based Task Routing. In Proceedings of World Wide Web (WWW) Ph.D. Symposium, Companion Publication, pages 3--8, 2014. [ bib | pdf | conference-website ]

Hyun Joon Jung, Yubin Park, and Matthew Lease. Predicting Next Label Quality: A Time-Series Model of Crowdwork. In Proceedings of the 2nd AAAI Conference on Human Computation (HCOMP), pages 87--95, 2014. [ bib | pdf ]

Matthew Lease and Omar Alonso. Crowdsourcing and Human Computation, Introduction. In Reda Alhajj and Jon Rokne, editors, Encyclopedia of Social Network Analysis and Mining, pages 304--315, New York, NY, 2014. Springer New York. [ bib | DOI | pdf ]

Ethan Petuchowski and Matthew Lease. TurKPF: TurKontrol as a Particle Filter. Technical report, University of Texas at Austin, April 2014. arXiv:1404.5078. [ bib | pdf | sourcecode ]

Aashish Sheshadri. A Collaborative Approach to IR Evaluation. Master's thesis, Department of Computer Science, University of Texas at Austin, May 2014. Co-Supervisors: Kristen Grauman and Matthew Lease. [ bib | pdf ]

Mark D. Smucker, Gabriella Kazai, and Matthew Lease. Overview of the TREC 2013 Crowdsourcing Track. In Proceedings of the 22nd NIST Text Retrieval Conference (TREC), 2014. 6 pages. [ bib | pdf | conference-website ]

Yinglong Zhang, Jin Zhang, Matthew Lease, and Jacek Gwizdka. Multidimensional Relevance Modeling via Psychometrics and Crowdsourcing. In Proceedings of the 37th international ACM SIGIR conference on Research and Development in Information Retrieval, pages 435--444, 2014. [ bib | pdf | data ]

2013

Paul Bennett and Matthew Lease, editors. Proceedings of the Research Track from the 4th Annual CrowdConf. Online, 2013. 21 pages. [ bib | pdf | conference-website ]

Kenneth R. Fleischmann, Sean P. Goggins, James Howison, Matthew Lease, and Douglas W. Oard. Calling all computer scientists and social scientists: Establishing a research agenda for computational social science. In Proceedings of the iConference, pages 1035--1036, 2013. [ bib | pdf ]

Hyun Joon Jung and Matthew Lease. Crowdsourced Task Routing via Matrix Factorization. Technical report, University of Texas at Austin, October 2013. arXiv:1310.5142. [ bib | pdf ]

Hyun Joon Jung and Matthew Lease. UT Austin in the TREC 2012 Crowdsourcing Track's Image Relevance Assessment Task. In Proceedings of the 21st NIST Text Retrieval Conference (TREC), 2013. 12 pages. [ bib | pdf ]

Aniket Kittur, Jeffrey V. Nickerson, Michael S. Bernstein, Elizabeth Gerber, Aaron Shaw, John Zimmerman, Matthew Lease, and John J. Horton. The Future of Crowd Work. In Proceedings of the ACM Conference on Computer Supported Cooperative Work (CSCW), pages 1301--1318, February 2013. Social Science Research Network (SSRN) ID: 2190946. [ bib | pdf | blog-post ]

Matthew Lease, Jessica Hullman, Jeffrey P. Bigham, Michael S. Bernstein, Juho Kim, Walter S. Lasecki, Saeideh Bakhshi, Tanushree Mitra, and Robert C. Miller. Mechanical Turk is Not Anonymous. Technical report, Social Science Research Network (SSRN), March 6, 2013. SSRN ID: 2228728. [ bib | pdf | blog-post ]

Matthew Lease, Praveen Paritosh, and Tatiana Josephy, editors. Proceedings of the AAAI Human Computation Workshop on Crowdsourcing at Scale (CrowdScale). Online, Palm Springs, CA, November 2013. 36 pages. [ bib | conference-website ]

Matthew Lease and Emine Yilmaz. Crowdsourcing for Information Retrieval: Introduction to the Special Issue. Information Retrieval (Springer), 16(2):91--100, April 2013. [ bib | pdf | conference-website ]

Matthew Lease and Emine Yilmaz, editors. Crowdsourcing for Information Retrieval (Special Issue). Information Retrieval (Springer), April 2013. 16(2):91--305. [ bib | pdf ]

Hohyon Ryu and Matthew Lease. Generating Automatic Keywords for Conversational Speech ASR Transcripts. In 1st ACM SIGIR Workshop on the Exploration, Navigation and Retrieval of Information in Cultural Heritage (ENRICH), 2013. 4 pages. [ bib | pdf | conference-website ]

Ripon Saha, Matthew Lease, Sarfraz Khurshid, and Dewayne Perry. Improving Bug Localization using Structured Information Retrieval. In Proceedings of the 28th IEEE/ACM International Conference on Automated Software Engineering (ASE), pages 345--355, 2013. [ bib | pdf | data | conference-website ]

Aashish Sheshadri and Matthew Lease. SQUARE: A Benchmark for Research on Computing Crowd Consensus. In Proceedings of the 1st AAAI Conference on Human Computation (HCOMP), pages 156--164, 2013. [ bib | pdf | data ]

Aashish Sheshadri and Matthew Lease. SQUARE: Benchmarking Crowd Consensus at MediaEval. In Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, 2013. 2 pages. CEUR Workshop Proceedings Vol-1043, http://ceur-ws.org/Vol-1043. [ bib | pdf | data | conference-website ]

Mark D. Smucker, Gabriella Kazai, and Matthew Lease. Overview of the TREC 2012 Crowdsourcing Track. In Proceedings of the 21st NIST Text Retrieval Conference (TREC), 2013. 12 pages. [ bib | pdf | conference-website ]

Haofeng Zhou, Dennis Baskov, and Matthew Lease. Crowdsourcing Transcription Beyond Mechanical Turk. In AAAI HCOMP Workshop on Scaling Speech, Language Understanding and Dialogue through Crowdsourcing (SSLUD), 2013. 8 pages. [ bib | pdf | conference-website ]

2012

James Allan, Jay Aslam, Leif Azzopardi, Nick Belkin, Pia Borlund, Peter Bruza, Jamie Callan, Mark Carman, Charles L.A. Clarke, Nick Craswell, W. Bruce Croft, J. Shane Culpepper, Fernando Diaz, Susan Dumais, Nicola Ferro, Shlomo Geva, Julio Gonzalo, David Hawking, Kalervo Jarvelin, Gareth Jones, Rosie Jones, Jaap Kamps, Noriko Kando, Evangelos Kanoulas, Jussi Karlgren, Diane Kelly, Matthew Lease, Jimmy Lin, Stefano Mizzaro, Alistair Moffat, Vanessa Murdock, Douglas W. Oard, Maarten de Rijke, Tetsuya Sakai, Mark Sanderson, Falk Scholer, Luo Si, James A. Thom, Paul Thomas, Andrew Trotman, Andrew Turpin, Arjen P. de Vries, William Webber, Xiuzhen Zhang, and Yi Zhang. Frontiers, Challenges, and Opportunities for Information Retrieval -- Report from SWIRL 2012, The Second Strategic Workshop on Information Retrieval in Lorne. SIGIR Forum, 46(1):2--32, 2012. James Allan, Bruce Croft, Alistair Moffat, and Mark Sanderson, editors. [ bib | pdf ]

Hyun Joon Jung and Matthew Lease. Evaluating Classifiers Without Expert Labels. Technical report, University of Texas at Austin, December 2012. arXiv:1212.0960. [ bib | pdf ]

Hyun Joon Jung and Matthew Lease. Improving Quality of Crowdsourced Labels via Probabilistic Matrix Factorization. In Proceedings of the 4th Human Computation Workshop (HCOMP) at AAAI, pages 101--106, 2012. [ bib | pdf | conference-website ]

Hyun Joon Jung and Matthew Lease. Inferring Missing Relevance Judgments from Crowd Workers via Probabilistic Matrix Factorization. In Proceedings of the 35th international ACM SIGIR conference on Research and Development in Information Retrieval, pages 1095--1096, 2012. [ bib | pdf ]

Abhimanu Kumar. Supervised language models for temporal resolution of text in absence of explicit temporal cues. Master's thesis, Department of Computer Science, University of Texas at Austin, May 2012. Supervisor: Joydeep Ghosh. Readers: Jason Baldridge and Matthew Lease. [ bib | pdf ]

Abhimanu Kumar, Jason Baldridge, Matthew Lease, and Joydeep Ghosh. Dating Texts without Temporal Cues. Technical report, University of Texas at Austin, November 2012. arXiv:1211.2290. [ bib | pdf ]

Di Liu, Randolph Bias, Matthew Lease, and Rebecca Kuipers. Crowdsourcing for Usability Testing. In Proceedings of the 75th Annual Meeting of the American Society for Information Science and Technology (ASIS&T), October 28--31 2012. 10 pages. [ bib | pdf | tech-report ]

Di Liu, Matthew Lease, Rebecca Kuipers, and Randolph Bias. Crowdsourcing for Usability Testing. Technical report, School of Information, University of Texas at Austin, March 2012. arXiv:1203.1468. [ bib | pdf ]

Hohyon Ryu, Matthew Lease, and Nicholas Woodward. Finding and Exploring Memes in Social Media. In Proceedings of the 23rd ACM Conference on Hypertext and Social Media, pages 295--304. ACM, 2012. [ bib | pdf | demo | sourcecode | video ]

Shilpa Shukla, Matthew Lease, and Ambuj Tewari. Parallelizing ListNet Training using Spark. In Proceedings of the 35th international ACM SIGIR conference on Research and Development in Information Retrieval, pages 1127--1128, 2012. [ bib | pdf | sourcecode ]

Stephen Wolfson. Crowdsourcing and the Law. Master's thesis, School of Information, University of Texas at Austin, May 2012. Supervisor: Matthew Lease. Reader: James Howison. [ bib | pdf ]

2011

Lu Guo and Matthew Lease. Personalizing Local Search with Twitter. In Workshop on Enriching Information Retrieval (ENIR) at the 34th Annual ACM SIGIR Conference, 2011. 2 pages, Oral presentation. [ bib | pdf | sourcecode | video | conference-website ]

Hyun Joon Jung and Matthew Lease. Improving Consensus Accuracy via Z-score and Weighted Voting. In Proceedings of the 3rd Human Computation Workshop (HCOMP) at AAAI, pages 88--90, 2011. [ bib | pdf | blog-post | conference-website ]

Jorn Klinger and Matthew Lease. Enabling Trust in Crowd Labor Relations through Identity Sharing. In Proceedings of the 74th Annual Meeting of the American Society for Information Science and Technology (ASIS&T), 2011. 4 pages. [ bib | pdf | conference-website ]

Abhimanu Kumar and Matthew Lease. Learning to Rank From a Noisy Crowd. In Proceedings of the 34th Annual ACM SIGIR Conference, pages 1221--1222, 2011. Separately reviewed and accepted for encore presentation at the 3rd Human Computation Workshop (HCOMP) at AAAI 2011. Appears in SIGIR proceedings only. [ bib | pdf ]

Abhimanu Kumar and Matthew Lease. Modeling Annotator Accuracies for Supervised Learning. In Proceedings of the Workshop on Crowdsourcing for Search and Data Mining (CSDM) at the Fourth ACM International Conference on Web Search and Data Mining (WSDM), pages 19--22, Hong Kong, China, February 2011. [ bib | pdf | conference-website | slides ]

Abhimanu Kumar, Matthew Lease, and Jason Baldridge. Supervised Language Modeling for Temporal Resolution of Texts. In Proceedings of the 20th ACM Conference on Information and Knowledge Management (CIKM), pages 2069--2072, 2011. [ bib | pdf ]

Matthew Lease. Crowd Computing: Opportunities and Challenges. In Keynote at the 5th International Joint Conference on Natural Language Processing (IJCNLP), Chiang Mai, Thailand, November 2011. [ bib | conference-website | slides ]

Matthew Lease. On Quality Control and Machine Learning in Crowdsourcing. In Proceedings of the 3rd Human Computation Workshop (HCOMP) at AAAI, pages 97--102, 2011. Separately refereed and accepted for encore presentation at the AAAI Spring Symposium 2012: Wisdom of the Crowd. [ bib | pdf | conference-website ]

Matthew Lease, Vitor Carvalho, and Emine Yilmaz. Crowdsourcing for Search and Data Mining. ACM SIGIR Forum, 45(1):18--24, June 2011. [ bib | pdf | conference-website ]

Matthew Lease, Vitor Carvalho, and Emine Yilmaz, editors. Proceedings of the Workshop on Crowdsourcing for Search and Data Mining (CSDM) at the Fourth ACM International Conference on Web Search and Data Mining (WSDM). Online, Hong Kong, China, February 2011. 38 pages. [ bib | pdf | conference-website ]

Matthew Lease and Gabriella Kazai. Overview of the TREC 2011 Crowdsourcing Track (Conference Notebook). In 20th Text Retrieval Conference (TREC), 2011. 10 pages. [ bib ]

Matthew Lease and Emine Yilmaz. Crowdsourcing for Information Retrieval. ACM SIGIR Forum, 45(2):66--75, December 2011. [ bib | pdf ]

Matthew Lease, Emine Yilmaz, Alexander Sorokin, and Vaughn Hester, editors. Proceedings of the 2nd Workshop on Crowdsourcing for Information Retrieval at the 34th ACM International Conference on Information Retrieval (SIGIR 2011). Online, Beijing, China, July 2011. 65 pages. [ bib | pdf | conference-website ]

Hohyon Ryu and Matthew Lease. Crowdworker Filtering with Support Vector Machine. In Proceedings of the 74th Annual Meeting of the American Society for Information Science and Technology (ASIS&T), 2011. 4 pages. [ bib | pdf ]

Elben Shira and Matthew Lease. Expert Search on Code Repositories. Technical Report TR-11-42, Department of Computer Science, University of Texas at Austin, December 2011. [ bib | pdf ]

Wei Tang and Matthew Lease. Semi-Supervised Consensus Labeling for Crowdsourcing. In ACM SIGIR Workshop on Crowdsourcing for Information Retrieval (CIR), pages 36--41, 2011. [ bib | pdf | conference-website ]

Aibo Tian and Matthew Lease. Active Learning to Maximize Accuracy vs. Effort in Interactive Information Retrieval. In Proceedings of the 34th international ACM SIGIR conference on Research and Development in Information Retrieval, pages 145--154, 2011. [ bib | pdf ]

Stephen Wolfson and Matthew Lease. Look Before You Leap: Legal Pitfalls of Crowdsourcing. In Proceedings of the 74th Annual Meeting of the American Society for Information Science and Technology (ASIS&T), 2011. 10 pages. [ bib | pdf | conference-website | tech-report ]

Yongyi Zhou, Ramona Broussard, and Matthew Lease. Mobile options for online public access catalogs. In Proceedings of the iConference, pages 598--605. ACM, 2011. [ bib | pdf | video | conference-website ]

2010

Ramona Broussard, Yongyi Zhou, and Matthew Lease. Mobile Phone Search for Library Catalogs. In Proceedings of the 73rd Annual Meeting of the American Society for Information Science and Technology (ASIS&T), 2010. 4 pages. [ bib | pdf | sourcecode | video | slides ]

Ramona Broussard, Yongyi Zhou, and Matthew Lease. University of Texas Mobile Library Search. In Proceedings of the 73rd Annual Meeting of the American Society for Information Science and Technology (ASIS&T), 2010. 2 pages. [ bib | pdf | video ]

Chris Buckley, Matthew Lease, and Mark D. Smucker. Overview of the TREC 2010 Relevance Feedback Track (Notebook). In The Nineteenth Text Retrieval Conference (TREC) Notebook, 2010. 4 pages. [ bib | pdf ]

Marc Cartright, Jangwon Seo, and Matthew Lease. UMass Amherst and UT Austin at the TREC'09 Relevance Feedback Track. In Proceedings of the 18th Text Retrieval Conference (TREC'09), 2010. 10 pages. [ bib | pdf ]

Vitor Carvalho, Matthew Lease, and Emine Yilmaz. Crowdsourcing for Search Evaluation. ACM SIGIR Forum, 44(2):17--22, December 2010. [ bib | pdf | conference-website ]

Catherine Grady and Matthew Lease. Crowdsourcing Document Relevance Assessment with Mechanical Turk. In Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, pages 172--179, Los Angeles, June 2010. Association for Computational Linguistics. [ bib | pdf | data | conference-website ]

Adriana Kovashka and Matthew Lease. Human and Machine Detection of Stylistic Similarity in Art. In Proceedings of the 1st Annual Conference on the Future of Distributed Work (CrowdConf), San Francisco, September 2010. 9 pages. [ bib | pdf | conference-website ]

Matthew Lease, Vitor Carvalho, and Emine Yilmaz, editors. Proceedings of the ACM SIGIR 2010 Workshop on Crowdsourcing for Search Evaluation (CSE 2010). Online, Geneva, Switzerland, July 2010. 42 pages. [ bib | pdf | conference-website ]

Saeedeh Momtazi, Matthew Lease, and Dietrich Klakow. Effective Term Weighting for Sentence Retrieval. In Proceedings of the 14th European Conference on Research and Advanced Technology for Digital Libraries (ECDL), volume 6273 of Lecture Notes in Computer Science (LNCS), pages 482--485. Springer-Verlag, 2010. [ bib | pdf ]

Eunho Yang, Pradeep Ravikumar, and Matthew Lease. A new class of ranking functions for DCG-like evaluation metrics using conditional probability models. Technical Report AI14-02 (AI report), Department of Computer Science, University of Texas at Austin, October 29 2010. 8 pages. [ bib | pdf ]

2009

Matthew Lease. An Improved Markov Random Field Model for Supporting Verbose Queries. In Proceedings of the 32nd Annual ACM SIGIR Conference, pages 476--483, 2009. [ bib | pdf ]

Matthew Lease. Beyond Keywords: Finding Information More Accurately and Easily Using Natural Language. PhD thesis, Brown University Dept. of Computer Science, August 24, 2009. Degree conferred May 2010. [ bib | pdf ]

Matthew Lease. Incorporating Relevance and Pseudo-relevance Feedback in the Markov Random Field Model: Brown at the TREC'08 Relevance Feedback Track. In Proceedings of the 17th Text Retrieval Conference (TREC'08), 2009. Best results in track. This paper supersedes an earlier version appearing in the conference's Working Notes. [ bib | pdf | data ]

Matthew Lease, James Allan, and W. Bruce Croft. Regression Rank: Learning to Meet the Opportunity of Descriptive Queries. In Proceedings of the 31st European Conference on Information Retrieval (ECIR), pages 90--101, 2009. [ bib | pdf | data ]

2008

Matthew Lease and Eugene Charniak. A Dirichlet-smoothed Bigram Model for Retrieving Spontaneous Speech. In Advances in Multilingual and Multimodal Information Retrieval: 8th Workshop of the Cross-Language Evaluation Forum, CLEF 2007, Revised Selected Papers, volume 5152 of Lecture Notes in Computer Science. Springer-Verlag, 2008. [ bib | pdf ]

2007

Matthew Lease. Natural Language Processing for Information Retrieval: the time is ripe (again). In Proceedings of the 1st Ph.D. Workshop at the ACM Conference on Information and Knowledge Management (PIKM), 2007. Best Paper award. [ bib | pdf ]

Matthew Lease and Eugene Charniak. Brown at CL-SR'07: Retrieving Conversational Speech in English and Czech. In Working Notes of the Cross-Language Evaluation Forum (CLEF): Cross-Language Speech Retrieval (CL-SR) track, 2007. Corrected version. [ bib | pdf ]

2006

Ann Bies, Stephanie Strassel, Haejoong Lee, Kazuaki Maeda, Seth Kulick, Yang Liu, Mary Harper, and Matthew Lease. Linguistic Resources for Speech Parsing. In Fifth International Conference on Language Resources and Evaluation (LREC'06), Genoa, Italy, 2006. [ bib | pdf ]

John Hale, Izhak Shafran, Lisa Yung, Bonnie Dorr, Mary Harper, Anna Krasnyanskaya, Matthew Lease, Yang Liu, Brian Roark, Matthew Snover, et al. PCFGs with syntactic and prosodic indicators of speech repairs. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics, pages 161--168. Association for Computational Linguistics, 2006. [ bib | pdf ]

Matthew Lease, Eugene Charniak, Mark Johnson, and David McClosky. A Look At Parsing and Its Applications. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06), 16--20 July 2006. [ bib | pdf ]

Matthew Lease and Mark Johnson. Early Deletion of Fillers In Processing Conversational Speech. In Proceedings of the Human Language Technology Conference of the NAACL (HLT-NAACL'06), Companion Volume: Short Papers, pages 73--76, New York City, USA, June 2006. Association for Computational Linguistics. Version here corrects Table 2 in published version. [ bib | pdf ]

Matthew Lease, Mark Johnson, and Eugene Charniak. Recognizing disfluencies in conversational speech. IEEE Transactions on Audio, Speech and Language Processing, 14(5):1566--1573, September 2006. [ bib | pdf ]

B. Roark, Yang Liu, M. Harper, R. Stewart, M. Lease, M. Snover, I. Shafran, B. Dorr, J. Hale, A. Krasnyanskaya, and L. Yung. Reranking for Sentence Boundary Detection in Conversational Speech. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'06), pages 545--548, May 14--19 2006. [ bib | pdf ]

Brian Roark, Mary Harper, Eugene Charniak, Bonnie Dorr, Mark Johnson, Jeremy G. Kahn, Yang Liu, Mari Ostendorf, John Hale, Anna Krasnyanskaya, Matthew Lease, Izhak Shafran, Matthew Snover, Robin Stewart, and Lisa Yung. SParseval: Evaluation Metrics for Parsing Speech. In Fifth International Conference on Language Resources and Evaluation (LREC'06), Genoa, Italy, 2006. [ bib | pdf ]

2005

Mary Harper, Bonnie Dorr, John Hale, Brian Roark, Izhak Shafran, Matthew Lease, Yang Liu, Matthew Snover, Lisa Yung, Anna Krasnyanskaya, and Robin Stewart. Parsing Speech and Structural Event Detection (PASSED): CLSP Summer Workshop Final Report. Technical report, Johns Hopkins University, 2005. [ bib | pdf | conference-website | slides ]

Jeremy G. Kahn, Matthew Lease, Eugene Charniak, Mark Johnson, and Mari Ostendorf. Effective Use of Prosody in Parsing Conversational Speech. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (EMNLP'05), pages 233--240, Vancouver, British Columbia, Canada, October 2005. Association for Computational Linguistics. [ bib | pdf ]

Matthew Lease. Parsing and Disfluency Modeling. Technical Report CS-05-15, Brown University Department of Computer Science, 2005. [ bib | pdf ]

Matthew Lease and Eugene Charniak. Parsing Biomedical Literature. In R. Dale, K.-F. Wong, J. Su, and O. Kwong, editors, Proceedings of the 2nd International Joint Conference on Natural Language Processing (IJCNLP'05), volume 3651 of Lecture Notes in Computer Science (LNCS), pages 58--69, Jeju Island, Korea, October 11--13 2005. Springer-Verlag. [ bib | pdf | data ]

Matthew Lease, Eugene Charniak, and Mark Johnson. Parsing and its applications for conversational speech. In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'05), volume 5, pages 961--964, March 18 - March 23 2005. [ bib | pdf ]

2004

Mark Johnson, Eugene Charniak, and Matthew Lease. An Improved Model For Recognizing Disfluencies in Conversational Speech. In Rich Transcription 2004 Fall Workshop (RT-04F), 2004. [ bib | pdf ]

2003

Matthew Lease and Guy Eddon. SmartElevator: Revitalizing A Legacy Device through Inexpensive Augmentation. In Proceedings of the IEEE 23rd International Conference on Distributed Computing Systems (ICDCS): 3rd International Workshop on Smart Appliances and Wearable Computing, pages 254--259, 2003. [ bib | pdf ]

2002

A. LaMarca, W. Brunette, D. Koizumi, M. Lease, S.B. Sigurdsson, K. Sikorski, D. Fox, and G. Borriello. PlantCare: An Investigation in Practical Ubiquitous Systems. In Proceedings of the 4th International Conference on Ubiquitous Computing (UBICOMP), volume 2498 of Lecture Notes in Computer Science (LNCS), pages 316--332. Springer, 2002. [ bib | pdf ]

Anthony LaMarca, Waylon Brunette, David Koizumi, Matthew Lease, Stefan B. Sigurdsson, Kevin Sikorski, Dieter Fox, and Gaetano Borriello. Making Sensor Networks Practical with Robots. In Pervasive '02: Proceedings of the First International Conference on Pervasive Computing, volume 2414 of Lecture Notes in Computer Science (LNCS), pages 152--166. Springer-Verlag, 2002. [ bib | pdf ]

Matthew Lease. Plan-Aware Behavioral Modeling. In Adjunct Proceedings of 4th Intl. Conference on Ubiquitous Computing (UBICOMP), pages 35--36, 2002. [ bib | pdf ]

2001

2000

1999

I.J. Kalet, J. Wu, M. Lease, M.M. Austin-Seymour, J.F. Brinkley, and C. Rosse. Anatomical information in radiation treatment planning. In Proceedings of the American Medical Informatics Association (AMIA) Fall Symposium, 1999. [ bib | pdf ]

1998

I.J. Kalet, R.S. Giansiracusa, C. Wilcox, and M. Lease. Radiation Therapy Planning: an Uncommon Application of Lisp. In R. Gabriel, editor, Proceedings of the Conference on the 40th Anniversary of Lisp, 1998. [ bib | pdf ]