Matthew Lease
The University of Texas at Austin

Associate Professor
School of Information
University of Texas at Austin

1616 Guadalupe Ste 5.202
Austin, TX 78701-1213
Location & Directions • Parking map

Voice: (512) 471-9350 • Fax: (512) 471-3971
Office: UTA 5.536 • Lab: UTA 5.520
Campus box: D8600

Lab: Information Retrieval & Crowdsourcing (about)
Papers • Talks (Videos & Slides) • Data & Software
Scholar • CV • LinkedIn • Twitter

Leadership Team: Good Systems, a UT Austin 8-year Grand Challenge to design responsible AI technologies
Academic Affiliations: Information • Computer Science • Statistics & Data Sciences • Robotics
Research Affiliations: Machine Learning Lab • Intelligent Systems • Computational Linguistics • Center for Media Engagement
Faculty Advisory Board: Texas Advanced Computing Center
External engagement: Amazon Scholar



Areas (specific): Information Retrieval (IR) • Crowdsourcing & Human Computation (HCOMP) • Natural Language Processing (NLP)
Areas (broad): Artificial Intelligence (AI) • Human-Computer Interaction (HCI)

Overview: IR is the science behind search engines such as Google. Crowdsourcing and human computation engage online workers to train or augment automated artificial intelligence algorithms. My IR research seeks to improve core search algorithms, to reliably evaluate search systems, and to enable new forms of search. My HCOMP research seeks to optimize crowdsourced data collection (e.g., quality, cost, and speed), to expand the reach of crowdsourcing to tackle new problems, and to investigate broader socio-technical questions of how paid crowdsourcing is transforming digital work and the lives of workers engaged in it. At the intersection of IR and HCOMP, I develop crowdsourcing methods to better scale IR evaluation methodology while preserving its reliability. Both IR and HCOMP place people at the center of computing: system users in IR and online workers in HCOMP. I thus seek to orchestrate effective man-machine partnerships that creatively blend front-end HCI design with back-end AI modeling of people and their tasks. By capitalizing on the respective strengths of each party, man and machine, each can compensate for the other's limitations, creating a whole greater than the sum of its parts. For example, IR systems can utilize front-end HCI design to empower searcher intuition and creativity, while back-end AI algorithms interpret ambiguous human queries, sift through vast information, and suggest potentially relevant results. In HCOMP, front-end HCI design can enable workers to more easily understand and complete tasks, while back-end AI modeling of workers and tasks enables principled optimization of data collection.

Brief Biography. Matthew Lease received degrees in Computer Science from Brown University (PhD, MSc) and the University of Washington (BSc). His research on information retrieval and crowdsourcing was recognized by three Early Career awards: from the Defense Advanced Research Projects Agency (DARPA), the National Science Foundation (NSF), and the Institute of Museum and Library Services (IMLS). More recent honors include Best Student Paper at the 2019 European Conference on Information Retrieval (ECIR) and Best Paper at the 2016 Association for the Advancement of Artificial Intelligence (AAAI) Human Computation and Crowdsourcing conference (HCOMP). From 2010 to 2013, Lease ran benchmarking challenges for the National Institute of Standards and Technology (NIST) Text Retrieval Conference (TREC). Lease's industry experience includes stints at Intel Research, computer game company HyperBole Studios, image compression startup LizardTech, crowdsourcing startup CrowdFlower, and Amazon.

Research Lab. See Lease's Information Retrieval & Crowdsourcing Lab for more information.



HCOMP 2014 Doctoral Consortium (with Loren Terveen)
TREC 2011-2013 Crowdsourcing Track (with Gabriella Kazai & Mark Smucker)
HCOMP 2013 Workshop: CrowdScale: accepted papers & shared task challenge (with Tatiana Josephy & Praveen Paritosh)
CrowdConf 2013 Research Track (with Paul Bennett) -- see brief interview
Springer Information Retrieval: April 2013 Special Issue on Crowdsourcing (volume 16 no. 2)
SWIRL 2012: Second Strategic Workshop on Information Retrieval in Lorne
SIGIR 2011 Workshop on Crowdsourcing for Information Retrieval (July 28, 2011)
WSDM 2011 Workshop on Crowdsourcing for Search and Data Mining (CSDM 2011), with Vitor Carvalho and Emine Yilmaz (February 9, 2011)
TREC 2010 Relevance Feedback Track, with Chris Buckley and Mark Smucker (November 17-19, 2010)
SIGIR 2010 Workshop on Crowdsourcing for Search Evaluation (CSE2010), with Vitor Carvalho and Emine Yilmaz (July 23, 2010)

SELECTED PAST TALKS (slides on SlideShare, some videos on lab webpage)

Panel Talk at Microsoft Faculty Summit: Video · Abstract (July 15, 2014)
Tutorial at SIAM 2013 Data Mining Conference on Crowdsourcing & Human Computation (May 3)
Invited Talk at ID360 Conference: Amazon's Mechanical Turk is Not Anonymous (May 1, 2013)
2012 Frontiers of Information Science and Technology (FIST) Invited Talk (December 12)
MetroCon 2012 (IEEE) Invited Talk: The Rise of Crowd Computing (October 11)
SIGIR 2012 Tutorial: Crowdsourcing for Search Evaluation and Social-Algorithmic Search (with Omar Alonso)
SBP 2012 Invited Talk for Challenge Award (April 3, slides)
IJCNLP 2011 Invited Keynote: Crowd Computing: Opportunities and Challenges (Nov 10, slides)
CrowdConf 2011 Tutorial: Crowdsourcing for Research & Engineering (with Omar Alonso, Nov 1, press, slides)
SIGIR 2011 Tutorial on Crowdsourcing for Information Retrieval: Principles, Methods, and Applications (with Omar Alonso, July 24, slides)
UT Austin School of Information Advisory Council talk: Crowdsourcing & Human Computation (April 15, 2011)
UT Austin Linguistics Colloquium: Crowdsourcing for Natural Language Processing (February 28, 2011, slides)
WSDM 2011 Tutorial: Crowdsourcing 101: Putting the WSDM of Crowds to Work for You (with Omar Alonso, February 9, slides)


Two papers presented at ACL 2017; read the story (August 3, 2017)
Using Crowds to Teach AI How to Search Smarter (August 16, 2017)

Grant for developing Arabic Websearch technologies (November 4, 2015)
Version 2.0 released of SQUARE: benchmarking software for aggregating crowd responses (October 26, 2015)
We have two papers appearing at AAAI HCOMP 2015 (June 17, 2015)
Hyun Joon Jung Receives 2015 Samsung Human-Tech Paper Award (February 11, 2015)
Amazon's Mechanical Turk is Not Anonymous: Blog 3/6/13 · Paper 3/7/13 · TheVerge 3/7/13 · Press Release 3/27/13 · Talk 5/1/13
The Future of Crowd Work: Paper 12/12/12 · Blog 2/6/13 · Press Release 2/7/13 · New Scientist 2/7/13 · New York Times 3/18/13

Quoted in

TechTarget: Predictive coding plus crowdsourcing could cure e-discovery challenges (September 24, 2015)
MIT Technology Review: Baidu’s Duer Joins the Virtual Assistant Party (September 10, 2015)
Speech Technology: Siri-ous Influence: Enterprise Virtual Assistants Have Arrived (November 10, 2012)
Austin American Statesman: Digital Savant: Sensors bringing technology ever nearer (October 23, 2012) (UT Research Alert coverage)
Science News: Digital bounty hunters unleashed (October 31, 2011)
Data Breach Examiner: Privacy Without Borders: The Ins and Outs of Outsourcing (August 2011)


IMLS LB21CL Early Career
Yahoo! Faculty Research Award
New York Community Trust (August 2, 2012)
LIFT award: announcement (August 4, 2010) and follow-on article (March 11, 2011)
Amazon sponsorship for TREC Crowdsourcing Track (August 31, 2010)
Portugal FCT (July 26, 2010) (April 23, 2010)
Amazon (March 23, 2010)



Brown Laboratory for Linguistic Information Processing (BLLIP), Department of Computer Science, Brown University
Center for Intelligent Information Retrieval (CIIR), Department of Computer Science, University of Massachusetts Amherst (lab reunion at SIGIR'09)
Spoken Language Systems Laboratory (LSV), Saarland University
Institute of Formal and Applied Linguistics (UFAL), Charles University
Intel Research Seattle at the University of Washington

My Ph.D./M.D. brother Kevin Lease makes iPhone applications for doctors, particularly for geriatrics

Information Science involves a surprising amount of flexibility

Thanks to Vance Faber I have a pseudo Erdős number of 2. Nothing more to accomplish, right?