Research Awards/Grants (Current)

Ying Ding

National Science Foundation (NSF)

04/01/2023 to 03/31/2026

The award is $299,862 over the project period.

NSF-CSIRO: RESILIENCE: Graph Representation Learning for Fair Teaming in Crisis Response

The recent COVID-19 pandemic has revealed the fragility of humankind. In our highly connected world, infectious diseases can swiftly grow into worldwide epidemics. A plague can rewrite history, and science can limit the damage. The significance of teamwork in science has been extensively studied in the science-of-science literature, using transdisciplinary studies to analyze the mechanisms underlying broad scientific activities. How can scientific communities rapidly form teams to best respond to pandemic crises? Artificial intelligence (AI) models have been proposed to recommend scientific collaborations, especially among researchers with complementary knowledge or skills. But issues related to fairness in teaming, especially how to balance group fairness and individual fairness, remain challenging. Thus, developing fair AI models for recommending teams is critical for an equal and inclusive working environment. Such a need could be pivotal in the next pandemic crisis. This project will develop a decision support system to strengthen the US-Australia public health response to infectious disease outbreaks. The system will help to rapidly form global scientific teams with fair teaming solutions for infectious disease control, diagnosis, and treatment. The project will include participation of underrepresented groups (Indigenous Australians and Hispanic Americans) and will provide fair teaming solutions in broad working and recruiting scenarios.
This project aims to understand how scientific communities have responded to historical pandemic crises and how to best respond in the future to provide fair teaming solutions for new infectious disease crises. The project will develop a set of graph representation learning methods for fair teaming recommendation in crisis response through: 1) biomedical knowledge graph construction and learning, with novel models for emerging bio-entity extraction, relationship discovery, and fair graph representation learning for sensitive demographic attributes; 2) the recognition of fairness and the determinants of team success, with a subgraph contrastive learning-based prediction model for identifying core team units and considering trade-offs between fairness and team performance; and 3) learning to recommend fairly, with a graph-based maximum mean discrepancy measure, a meta-learning method for fair graph representation learning, and a reinforcement learning-based search method for fair teaming recommendation. The project will support cross-disciplinary curriculum development by effectively bridging gaps in responsible AI and team science, fair project management, and risk management in science.
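The abstract above mentions a graph-based maximum mean discrepancy (MMD) measure for fairness. As a minimal sketch of the underlying statistic only (not the project's actual method), MMD with an RBF kernel compares the embedding distributions of two demographic groups; a value near zero suggests the learned representations carry little group-distinguishing signal. The kernel choice and the `gamma` parameter here are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise RBF kernel values between rows of X and rows of Y.
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def mmd_squared(X, Y, gamma=1.0):
    # Squared maximum mean discrepancy between two samples of node
    # embeddings (e.g., embeddings of two demographic groups).
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2.0 * rbf_kernel(X, Y, gamma).mean())
```

In a fairness context, a term like this can be added to a training loss to penalize representations whose group-conditional distributions diverge.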

Ying Ding

National Science Foundation (NSF)

09/01/2023 to 08/31/2024

The award is $50,000 over the project period.

NSF I-Corps Project Title: CARE: Contextualization of Explainable AI for Better Health

The broader impact/commercial potential of this I-Corps project is the development of explainable Artificial Intelligence (XAI) methods for healthcare data. Currently, the number of electronic medical records is increasing, while machine learning and deep learning models, especially large language models, have been employed to address healthcare needs. However, the healthcare domain is highly regulated, and explainability for black-box AI models becomes increasingly critical for any AI application. Users need to comprehend and trust the results and output created by machine learning algorithms. The proposed XAI technology may be used to describe an AI model, its expected impact, and potential biases. Further, the proposed technology may be used to translate AI predictions into explainable medical interventions to enable the last-mile delivery of AI in healthcare. The commercial potential of these technologies may impact three major groups: health insurance companies, who may provide better care management interventions and achieve personalized care delivery based on XAI; health analytics companies, who rely on explanation to further enhance their products and meet government regulations; and medical device startups, who demand explainable analytical outputs based on the data collected from medical devices to enrich their user experience.
This I-Corps project is based on the development of explainable Artificial Intelligence (XAI) methods applied to the healthcare industry. Providing explainability is critical for AI health applications. Healthcare is a unique domain with multimodal data: tabular data about patient demographic information, textual data from medical notes, time-series data from vital-sign measurements, images from medical scans, and waveform data from EEG and ECG. To provide a holistic view of these data, deep learning is used to create universal embeddings across the different modalities and build prediction models for health risks. But deep learning methods lack transparency and demand explainability. The proposed technology combines integrated gradients with ablation studies to identify the contributing factors of different data components in the explanation. In addition, the proposed platform adds knowledge graphs into the prediction and explanation workflow to detect the relationships between contributing features and generate an explanation with a holistic view, and translates weights or feature importance into risk scores to enable the last-mile delivery of AI in healthcare. The proposed XAI method may be used to explain the importance of input data components; identify the contributing features at the individual patient level and the patient cohort level; scale and save computational resources; and self-improve by using reinforcement learning to reinforce positive feedback.
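The abstract above describes combining integrated gradients with ablation studies. As a hedged illustration of the integrated-gradients step only, the sketch below attributes the prediction of a toy logistic risk model f(x) = sigmoid(w · x) to its input features via a Riemann-sum approximation of the path integral; the model form, variable names, and step count are assumptions for illustration, not the project's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def integrated_gradients(x, baseline, w, steps=200):
    # Midpoint-rule approximation of integrated gradients for a toy
    # logistic risk model f(x) = sigmoid(w . x). Attribution for
    # feature i is (x_i - baseline_i) times the gradient of f averaged
    # along the straight path from baseline to x.
    alphas = (np.arange(steps) + 0.5) / steps
    path = baseline + alphas[:, None] * (x - baseline)   # (steps, d)
    s = sigmoid(path @ w)
    grads = (s * (1.0 - s))[:, None] * w                 # chain rule: df/dx
    return (x - baseline) * grads.mean(axis=0)
```

A standard sanity check is the completeness property: the attributions should sum approximately to f(x) - f(baseline).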

Min Kyung Lee

National Science Foundation (NSF)

09/01/2022 to 08/31/2025

The award is $249,999 over the project period.

Collaborative Research: DASS: Designing accountable software systems for worker-centered algorithmic management

Software systems have become an integral part of public and private sector management, assisting and automating critical human decisions such as selecting people and allocating resources. Emerging evidence suggests that software systems for algorithmic management can significantly undermine workforce well-being and may be poorly suited to fostering accountability to existing labor law. For example, warehouse workers are under serious physical and psychological stress due to task assignment and tracking without appropriate break times. On-demand ride drivers feel that automated evaluation is unfair and distrust the system's opaque payment calculations, which have led to worker lawsuits for wage underpayment. Shift workers suffer from unpredictable schedules that destabilize work-life balance and disrupt their ability to plan ahead. Meanwhile, there is not yet an established mechanism to regulate such software systems. For example, there is no expert consensus on how to apply concepts of fairness in software systems. Existing labor laws have not kept pace with emerging forms of work, such as algorithmic management and digital labor platforms that introduce new risks to workers, including work-schedule volatility and employer surveillance of workers both on and off the job. To tackle these challenges, we aim to develop technical approaches that can (1) make software accountable to existing law, and (2) address the gaps in existing law by measuring the negative impacts of certain software use and behavior, so as to help stakeholders better mitigate those effects. In other words, we aim to make software accountable to law and policy, and leverage it to make software users (individuals and firms) accountable to the affected population and the public.

This project is developing novel methods to enable standards- and disclosure-based regulation in and through software systems, drawing on formal methods, human-computer interaction, sociology, public policy, and law throughout the software development cycle. The work will focus on algorithmic work scheduling, which impacts shift workers who make up 25% of workers in the United States. It will take a participatory approach to software design and evaluation, involving stakeholders, public policy and legal experts, governments, commercial software companies, as well as software users in firms and those affected by the software's use. The research will take place in three thrusts in the context of algorithmic scheduling: (1) participatory formalization of regulatory software requirements, (2) scalable and interactive formal methods and automated reasoning for software guarantees and decision support, and (3) regulatory outcome evaluation and monitoring. By developing accountable scheduling software, the project has the potential for significant broader impacts by giving businesses the tools they need for compliance with and accountability to existing work scheduling regulations, as well as the capacity to provide more schedule stability and predictability in their business operations.

Ahmer Arif

National Science Foundation (NSF)

10/01/2022 to 09/30/2024

The collaborative award is $5,000,000 over the project period. The School of Information portion of the award is $1,368,142.

NSF Convergence Accelerator Track F: Co-designing for Trust: Reimagining Online Information Literacies with Underserved Communities

In 2011, the National Science Foundation began requiring that all funded projects provide data management plans (DMPs) to ensure that project data, computer codes, and methodological procedures were available to other scientists for future use. However, the extent to which these data management requirements have resulted in more and better use of project data remains an open question. This project thus investigates the National Science Foundation's DMP mandate as a national science policy and examines the broad impacts of this policy across a strategic sample of five disciplines funded by the National Science Foundation. It considers the organization and structure of DMPs across fields, the institutions involved in data sharing, data preservation practices, the extent to which DMPs enable others to use secondary project data, and the kinds of data governance and preservation practices that ensure that data are sustained and accessible. Systematic investigation of the impact of DMPs and data sharing cultures across fields will assist funding agencies and research scientists working to produce reproducible and open science by identifying barriers to data archiving, sharing, and access. The principal investigators will use project findings to develop data governance guidelines for information professionals working with scientific data and to articulate best practices for scientific communities using DMPs for data management.

This project aims to enhance understanding of the role data management plans (DMPs) play in shaping data life-cycles. It does so by examining DMPs across five fields funded by the National Science Foundation to understand data practices, archiving and access issues, the infrastructures that support data sharing and reuse, and the extent to which project data are later used by other researchers. In phase I, the investigators will gather a strategic sample of DMPs representing a wide range of data types and data retention practices from different scientific fields. Phase II consists of forensic data analysis of a subset of DMPs to discover what has become of project data. Phase III develops detailed case studies of research project data life-cycles and data afterlives with qualitative interviews and archival documentary analysis to help develop best practices for sustainable data preservation, access, and sharing. Phase IV will translate findings into data governance recommendations for stakeholders. The project thus contributes to research about contemporary studies of scientific data production and circulation while assessing the effect of DMPs as a national science policy initiative affecting data management practices in different scientific communities. The comparative research design and mixed methods enable theory building about cross-disciplinary data practices and data cultures across fields and advance knowledge within data studies, information management studies, and science and technology studies.

Matthew Lease

Jessy Li

Cisco Systems Inc.

06/01/2022 to 08/31/2025

The award is $199,458 over the project period.

Classifying Text with Intuitive and Faithful Model Explanations

The objective of this Research Project is to develop an advanced neural NLP modeling framework for interpretable and accurate text classification. Intuitively, when human users better understand model predictions (via model interpretability), the users can better use model predictions to augment their own human reasoning and decision-making. More generally, effective model explanations offer a variety of other potential benefits, such as promoting trust, adoption, auditing, and documentation of model decisions. Our modeling framework, ProtoType-based Explanations for Natural Language (ProtoTexNL), seeks to provide faithful explanations for model predictions in relation to training examples and features of the input text.