Microsoft Research Partners with UT Austin, Texas iSchool for Microsoft Ability Initiative

Sandlin, Anu  |  Mar 29, 2019

Image Caption: 
From left to right: Danna Gurari, University of Texas; Ed Cutrell, Microsoft Research; Roy Zimmermann, Microsoft Research; Meredith Ringel Morris, Microsoft Research; Ken Fleischmann, University of Texas; Neel Joshi, Microsoft Research
Microsoft Ability Initiative
Texas iSchool
Microsoft Research
Danna Gurari
Ken Fleischmann
image captioning
visual impairments

Despite significant developments in automated image captioning, current approaches are not well aligned with the needs of people with visual impairments. People who are blind or have low vision share a unique and real challenge: learning what content is present in an image without visual assistance is time-consuming and sometimes impossible. As such, these communities often seek a visual assistant to describe photos they take themselves or find online.

In an ideal world, a fully-automated computer vision (CV) approach would provide such descriptions. However, this artificial intelligence (AI) process is riddled with challenges. Not only is CV work missing images taken by this population, but people who are blind and with low vision are required to passively listen to one-size-fits-all descriptions of images to locate information of interest. In addition, CV algorithms often deliver incomplete or incorrect information. Because of these shortcomings, reliable image captioning systems continue to depend on humans to provide descriptions of photos to people with visual impairments. 

Determined to improve image captioning for blind and low vision communities, principal investigator and Texas iSchool Assistant Professor Danna Gurari and Associate Professor Ken Fleischmann believe there is a more efficient and effective solution, one that reduces human effort and produces accurate results for people who are blind or have low vision. They recently embarked on a new project to “design algorithms and systems that close the gap between CV algorithm and human performance for describing pictures taken by both sighted and visually impaired photographers.”

But the Texas School of Information professors weren’t the only ones thinking about how to improve image captioning for people who are blind or have low vision. A team of researchers at Microsoft Research recently announced a similar vision and goal: to train AI systems to provide more detailed captions that offer a richer understanding and a more accurate representation of images for people who are blind or have low vision. In light of this mission, Microsoft Research developed a new project called the Microsoft Ability Initiative.

According to Microsoft Research Principal Researcher and Research Manager Meredith Ringel Morris, “the companywide initiative aims to create a public dataset that ultimately can be used to advance the state of the art in AI systems for automated image captioning.”

After a competitive process involving a select number of universities, Microsoft Research’s search for an academic research partner for the new venture ended when it chose The University of Texas at Austin School of Information. The proposed work of Gurari and Fleischmann was the only project selected through this competition.

The Texas iSchool research team proposed two main tasks: (1) introducing the first publicly available image captioning dataset from people with visual impairments, paired with a community AI challenge and workshop, and (2) identifying the values and preferences of people with visual impairments to inform the design of next-generation image captioning systems and datasets.

“The collaboration builds upon prior Microsoft research that has identified a need for new approaches at the intersection of computer vision and accessibility,” explained Morris.

The Microsoft Research team, which includes Ed Cutrell, Roy Zimmermann, Meredith Ringel Morris, and Neel Joshi, plans to collaborate with the UT Austin School of Information over an 18-month period. Gurari and Fleischmann will lead the UT Austin team, which will also include three PhD students and one postdoctoral fellow.

The Microsoft Ability Initiative builds on the interdisciplinary team’s expertise in computer vision, human-computer interaction, accessibility, ethics, and value-sensitive design. Gurari’s team is experienced in establishing new datasets, designing human-machine partnerships, creating human-computer interaction systems, and developing accessible technology. As co-founder of the ECCV VizWiz Grand Challenge in 2018, Gurari is skilled in community-building and has a record of success in creating public datasets to advance the state of the art in AI and accessibility.

Fleischmann’s team offers complementary experience in the ethics of AI and in understanding users’ values to inform technology design. Given his expertise in the role of human values in the design and use of information technologies, Fleischmann will lead the effort focused on uncovering the needs and values of people with visual impairments, which will ultimately inform the design of future image captioning systems.

The Microsoft researchers involved in this initiative have specialized experience in accessible technologies, human-centric AI systems, and computer vision. “Our efforts are complemented by colleagues in other divisions of the company, including the AI for Accessibility program, which helps fund the initiative, and Microsoft 365 accessibility,” explained Morris.

The initiative has been dubbed “a collaborative quest to innovate in image captioning for people who are blind or with low vision.” Morris explained that “the Microsoft Ability Initiative is one of an increasing number of initiatives at Microsoft in which researchers and product developers are coming together in a new, cross-company push to spur innovative and exciting new research and development in the area of accessible technologies.”

Gurari believes that the initiative “will not only advance the state of the art of vision-to-language technology, but it will also continue the progress Microsoft has made with such tools and resources as the Seeing AI mobile phone application and the Microsoft Common Objects in Context (MS COCO) dataset. It will also serve as a great teaching opportunity for Texas iSchool students.”

The Texas iSchool team will employ a user-centered approach to the problem, including working with communities who are blind or with low vision to improve understanding of their expectations of image captioning tools. The team will also host community challenges and workshops to accelerate progress on algorithm development and facilitate the development of more accessible methods to assist people who are blind or with low vision. 

Gurari and Fleischmann explain that “this work can empower people with visual impairments to more rapidly and accurately learn about the diversity of visual information, while contributing to solving related problems including image search, visual question answering, and robotics.”

The Microsoft Research team launched the new collaboration with the Texas iSchool during a two-day visit to Austin in January. Morris noted that the Microsoft Research team came away from the meeting at The University of Texas at Austin, School of Information, “even more energized about the potential for this initiative to have real impact in the lives of millions of people around the world.” “We couldn’t be more excited,” she said.

The Texas iSchool professors share the Microsoft Research team’s excitement about their upcoming collaboration. “To be selected for this gift is a great honor,” said Gurari and Fleischmann. “We look forward to working with the Microsoft Research team over the months, and are eager to make progress with our shared goal – to better align image captioning systems with the needs of those who are blind or with low vision.”

Grant to Boost Understanding of Ethical, Political, and Legal Implications of Machine Learning

Sandlin, Anu  |  Jan 30, 2019

machine learning
artificial intelligence
Ken Fleischmann
Sherri Greenberg
Cisco Research Center

Texas School of Information Associate Professor Ken Fleischmann received a $100,000 grant from the Legal Implications for IoT, Machine Learning, and Artificial Intelligence Systems program of the Cisco Research Center for "Field Research with Policy, Legal, and Technological Experts about Transparency, Trust, and Agency in Machine Learning." The Cisco Research Center connects researchers and developers from Cisco, academia, governments, customers, and industry partners with the goal of facilitating collaboration and exploration of new and promising technologies.

The request for proposals (RFP 16-02) invited researchers to investigate legal and policy issues in the quickly developing world of machine learning (ML), artificial intelligence (AI), machine-to-machine interactions, and the rapidly expanding world of data creation, transfer, collection, and analysis from Internet of Things (IoT).

The project’s principal investigators, Dr. Fleischmann and Sherri Greenberg of the LBJ School of Public Affairs, explain that while machine learning has the potential to revolutionize society, transform how we do business, defend our homeland, and heal diseases, it also raises numerous ethical challenges that our legal and political systems are largely ill-equipped to deal with. In their proposal, they ask: “How can we ensure that ML experts are aware of the ethical, political, and legal implications of ML, and that policy experts and legal scholars are up to date in their understanding of ML and its potential societal implications?”

According to Fleischmann, the project’s goal is to “bridge the gap in expertise among technology experts and legal and policy experts.” “On one hand, this involves helping legal and policy experts to understand the limits of technology, both at present and (our best projection of what will be possible) ten years down the road, and on the other, helping technology experts to understand the legal and policy implications of their work,” said Fleischmann.

Fleischmann explains that this project can lead to insights that enhance the academic education and workplace training of technologists, as well as legal and policy scholars in future research. Not only does it have the potential “to help educate and prepare ML researchers and developers about the potential ethical, legal, and policy implications of their work, but it will also help prepare future policy makers and legislators about how to regulate and legislate to ensure safe and efficient use of ML.”

Dr. Ken Fleischmann Wins 2018 Social Informatics Best Paper Award

Sandlin, Anu  |  Nov 26, 2018

Image Caption: Professor Fleischmann presenting at the ASIS&T Conference
Texas iSchool
Ken Fleischmann
Social Informatics
Best Paper Award
Image Caption: Professor Fleischmann accepting Best Social Informatics paper award

Texas iSchool Associate Professor Ken Fleischmann recently accepted the 2018 Social Informatics Best Paper Award from the Association for Information Science and Technology (ASIS&T) Special Interest Group for Social Informatics.

Based upon work supported by the National Science Foundation, the paper, “The Societal Responsibilities of Computational Modelers: Human Values and Professional Codes of Ethics,” focuses on understanding how values shape modelers’ experiences with and attitudes toward codes of ethics. The findings reveal that individuals who place great value on equality and social justice are more likely to advocate for following a code of ethics. 

Fleischmann explains that innovations in artificial intelligence (AI) have advanced computational modeling to a point where its design can have life-or-death consequences – especially because AI-based computational models are used to predict climate change, design aircraft, and evaluate and refine medical techniques. “Thus, it is important that computational modelers are both willing and able to consider not only the technical implications, but also the societal implications of their work.”

The Social Informatics Best Paper Award recognizes the best paper published in a peer-reviewed journal on a topic informed by social informatics during the previous calendar year.

The winning paper, co-authored with Cindy Hui and William A. Wallace of Rensselaer Polytechnic Institute, was published in the Journal of the Association for Information Science and Technology in 2017.

Fleischmann presented the paper on November 10 in Vancouver, Canada, at the ASIS&T 2018 Annual Meeting, during the 14th Annual Social Informatics Research Symposium: Sociotechnical perspective on ethics and governance of emerging information technologies.

“There is no greater professional honor than for your work to be recognized by your peers,” notes Fleischmann. “I hope that this will help to further shine a spotlight on the important ethical implications of AI.”
