Improving transparency and appropriate trust of an agentic AI

Goal: Understand what kinds of information users need to correctly evaluate the accuracy of an agentic AI system's responses.

Methods: Task-based think-aloud user study with 24 participants interacting with an agentic AI chatbot and semi-structured interviews.

Findings: Users often place too much trust in an AI system and are easily misled by the sheer amount of information available (such as a list of sources). Many users do want to know about the agentic AI's capabilities, limitations, and decision-making processes.

Impact: Influenced the early design of the end-user interface for BeeAI, an open-source platform to build agentic AI systems which won a Fast Company Innovation by Design Honorable Mention.

Discovering current and future generative AI needs of business users

Goal: Learn how knowledge workers in an enterprise context use LLMs, how they would like to use them in the future, challenges in using them, and reasons for lack of use.

Methods: Survey of 216 knowledge workers and follow-up survey with 107 participants.

Findings: We found four key types of use: creating content, finding information, getting advice, and automating tasks. Users initially focused on creating content, such as drafts of documents or emails. The second survey revealed that use of LLMs was increasing for information-focused tasks like searching and learning. Workers still hoped for more AI support in getting feedback on their work, performing analysis, and automating tasks. Workers also wanted to use AI for more complex tasks that require domain knowledge and context, but often did not feel that existing systems were capable in those areas (as of the time the surveys were run).

Impact: Our findings were presented and shared broadly within IBM, informing product and innovation teams and strategy. This work inspired future projects, such as an AI tool for agile epic evaluation, aimed at improving product management workflows. We published this work at ACM CSCW and ACM CHI. It was also featured in Undark Magazine.

Supporting Product Management Workflows with AI

Goal: Understand the challenges and needs of product managers during agile epic creation and when using AI to support agile epic evaluation.

Methods: User study and interviews with 17 product managers.

Findings: Participants found value in an AI system that helps them evaluate and prioritize agile epics. However, such a system should provide flexibility to support diverse practices, and it could help organizations be more consistent in their agile practices. Lack of domain knowledge and context also remains a challenge for AI systems in this space.

Impact: The findings and tool were distributed broadly internally. We also published a case study at CHIWORK, which won Best Paper.

Enabling end-users to generate automation flows using transparency and explanation

Goal: Understand how explanations can help end-users create correct automation flows using a natural language to automation system.

Methods: Between-subjects user study on Amazon Mechanical Turk (252 participants) in which users created automation flows with one of several explanation variants or with no explanations.

Findings: Suggesting terms to add to an utterance, based on other users' inputs, helped users repair and generate correct flows more effectively than system-focused explanations did.

Impact: Findings were integrated into the design of IBM's AppConnect AI-powered natural language to automation product.