
To Use or Not to Use: Impatience and Overreliance When Using Generative AI Productivity Support Tools

Han Qiao, Jo Vermeulen, George Fitzmaurice, Justin Matejka
January 2025 · Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI)

Abstract

Generative AI has the potential to assist people with completing various tasks, but increased productivity is not guaranteed due to challenges such as uncertainty in output quality and unclear processing time. Through an online crowdsourced experiment (N=508), leveraging a “paint by numbers” task to simulate properties of GenAI assistance, we explore how, and how well, users make decisions on whether to use or not use automation to maximize their productivity given varying waiting times and output quality. We observed gaps between users’ actual choices and their optimal choices and characterized these gaps as the “gulf of impatience” and the “gulf of overreliance”. We also distilled strategies that participants adopted when making their decisions. We discuss design considerations in supporting users to make more informed decisions when interacting with GenAI tools and in making these tools more useful for improving users’ task performance, productivity and satisfaction.

Figures

Figure 1: An example of an empty grid (left) and the final image presented to users (right) in a paint by numbers task.
Figure 2: The experiment landing page instructing users how they can complete the paint by numbers game. There are two options for completing the task. One is Manual Fill mode, in which users fill in all colors manually through keypresses. The other is Assisted Fill, in which users wait for the simulated GenAI to fill in the colors, and then they manually fix any mistakes.
Figure 3: Illustration of the workflow for using Manual Fill (top) and using Assisted Fill (bottom). Assisted Fill simulates GenAI support tools through various latency and error rates that require users to fix any mistakes before completing the task.
Figure 4: Workflow of the experiment and an example participant’s performance. Participants went through six steps in total: (1) an introduction and two practice rounds to get to know Manual Fill and Assisted Fill; (2) the first full grid task, in which participants are assigned to either Manual Fill or Assisted Fill (in this example, the participant was first assigned to Manual Fill); (3) the second full grid task, in which participants are assigned to the mode of interaction they did not use in the first task; (4) Pre-Task 3 questions, in which participants answer questions that determine whether they use their assigned Assisted Fill tool or fill in colors manually for Task 3; (5) Task 3, in which participants complete a third paint by numbers game using the mode of interaction determined by their answer to Question 1 in Pre-Task 3; (6) the Post-Task 3 survey.
Figure 5: Participants’ answers to Question 1 and Question 2. Each answer to Question 1 generates 15 data points, and each answer to Question 2 generates 16 data points.
Figure 6: Four plots illustrating the main patterns of chosen versus optimal percentages of using Assisted Fill given various error rate conditions. The y-axis in each chart represents the percentage of participants choosing Assisted Fill, with y = 1 meaning everyone chose Assisted Fill and y = 0 indicating no one chose Assisted Fill. The x-axis represents the different latency conditions. The charts represent these variables for increasing error rates: 0% (top left), 20% (top right), 50% (bottom left), and 75% (bottom right). Sigmoid curves are fitted to the raw data to illustrate trends. Shaded areas around the red and blue curves represent 1-sigma bootstrap confidence intervals [8, 40].
Figure 7: A heat map generated with 16 pairs of sigmoid curves (Figure 6) across 16 error rate conditions, illustrating the differences between the percentage of participants’ actual choices of Assisted Fill and the percentage of participants’ optimal choices of Assisted Fill. White areas indicate close to optimal choices, blue areas show overreliance on Assisted Fill, and red areas show underreliance on Assisted Fill. The data used here were collected from participants’ answers to Question 1 before working on Task 3: Given X error rate, what is the longest time a participant is willing to wait for the Assisted Fill tool?
Figure 8: Four plots illustrating key patterns in chosen versus optimal percentages of using Assisted Fill given various latency conditions. The y-axis in each chart represents the percentage of participants choosing Assisted Fill, with y = 1 meaning everyone chose Assisted Fill and y = 0 indicating no one chose Assisted Fill. The x-axis represents the different error rate conditions. The four horizontal charts represent these variables for increasing latency: 0s (top left), 30s (top right), 105s (bottom left), and 210s (bottom right). Sigmoid curves are fitted to the raw data to illustrate trends. Shaded areas around the red and blue curves represent 1-sigma bootstrap confidence intervals [8, 40].
Figure 9: A heat map generated with 15 pairs of sigmoid curves (Figure 8) across 15 latency conditions, illustrating the differences between the percentage of participants’ actual choices of Assisted Fill and the percentage of participants’ optimal choices of Assisted Fill. White areas indicate close to optimal choices, blue areas show overreliance on Assisted Fill, and red areas show underreliance on Assisted Fill. The data used here were collected from participants’ answers to Question 2 before working on Task 3: Given X latency, what is the largest error rate a participant is willing to tolerate for Assisted Fill?
Figure 10: A heat map based on the gap between participants’ answers for Question 1 and Question 2 and their optimal choices. We obtained this continuous heat map by applying bicubic smoothing to the discrete heat map on the left. White areas indicate close to optimal choices, blue areas show overreliance on Assisted Fill, and red areas show underreliance on Assisted Fill.
Figure 11: Strip plots illustrating the distribution of differences between chosen and optimal latency (top) and error rate (bottom) by adoption of strategy. The orange strip plots illustrate the distribution of participants who leveraged a certain strategy, and the green strip plots illustrate the distribution of participants who did not leverage that strategy. The dashed lines represent the medians of each group. There is not much difference within each pair.
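
The captions for Figures 5 to 7 describe turning each threshold answer into one data point per condition and then comparing the share of participants who chose Assisted Fill with the share for whom it was optimal. The following is a minimal Python sketch of that bookkeeping, not the authors' analysis code. The latency grid (0 s to 210 s in 15 s steps), the function names, and the sample answers are all assumptions made for illustration; the optimal shares are treated as given inputs because this page does not say how they were computed.

```python
# Assumed grid: 15 latency conditions from 0 s to 210 s in 15 s steps.
# The page only confirms that there are 15 conditions and that they
# include 0 s, 30 s, 105 s and 210 s; the even spacing is a guess.
LATENCIES_S = list(range(0, 211, 15))


def expand_answer(max_wait_s):
    """Expand one 'longest wait I would accept' answer into one binary
    data point per latency condition (1 = would choose Assisted Fill)."""
    return [1 if latency <= max_wait_s else 0 for latency in LATENCIES_S]


def share_choosing_assisted(answers):
    """Fraction of participants choosing Assisted Fill at each latency."""
    rows = [expand_answer(answer) for answer in answers]
    return [sum(column) / len(rows) for column in zip(*rows)]


def reliance_gap(chosen_share, optimal_share):
    """Chosen minus optimal share per condition.
    Positive values correspond to overreliance (blue in the heat maps),
    negative values to underreliance (red)."""
    return [c - o for c, o in zip(chosen_share, optimal_share)]


# Hypothetical answers from four participants for one error rate condition.
chosen = share_choosing_assisted([0, 30, 30, 210])
print(chosen[:4])
```

With these made-up answers, every participant accepts a 0 s wait, three of four accept up to 30 s, and only one accepts anything longer. Passing `chosen` and a matching list of optimal shares to `reliance_gap` gives one row of a heat map like Figure 7, before any sigmoid fitting or smoothing.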

BibTeX

@inproceedings{10.1145/3706598.3714103,
author = {Qiao, Han and Vermeulen, Jo and Fitzmaurice, George and Matejka, Justin},
title = {To Use or Not to Use: Impatience and Overreliance When Using Generative AI Productivity Support Tools},
year = {2025},
isbn = {9798400713941},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3706598.3714103},
doi = {10.1145/3706598.3714103},
abstract = {Generative AI has the potential to assist people with completing various tasks, but increased productivity is not guaranteed due to challenges such as uncertainty in output quality and unclear processing time. Through an online crowdsourced experiment (N=508), leveraging a “paint by numbers” task to simulate properties of GenAI assistance, we explore how, and how well, users make decisions on whether to use or not use automation to maximize their productivity given varying waiting times and output quality. We observed gaps between users’ actual choices and their optimal choices and characterized these gaps as the “gulf of impatience” and the “gulf of overreliance”. We also distilled strategies that participants adopted when making their decisions. We discuss design considerations in supporting users to make more informed decisions when interacting with GenAI tools and in making these tools more useful for improving users’ task performance, productivity and satisfaction.},
booktitle = {Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems},
articleno = {1122},
numpages = {18},
keywords = {generative AI, decision-making, productivity, reliance, AI, automation, controlled experiment},
location = {},
series = {CHI '25}
}