Communicating Design Intent Using Drawing and Text

William P. McCarthy, Justin Matejka, Karl D.D. Willis, Judith E. Fan, Yewen Pu
January 2024 · Proceedings of the 16th Conference on Creativity & Cognition (C&C)
Abstract

Realizing a designer’s intent in software currently requires tedious manipulation of geometric primitives, such as points and curves. By contrast, designers routinely communicate more abstract design goals to one another using an efficient combination of natural language and drawings. What would it take to develop artificial systems that understand how humans naturally convey design intent, and thereby enable more seamless interactions between humans and machines throughout the design process? First, it is vital to establish benchmarks that showcase the full range of strategies that humans use to successfully communicate about design intent. Here we take initial steps towards that goal by conducting an online study in which pairs of human participants – a “Designer” and “Maker” – collaborated over multiple turns to recreate target designs. In each turn, Designers sent messages containing language, drawings, or both to the Maker, describing how to modify an existing design toward the target. We found a preference for communicating using drawings in early turns and observed several multimodal strategies for conveying design intent. By comparing how human Makers and GPT-4V carried out instructions, we identify a gap in human and machine understanding of multimodal instructions and suggest a path for bridging this gap.

Figures

Figure 1: Novice participants were paired in an online experiment and assigned the role of Designer or Maker. The Designer was shown a target CAD and asked to instruct the Maker how to recreate it, usi
Figure 2: A) Modality use across rounds. Participants generally sent multimodal messages, but leaned more heavily on drawings in earlier rounds; B) The number of strokes sent in instructions decreased across rounds; C) The number of characters increased across rounds.
Figure 3: Four example trials from paired Designers and Makers, with three additional responses from Solo Makers and three from GPT-4V. Solo participants followed instructions, improving the current CAD, whereas GPT-4V usually made things worse.
Figure 4: A) Average distance from target CAD in the final round of the paired study, sorted by number of elements in the target CAD; B) Distances from the paired Maker’s reconstruction to the target decrease in each round (blue), consistently across stimuli (gray); C) Solo Makers’ modifications reliably reduced distance to target, whereas GPT-4V’s made CADs more dissimilar.

BibTeX

@inproceedings{10.1145/3635636.3664261,
  author = {McCarthy, William P. and Matejka, Justin and Willis, Karl D.D. and Fan, Judith E. and Pu, Yewen},
  title = {Communicating Design Intent Using Drawing and Text},
  year = {2024},
  isbn = {9798400704857},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3635636.3664261},
  doi = {10.1145/3635636.3664261},
  abstract = {Realizing a designer’s intent in software currently requires tedious manipulation of geometric primitives, such as points and curves. By contrast, designers routinely communicate more abstract design goals to one another using an efficient combination of natural language and drawings. What would it take to develop artificial systems that understand how humans naturally convey design intent, and thereby enable more seamless interactions between humans and machines throughout the design process? First, it is vital to establish benchmarks that showcase the full range of strategies that humans use to successfully communicate about design intent. Here we take initial steps towards that goal by conducting an online study in which pairs of human participants – a “Designer” and “Maker” – collaborated over multiple turns to recreate target designs. In each turn, Designers sent messages containing language, drawings, or both to the Maker, describing how to modify an existing design toward the target. We found a preference for communicating using drawings in early turns and observed several multimodal strategies for conveying design intent. By comparing how human Makers and GPT-4V carried out instructions, we identify a gap in human and machine understanding of multimodal instructions and suggest a path for bridging this gap.},
  booktitle = {Proceedings of the 16th Conference on Creativity \& Cognition},
  pages = {512--519},
  numpages = {8},
  location = {Chicago, IL, USA},
  series = {C\&C '24},
}