Dynamic Opacity Optimization for Scatter Plots

Justin Matejka, Fraser Anderson, George Fitzmaurice

January 2015 · Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI)

Abstract

Scatterplots are an effective and commonly used technique to show the relationship between two variables. However, as the number of data points increases, the chart suffers from "over-plotting" which obscures data points and makes the underlying distribution of the data difficult to discern. Reducing the opacity of the data points is an effective way to address over-plotting, however, setting the individual point opacity is a manual task performed by the chart designer. We present a user-driven model of opacity scaling for scatter plots built from crowd-sourced responses to opacity scaling tasks using several synthetic data distributions, and then test our model on a collection of real-world data sets.

Figures

Figure 1. Scatter plots for two data sets (left side and right side) with varying numbers of data points rendered. The top row shows the appearance with an individual point opacity of 100%, while the second and third rows show the crowd-sourced results for the opacity scaling task and the results of our technique respectively.

Figure 2. The three distribution types used in the first study, with representative samples from the number of point range.

Figure 3. Point opacity values from the first study.

Figure 4. Mean opacity of the utilized chart pixels from the charts produced by the users in Study One.

Figure 5. Graph of the algorithmic model results overlaid on the user-generated results.

Figure 6. Distributions of data used for the validation study.

Figure 7. Grid/Dot configurations used for Study Two.

Figure 8. Results of the second study.

BibTeX

@inproceedings{10.1145/2702123.2702585,
author = {Matejka, Justin and Anderson, Fraser and Fitzmaurice, George},
title = {Dynamic Opacity Optimization for Scatter Plots},
year = {2015},
isbn = {9781450331456},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/2702123.2702585},
doi = {10.1145/2702123.2702585},
abstract = {Scatterplots are an effective and commonly used technique to show the relationship between two variables. However, as the number of data points increases, the chart suffers from "over-plotting" which obscures data points and makes the underlying distribution of the data difficult to discern. Reducing the opacity of the data points is an effective way to address over-plotting, however, setting the individual point opacity is a manual task performed by the chart designer. We present a user-driven model of opacity scaling for scatter plots built from crowd-sourced responses to opacity scaling tasks using several synthetic data distributions, and then test our model on a collection of real-world data sets.},
booktitle = {Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems},
pages = {2707--2710},
numpages = {4},
keywords = {visualization, scatter plots, overplotting, opacity},
location = {Seoul, Republic of Korea},
series = {CHI '15}
}