Why Pie Charts Are Problematic

Introduction

It seems as if people are split on pie charts: either you passionately hate them, or you are indifferent. In this article, I am going to explain why pie charts are problematic and, if you fall into the latter category, what you can do when creating pie charts to avoid upsetting those in the former.

Why are pie charts problematic?

They use size to convey information

A pie chart uses the size of a portion (slice) of a circle (pie) to display a numerical variable. This factor is not an issue in and of itself, as many chart types use size to convey information, including bubble charts and bar charts; however, while bubble charts and bar charts use diameter and height, respectively, to convey information, pie charts rely on the angle describing a slice, and the human eye is not very good at recognizing differences in angles.

Suppose I took a survey on people’s favorite kinds of pie. In the chart below, it is difficult to see how the categories relate to each other: Cherry and Rhubarb each seem to make up a smaller portion of the pie than either Apple or Pumpkin, but it may not be obvious (without looking at the data) which of the two is the smaller slice.

#Adjusting plot size and margins
options(repr.plot.width=8, repr.plot.height=4)
par(mfrow=c(1,1), mai = c(0.5, 0, 0.75, 0))
#Data for pie chart
x = c(18, 15, 13, 25, 29)
labels = c("Key Lime", "Cherry", "Rhubarb", "Pumpkin", "Apple")
cols = c("greenyellow", "red", "mediumvioletred", "darkorange",
         "cornsilk")
#Build the pie chart
pie(x, labels, radius = 1, col=cols)
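
To put a number on how small these angular differences are, the sketch below converts the survey counts from the chart above into slice angles. The names `votes`, `angles`, and `angle_gap` are illustrative and not part of the original code.

```r
# Survey counts from the chart above (illustrative variable names)
votes <- c("Key Lime" = 18, "Cherry" = 15, "Rhubarb" = 13,
           "Pumpkin" = 25, "Apple" = 29)

# Each slice's angle is its share of the total times 360 degrees
angles <- 360 * votes / sum(votes)
print(round(angles, 1))

# Angular difference between the two slices that are hardest to tell apart
angle_gap <- unname(angles["Cherry"] - angles["Rhubarb"])
print(round(angle_gap, 1))
```

Cherry and Rhubarb differ by two votes out of 100, which works out to roughly 7 degrees of arc; that is the difference the reader is being asked to judge by eye.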

They cannot display many categories well

This issue of conveying size via angle is even more pronounced when many categories are shown in a single pie chart. Furthermore, unlike some charts that are used to display several categories at once, such as bar charts, pie charts depend on differences in color to denote category; therefore, a large palette of colors is necessary, and without proper selection of the palette, the results could be either garish or ambiguous.

#Adjusting plot size and margins
options(repr.plot.width=8, repr.plot.height=4)
par(mfrow=c(1,1), mai = c(0.55, 0, 0.8, 0))
#Data for pie chart
x = c(2, 4, 5, 10, 13, 15, 15, 17, 19)
labels = c("Key Lime", "Pecan", "Cherry", "Blueberry", "Rhubarb",
           "Lemon Meringue", "Blackberry", "Pumpkin", "Apple")
cols = c("greenyellow", "tan4", "red", "darkblue",
         "mediumvioletred", "yellow", "black", "darkorange",
         "cornsilk2")
#Build the pie chart
pie(x, labels, radius = 1, col=cols)
Image by Author
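
One way to avoid hand-picking nine colors is to ask for a ready-made qualitative palette. The sketch below uses `hcl.colors()` from base R (available in R 3.6 and later, as far as I know) with the "Dark 3" palette; the palette choice is only an assumption for illustration, and whether it suits a given audience still needs to be checked by eye.

```r
# Nine categories need nine clearly distinguishable colors
labels <- c("Key Lime", "Pecan", "Cherry", "Blueberry", "Rhubarb",
            "Lemon Meringue", "Blackberry", "Pumpkin", "Apple")

# "Dark 3" is one of the qualitative palettes offered by hcl.colors();
# qualitative palettes mainly vary hue from one color to the next
cols <- hcl.colors(length(labels), palette = "Dark 3")
names(cols) <- labels
print(cols)

# The result can be passed to pie() through its col argument
x <- c(2, 4, 5, 10, 13, 15, 15, 17, 19)
pie(x, labels, radius = 1, col = cols)
```

A generated palette removes the garish combinations, but it does not make nine slices any easier to compare; it only addresses the color half of the problem.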

They show parts of a whole

Pie charts represent a whole as its components. Therefore, if your dataset is a subset of a larger dataset (and thus does not represent the whole) or if your dataset consists of independent categories (and thus represents multiple wholes), then a pie chart may not be appropriate.
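
A quick sanity check before drawing a pie is to confirm that the slices add up to the whole they are meant to represent. The helper `covers_whole()` below is hypothetical, and the total of 100 respondents is an assumption based on the five-flavor survey counts summing to 100.

```r
# Hypothetical helper: TRUE when the slices add up to the known total
covers_whole <- function(values, total) {
  isTRUE(all.equal(sum(values), total))
}

votes <- c(18, 15, 13, 25, 29)  # five-flavor survey counts from earlier

# All respondents are accounted for, so a pie is at least defensible
full_survey <- covers_whole(votes, total = 100)
print(full_survey)

# Only the first three flavors: a pie would overstate each flavor's share
subset_only <- covers_whole(votes[1:3], total = 100)
print(subset_only)
```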

Pie charts in popular packages

I wouldn’t want to assume anyone’s opinion on as divisive a topic as the pie chart, but perhaps the disdain for this chart type is best exhibited by the lack of built-in functions for creating them in two very popular data visualization packages: ggplot2 (R) and seaborn (Python). With both packages, a pie chart can be created only through trickery.

Trickery

It is convenient (perhaps a little too convenient) that a pie chart is no more than a single stacked bar displayed in polar coordinates. The code below builds a nine-category pie chart like the one shown above, this time using ggplot2.

library(ggplot2)
#Adjusting plot size and margins
options(repr.plot.width=8, repr.plot.height=4)
par(mfrow=c(1,1), mai = c(0.55, 0, 0.8, 0))
#Data for the pie chart
values = c(9, 2, 5, 10, 13, 15, 10, 17, 19)
labels = c("Key \nLime", "Pecan", "Cherry", "Blueberry", "Rhubarb",
           "Lemon \nMeringue", "Blackberry", "Pumpkin", "Apple")
cols = c("Key \nLime"="greenyellow", "Pecan"="tan4",
         "Cherry"="red", "Blueberry"="darkblue",
         "Rhubarb"="mediumvioletred", "Lemon \nMeringue"="yellow",
         "Blackberry"="black", "Pumpkin"="darkorange",
         "Apple"="cornsilk2")
data = data.frame(labels, values)
#Build the pie chart
ggplot(data, aes(x="", y=values, fill=labels)) +
  geom_bar(width = 1, stat = "identity") + #A single stacked bar
  scale_fill_manual(values=cols) +
  coord_polar("y", start=0) + #Use polar coordinates
  theme(axis.title=element_blank(),
        axis.text=element_blank(),
        legend.title=element_blank())
Image by Author

What chart types can be used to replace pie charts?

Bar charts

Similar to pie charts, bar charts use size to convey information; however, for bar charts, the height of a rectangle varies, and differences between the heights of bars are easier to recognize than the differences between the angles of portions of a circle. Furthermore, bar charts can be configured to show absolute numbers, percentages, or both!

#Adjusting plot size and margins
options(repr.plot.width=8, repr.plot.height=4)
par(mfrow=c(1,1), mai = c(0.5, 1, 0.2, 1))
#Data for bar chart
values = c(9, 2, 5, 10, 13, 15, 10, 17, 19)
labels = c("Key \nLime", "Pecan", "Cherry", "Blueberry", "Rhubarb",
           "Lemon \nMeringue", "Blackberry", "Pumpkin", "Apple")
data = data.frame(labels, values)
#Sort the bars from most to fewest votes
data = data[order(-data$values),]
#Build the bar chart
barplot(height=data$values,
        names.arg=data$labels,
        ylab="Votes",
        ylim = c(0, 20),
        cex.names=0.7)
Image by Author
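
As a rough sketch of the "absolute numbers, percentages, or both" point, the base R code below writes both figures above each bar. The names `pct`, `bar_labels`, and `mids` are illustrative, and the slightly taller `ylim` is an assumption made to leave room for the text.

```r
# Nine-flavor survey counts, sorted from most to fewest votes
values <- c("Key Lime" = 9, "Pecan" = 2, "Cherry" = 5, "Blueberry" = 10,
            "Rhubarb" = 13, "Lemon Meringue" = 15, "Blackberry" = 10,
            "Pumpkin" = 17, "Apple" = 19)
values <- sort(values, decreasing = TRUE)

# Build "count (percent%)" labels from the data
pct <- round(100 * values / sum(values))
bar_labels <- sprintf("%d (%d%%)", as.integer(values), as.integer(pct))

# barplot() returns the x position of each bar's midpoint
mids <- barplot(values, ylab = "Votes", ylim = c(0, 22), cex.names = 0.6)

# pos = 3 places each label just above the top of its bar
text(x = mids, y = values, labels = bar_labels, pos = 3, cex = 0.7)
```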

Waffle charts

Waffle charts, which are growing in popularity, use a count of squares rather than size to visualize a numerical dimension. The resulting graph is similar to a stacked bar or tree map; however, because each square represents one unit, a reader can count squares to confirm whether a perceived difference between categories is real, without relying on text. That check is much harder with alternatives that rely solely on size.

library(waffle)
#Adjusting plot size and margins
options(repr.plot.width=8, repr.plot.height=4)
par(mfrow=c(1,1), mai = c(0.5, 1, 0.2, 1))
#Create data
pies = c("Pecan"=2, "Cherry"=5, "Key Lime"=9, "Blueberry"=10,
         "Blackberry"=10, "Rhubarb"=13, "Lemon Meringue"=15,
         "Pumpkin"=17, "Apple"=19)
#Build the waffle chart
waffle(pies, rows=5, size=1.5,
       colors=c("tan4", "red", "greenyellow", "darkblue", "black",
                "mediumvioletred", "yellow", "darkorange",
                "cornsilk2"),
       xlab="1 square = 1 vote", legend_pos = "bottom")
Image by Author
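
The "one square per unit" idea can be illustrated without any plotting package. The sketch below expands the counts into one entry per vote and lays them out in a five-row grid; `squares` and `grid` are illustrative names, and the column-by-column fill is simply the default behavior of `matrix()`, not a claim about how the waffle package arranges its squares.

```r
pies <- c("Pecan" = 2, "Cherry" = 5, "Key Lime" = 9, "Blueberry" = 10,
          "Blackberry" = 10, "Rhubarb" = 13, "Lemon Meringue" = 15,
          "Pumpkin" = 17, "Apple" = 19)

# One entry per vote: each element corresponds to one square
squares <- rep(names(pies), times = pies)

# Arrange the squares in 5 rows, filled column by column
grid <- matrix(squares, nrow = 5)
print(dim(grid))

# Counting squares recovers the original number of votes for a flavor
apple_squares <- sum(grid == "Apple")
print(apple_squares)
```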

But what if I don’t like the alternatives?

Even though there are many alternatives (e.g., bar charts, stacked bars, waffle charts, lollipop charts, tree maps), pie charts are a familiar chart type to most people, and depending on the audience, familiarity may be an important factor that affects interpretability. So if you want to stick with pie charts, consider taking the following advice.

Limit the number of categories via grouping

To avoid visual clutter and to ensure your pie chart is readable, the number of categories should be small. Therefore, it may be useful to group categories that individually comprise a small proportion of the pie into a single category. Note that, when using this approach, it may be helpful to list the items contained in the derived category. Furthermore, it is best to ensure that the new category does not form the majority of the resulting pie.
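
The grouping advice above could be sketched as follows. The helper `group_small()` and its 10% cutoff are hypothetical choices for illustration; the right threshold depends on the data and the audience.

```r
# Nine-flavor survey counts used elsewhere in this article
pies <- c("Pecan" = 2, "Cherry" = 5, "Key Lime" = 9, "Blueberry" = 10,
          "Blackberry" = 10, "Rhubarb" = 13, "Lemon Meringue" = 15,
          "Pumpkin" = 17, "Apple" = 19)

# Hypothetical helper: lump every category whose share of the total is
# below `min_share` into one slice, and remember which ones were lumped
group_small <- function(values, min_share = 0.10, other_label = "Other") {
  small <- values / sum(values) < min_share
  if (!any(small)) return(values)
  grouped <- c(values[!small], sum(values[small]))
  names(grouped)[length(grouped)] <- other_label
  attr(grouped, "members") <- names(values)[small]
  grouped
}

grouped <- group_small(pies)
other_members <- attr(grouped, "members")

# List the lumped flavors in the label so the derived slice stays transparent
other_label <- sprintf("Other (%s)", paste(other_members, collapse = ", "))
print(other_label)

# Check that the derived slice does not dominate the pie
other_share <- grouped[["Other"]] / sum(grouped)
print(other_share)
```

Here Pecan, Cherry, and Key Lime fall below the cutoff and are merged into a single slice of 16 votes, which leaves seven slices and keeps the derived category well short of a majority.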

Show percentages or absolute numbers (or both) as text

Text can be used to prevent misunderstandings due to ambiguity. By including text information, a person can see if there are differences among the categories. However, if it is necessary to include text, then one can argue that the visualization itself is ineffective (so be prepared to defend your choice of chart type).

#Adjusting plot size and margins
options(repr.plot.width=8, repr.plot.height=4)
par(mfrow=c(1,1), mai = c(0.55, 0, 0.8, 0))
#Data for pie chart
x = c(15, 20, 35, 30)
labels = c("Other (15%)", "Cherry (20%)", "Pumpkin (35%)",
           "Apple (30%)")
cols = c("black", "red", "darkorange", "cornsilk2")
#Build the pie chart
pie(x, labels, radius = 1, col=cols)
Image by Author
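
Typing percentages into the labels by hand invites mismatches when the data change. As a small sketch, the labels can instead be derived from the values; the name `pct` is illustrative.

```r
# Same data as the chart above, with category names attached to the values
x <- c("Other" = 15, "Cherry" = 20, "Pumpkin" = 35, "Apple" = 30)

# Compute each slice's share and build "Name (NN%)" labels from it
pct <- round(100 * x / sum(x))
labels <- sprintf("%s (%d%%)", names(x), as.integer(pct))
print(labels)

cols <- c("black", "red", "darkorange", "cornsilk2")
pie(x, labels, radius = 1, col = cols)
```

Note that rounded percentages do not always add up to exactly 100, so for less tidy data it may be worth showing the absolute counts as well.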

Conclusions

I hope you found this discussion of pie charts informative. While pie charts can be avoided in most cases, they remain a pithy little chart on which many, many people have little to no opinion. However, to avoid a mass uptake of pitchforks and torches, please remember to employ pie charts responsibly and to use caution when including any controversial chart type in your next presentation.

Required libraries

library(repr)
library(dplyr)
library(plotly)
library(waffle)
library(ggplot2)
library(RColorBrewer)

Translated from: https://medium.com/analytics-vidhya/why-pie-charts-are-problematic-1075ce7d0270
