The Box Plot Guide I Wish I Had When I Started Learning R

Customizing a graph to transform it into a beautiful figure in R isn’t alchemy. Nonetheless, it took me a lot of time (and frustration) to figure out how to make these plots informative and publication-quality. Rather than hoarding this information like a dragon presiding over its treasure, I want to share the code I use to plot data, explaining what each piece of it does.

First things first: let’s load our packages and generate a dummy dataset with one categorical independent variable and one continuous dependent variable. Before each code snippet, I’ll discuss the functions and arguments it uses. We also set an order for the levels of our categorical variable beforehand.

Dummy Dataset: Functions and Arguments

  • runif generates 100 random numbers drawn uniformly between 0 and 1

  • sample randomly picks the number 1 or 2, 100 times, with replacement

  • factor sets a preferred order for the levels of our grouping variable

### Load packages
# The install.packages calls only need to be run once per machine
install.packages('ggplot2')
install.packages('RColorBrewer')
install.packages('ggpubr')
library(ggplot2)
library(RColorBrewer)
library(ggpubr)

### Create dummy dataset
ndf <- data.frame(
  value = runif(100),
  factor1 = as.character(sample(1:2, 100, replace = TRUE))
)
# Fix the order of the grouping levels
ndf$factor1 <- factor(ndf$factor1, levels = c('1', '2'))
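
Because runif and sample draw new random numbers on every run, the dataset above changes each time the script is executed. A minimal sketch, assuming you want repeatable figures: fix the seed with set.seed and inspect the result before plotting. The seed value 42 and the name ndf_check are arbitrary choices for illustration and are not part of the original code.

```r
# Sketch: a reproducible copy of the dummy dataset (seed value is arbitrary)
set.seed(42)
ndf_check <- data.frame(
  value = runif(100),                                       # 100 uniform draws in [0, 1]
  factor1 = as.character(sample(1:2, 100, replace = TRUE))  # group labels "1" or "2"
)
ndf_check$factor1 <- factor(ndf_check$factor1, levels = c('1', '2'))

str(ndf_check)             # one numeric column and one factor column
levels(ndf_check$factor1)  # level order that will be used on the x axis
table(ndf_check$factor1)   # number of observations in each group
```

Checking the level order here is worthwhile because it determines the left-to-right order of the boxes later on.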

Box Plot: Functions and Arguments

  • ggplot lets us specify the independent and dependent variables, as well as the dataset to use for the graph

  • stat_boxplot lets us specify the type of whiskers to add to the plot; with geom='errorbar' it draws whiskers with horizontal caps, and width sets how wide those caps are

  • geom_boxplot draws the boxes, using the independent and dependent variables given in aes

The first basic attempt isn’t very informative or visually appealing. We start by simply plotting value against our independent variable, factor1. I also don’t like the default grey theme in ggplot.

plot_data <- ggplot(ndf, aes(y = value, x = factor1)) +
stat_boxplot( aes(y = value, x = factor1 ),
geom='errorbar', width=0.5) +
geom_boxplot(aes(y = value, x = factor1))
[Figure: basic box plot of value by factor1. Generated by the author in R.]
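
To make clearer what the box and whiskers summarize, here is a rough sketch in base R. It assumes ggplot2’s documented defaults: hinges at the 25th and 75th percentiles, and whiskers reaching to the most extreme data point within 1.5 times the interquartile range of each hinge. The variable names are illustrative only.

```r
# Sketch: computing box plot statistics by hand for one group of values
set.seed(1)       # arbitrary seed so the numbers are repeatable
x <- runif(100)

q <- unname(quantile(x, c(0.25, 0.5, 0.75)))  # lower hinge, median, upper hinge
iqr <- q[3] - q[1]                            # interquartile range (box height)

# Whiskers stop at the most extreme observations inside the 1.5 * IQR fences
lower_whisker <- min(x[x >= q[1] - 1.5 * iqr])
upper_whisker <- max(x[x <= q[3] + 1.5 * iqr])

c(lower_whisker = lower_whisker, lower_hinge = q[1], median = q[2],
  upper_hinge = q[3], upper_whisker = upper_whisker)
```

Any observation outside the whiskers would be drawn as an individual outlier point.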

Box Plot with Dots: Functions and Arguments

  • theme_bw() sets a black-and-white theme for the plot, getting rid of the pesky grey background

  • geom_dotplot: binaxis specifies which variable is binned and displayed with the dots. binwidth sets the width of each bin along that axis; points that fall into the same bin are stacked in the same row, so a larger binwidth puts more dots in each row. The other arguments control the appearance of the dots. If you want to make the dots see-through, use the alpha argument, which sets their opacity from 0 (fully transparent) to 1 (fully opaque).

Now our plot is more informative but it still needs improvement. We want to modify some of the colors, axes and labels.

plot_data <- ggplot(ndf, aes(y = value, x = factor1)) +
stat_boxplot( aes(y = value, x = factor1 ),
geom='errorbar', width = 0.5) +
geom_boxplot(aes(y = value, x = factor1)) +
theme_bw() +
geom_dotplot(binaxis='y',
stackdir = "center",
binwidth = 1/20,
dotsize = 1,
aes(fill = factor1))
[Figure: box plot with the individual data points overlaid as dots]
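
To get a feel for binwidth and alpha, it can help to draw the same data twice with different settings. This is only a sketch: the data frame demo, the seed and the two binwidth values are made up for illustration.

```r
library(ggplot2)

set.seed(42)  # arbitrary seed so the sketch is repeatable
demo <- data.frame(
  value = runif(100),
  group = factor(sample(1:2, 100, replace = TRUE))
)

# Narrow bins: few points share a bin, so the rows of dots are short
p_narrow <- ggplot(demo, aes(x = group, y = value)) +
  geom_dotplot(binaxis = 'y', stackdir = 'center',
               binwidth = 1/40, alpha = 0.5)

# Wide bins: more points share a bin, so the rows are longer and the dots larger
p_wide <- ggplot(demo, aes(x = group, y = value)) +
  geom_dotplot(binaxis = 'y', stackdir = 'center',
               binwidth = 1/10, alpha = 0.5)

print(p_narrow)
print(p_wide)
```

With alpha = 0.5 the dots are half transparent, which keeps a box plot drawn underneath visible.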

Sweating the Little Things

  • scale_fill_manual lets us set specific colors for specific values of our independent variable

  • xlab and ylab label the x and y axes, respectively. If you put the newline escape sequence (\n) in a label, the rest of the label continues on the next line!

  • ggtitle names your plot

  • theme is a very customizable part of the script. It lets us customize just about anything we want through its arguments. Since I don’t like the default grid in ggplots, I set the panel.grid arguments to element_blank, and I do the same for panel.border to get rid of the border around the plot. I color the axis lines black rather than grey. I also specify how much blank space I want around the graph with plot.margin. Since this plot is simple, we get rid of the legend for this example. With the various axis.title and axis.text arguments I set the size, color and positioning of the axis text and labels; plot.title does the same for the title, including its font face.

  • scale_y_continuous sets the y-axis breaks from 0 to 1 at 0.2 intervals. Extending the upper limit of the plot slightly beyond 1 gives the title more breathing room. If I needed a y axis with discrete variable names, I could use a different function, scale_y_discrete.

  • scale_x_discrete lets me relabel the two levels of the independent variable. This lets you quickly rename coded data. I’ve relabeled the levels as Nicolas Cage and Danny DeVito. Let’s posit that we are looking at how well Nic Cage and Danny DeVito performed at the box office, using scores scaled between 0 and 1.

  • stat_compare_means lets you run statistical tests. This one’s here for you to explore! Just type stat_compare_means( and press Tab to see the options available to you. Alternatively, type ?stat_compare_means to open the help page for this function!
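
To see what kind of number the test label reports, you can run a two-sample t-test on the same kind of data in base R. This is a sketch under an assumption: I expect stat_compare_means(method = 't.test') to report a p-value comparable to the one from t.test on the two groups, but I have not traced ggpubr’s internals here. The data frame scores and the seed are invented for illustration.

```r
# Sketch: a two-sample t-test in base R on dummy data shaped like the article's
set.seed(42)  # arbitrary seed so the sketch is repeatable
scores <- data.frame(
  value = runif(100),
  factor1 = factor(sample(1:2, 100, replace = TRUE), levels = c('1', '2'))
)

# t.test uses the Welch test (unequal variances) by default
result <- t.test(value ~ factor1, data = scores)

result$p.value   # the p-value for the difference in group means
result$estimate  # the two group means being compared
```

Since both groups are drawn from the same uniform distribution, a large p-value is the expected outcome for most seeds.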

If you’re wondering about the hjust and vjust arguments within different elements of the code, they simply let us change the horizontal and vertical position of text elements. I added some flair by changing the box plot line colors as well as the fill of our dots. If you’re building a busy plot, it’s also important to check for accessibility. You can find colorblind-friendly palettes with this tool: https://colorbrewer2.org/.
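
The same palettes are also available from R through the RColorBrewer package loaded at the start. A short sketch of how one might look them up; the choice of the five-class 'RdBu' palette is my assumption, made because the two fill colors used in this article’s code appear to match its end colors.

```r
library(RColorBrewer)

# Names of the ColorBrewer palettes flagged as colorblind-friendly
cb_safe <- rownames(brewer.pal.info[brewer.pal.info$colorblind, ])
cb_safe

# Five-class diverging red-blue palette, returned as hex codes
pal <- brewer.pal(n = 5, name = 'RdBu')
pal

# Preview all colorblind-friendly palettes in the plot window
display.brewer.all(colorblindFriendly = TRUE)
```

The hex codes in pal can be passed directly to scale_fill_manual or scale_color_manual.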

The code below builds on the previous plot. The new pieces are the color and fill scales, the axis labels and title, the theme settings, the axis scales and the statistical test.

plot_data <- ggplot(ndf,
                    aes(y = value, x = factor1, color = factor1)) +
  # Line colors for the boxes and whiskers
  scale_color_manual(values = c('1' = 'blue',
                                '2' = 'firebrick4')) +
  stat_boxplot(aes(y = value, x = factor1),
               geom = 'errorbar', width = 0.5) +
  geom_boxplot(aes(y = value, x = factor1)) +
  theme_bw() +
  geom_dotplot(binaxis = 'y',
               stackdir = "center",
               binwidth = 1/20,
               dotsize = 1,
               aes(fill = factor1)) +
  # Fill colors for the dots
  scale_fill_manual(values = c('1' = '#0571b0', '2' = '#ca0020')) +
  xlab("Independent\nVariable") + ylab("Dependent\nVariable") +
  ggtitle("Box-Dot Plot") +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.line = element_line(colour = "black"),
        panel.border = element_blank(),
        plot.margin = unit(c(0.5, 0.5, 0.5, 0.5), "cm"),
        legend.position = "none",
        axis.title.x = element_text(size = 14, color = 'black', vjust = 0.5),
        axis.title.y = element_text(size = 14, color = 'black', vjust = 0.5),
        axis.text.x = element_text(size = 12, color = 'black'),
        axis.text.y = element_text(size = 14, color = 'black'),
        plot.title = element_text(hjust = 0.5, size = 16, face = 'bold', vjust = 0.5)) +
  scale_y_continuous(expand = c(0, 0), breaks = seq(0, 1, 0.2),
                     limits = c(0, 1.1)) +
  # Relabel the coded levels; names tie each label to its level
  scale_x_discrete(labels = c('1' = 'Nicolas Cage', '2' = 'Danny DeVito')) +
  stat_compare_means(method = 't.test', hide.ns = FALSE, size = 4, vjust = 2)
[Figure: final box-dot plot with custom colors, labels, theme and t-test result]
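
Note that the code above only stores the plot in plot_data; nothing is drawn until the object is printed. A minimal sketch of displaying and saving a plot follows. It uses a small stand-in plot called demo_plot so that it runs on its own, and the file name, size and resolution are arbitrary assumptions rather than settings from the original article.

```r
library(ggplot2)

# Small stand-in plot; in the article this would be plot_data
demo_plot <- ggplot(data.frame(x = factor(c('a', 'b')), y = c(0.2, 0.8)),
                    aes(x = x, y = y)) +
  geom_point() +
  theme_bw()

print(demo_plot)  # draws the plot in the active graphics device

# Write a 300 dpi PNG to a temporary folder
out_file <- file.path(tempdir(), 'box_dot_plot.png')
ggsave(out_file, plot = demo_plot,
       width = 5, height = 5, units = 'in', dpi = 300)
```

Replacing demo_plot with plot_data and out_file with a path of your choice would save the figure built in this guide.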

In a very short span of time, we’ve gone from a simple box plot to a more visually appealing, accessible plot that gives us more information about the individual data points. You can save the code and modify it as you wish whenever you need to plot data. This makes the customization process so much easier! No more (okay, maybe a lot fewer) computer defenestrations!

If you’re looking to download the code, find it here: https://github.com/simon-sp/R_ggplot_basic_dotplot. I’ve left plenty of comments within the code to help you out in case you get stuck! Remember that typing a question mark before a function name brings up its help page in RStudio!

Translated from: https://towardsdatascience.com/the-box-plot-guide-i-wish-i-had-when-i-started-learning-r-d1e9705a6a37
