Regression Analysis

Regression analysis is a statistical method for determining whether, and how strongly, a variable is influenced by one or more other variables. A useful feature of regression is that several variables can influence the variable of interest at once. Regression analysis can also be used for prediction.

To get started with regression analysis, you need to understand two types of variables:

Dependent variable: the variable that you want to examine, understand or predict.

Independent variable(s): all the other variables that you hypothesize influence the dependent variable.

To start the regression analysis, first choose the dependent variable. Then choose the independent variable or variables that you hypothesize affect it.

The next step is obtaining data for the regression analysis. This is usually a dataset that contains the identified dependent and independent variables. For instance, if separate datasets are available for each of the variables, the variables of interest can be extracted and combined into a new dataset.
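Combining separate datasets into one could look roughly like the sketch below. The store IDs, the `ad_spend` and `sales` values, and the `combine` helper are all hypothetical and only illustrate the idea; real data would more likely be loaded from files and joined in a spreadsheet or a data library.

```python
# Hypothetical data: two separate datasets keyed by the same store ID.
ad_spend = {"store_a": 10.0, "store_b": 20.0, "store_c": 30.0, "store_d": 40.0}
sales = {"store_a": 130.0, "store_b": 190.0, "store_d": 310.0, "store_e": 95.0}


def combine(independent, dependent):
    """Return (key, x, y) rows for the keys present in both datasets."""
    shared_keys = sorted(independent.keys() & dependent.keys())
    return [(key, independent[key], dependent[key]) for key in shared_keys]


# Only stores that appear in both datasets end up in the combined dataset.
dataset = combine(ad_spend, sales)
for row in dataset:
    print(row)
```

Observations that are missing either variable (here `store_c` and `store_e`) are dropped, because a regression needs both values for every data point.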

A scatter plot where the points are scattered but follow a positive slope

After that, the data should be plotted. By convention, the independent variable goes on the x-axis and the dependent variable on the y-axis.

From the plot, you can observe initial trends and correlation that suggest what kind of relationship the dependent and independent variables have. In the example scatter plot, the hypothetical data points show an increasing trend: as the independent variable increases, the dependent variable increases as well.

A trend can be observed from the plot, but to what precise degree is the dependent variable influenced by the independent variable? To answer this, a regression line should be calculated. This is usually done in software such as STATA or Excel. The regression line is the best linear approximation of the data points on the plot.

In other words, explains Redman, “The red line is the best explanation of the relationship between the independent variable and dependent variable.”


Calculating the regression line

Calculating a regression line means finding the best-fit line through all the data points. For simple linear regression, the least-squares method is usually used: it chooses the line that minimizes the sum of the squared vertical distances between the data points and the line.
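As a rough sketch of what "best fit" means under least squares, assume n observed pairs (x_i, y_i); the index notation is introduced here for illustration and is not from the original. The method picks the slope m and intercept b that minimize the sum of squared differences between each observed y value and the value the line predicts:

```latex
\min_{m,\,b} \; \sum_{i=1}^{n} \bigl( y_i - (m x_i + b) \bigr)^2
```

Squaring the differences keeps positive and negative deviations from cancelling out and penalizes large deviations more heavily than small ones.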

The linear regression line is a straight line of the form y = mx + b, where m is the slope and b is the y-intercept. To find the best-fit line for your data, first compute five summary statistics:

1. Mean of the x values (denoted $\bar{x}$)

2. Mean of the y values (denoted $\bar{y}$)

3. The standard deviation of the x values (denoted $s_x$)

4. The standard deviation of the y values (denoted $s_y$)

5. The correlation between x and y (denoted r)
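The five summary statistics can be computed with a few lines of plain Python. This is a minimal sketch on made-up data; the `summary_statistics` helper and the sample values are assumptions for illustration, and it uses sample standard deviations (dividing by n - 1) together with the Pearson correlation coefficient.

```python
import math


def summary_statistics(xs, ys):
    """Return (mean_x, mean_y, s_x, s_y, r) for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (dividing by n - 1).
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Pearson correlation: sample covariance divided by s_x * s_y.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    r = cov_xy / (s_x * s_y)
    return mean_x, mean_y, s_x, s_y, r


# Hypothetical data points.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

mean_x, mean_y, s_x, s_y, r = summary_statistics(xs, ys)
print(f"mean_x={mean_x:.2f} mean_y={mean_y:.2f} s_x={s_x:.4f} s_y={s_y:.4f} r={r:.4f}")
```

Whichever convention is used for the standard deviations (n or n - 1), the same one should be used for the covariance inside the correlation, so that r stays between -1 and 1.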

The slope m of the regression line is calculated from the correlation and the two standard deviations:

$$m = r \cdot \frac{s_y}{s_x}$$

This formula gives the slope for the regression line equation of the form y = mx + b. The last part to calculate is the y-intercept b, which can be calculated using the formula below:

$$b = \bar{y} - m\,\bar{x}$$

where $\bar{x}$ and $\bar{y}$ are the means of the x values and y values respectively, and m is the already calculated slope.
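Putting the pieces together, the slope and intercept can be sketched as follows, assuming the standard summary-statistic form m = r * s_y / s_x and b = mean_y - m * mean_x. The `fit_line` helper and the data are hypothetical and serve only as a worked example.

```python
import math


def fit_line(xs, ys):
    """Fit y = m*x + b using m = r * s_y / s_x and b = mean_y - m * mean_x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    r = cov_xy / (s_x * s_y)
    m = r * s_y / s_x          # slope
    b = mean_y - m * mean_x    # y-intercept
    return m, b


# Hypothetical data points.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

m, b = fit_line(xs, ys)
print(f"y = {m:.2f}x + {b:.2f}")  # prints: y = 0.60x + 2.20

# The fitted line can then be used for prediction at a new x value.
predicted = m * 6.0 + b
```

For this data the means are 3 and 4, so the intercept is 4 - 0.6 * 3 = 2.2, and the line always passes through the point of means (3, 4).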

The regression equation that software such as Excel produces will look something like y = 6x + 70 + error_term. It differs from the simple regression line we calculated in that it includes an error term.

Regression equations include an error term because, in reality, independent variables are never perfect predictors of dependent variables.

In reality, the dependent variable might be determined by a number of different factors. The regression line is only an estimate based on the data available to you, and the larger the error term is, the less certain your regression line is.
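One way to get a feel for the error term is to look at the residuals, the differences between each observed y value and the value the fitted line predicts. The sketch below uses hypothetical data and a hypothetical `least_squares` helper; it computes the slope as Sxy / Sxx, which is algebraically the same as r * s_y / s_x.

```python
import math


def least_squares(xs, ys):
    """Return slope and intercept of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    return m, mean_y - m * mean_x


# Hypothetical data points.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

m, b = least_squares(xs, ys)

# Residual = observed y minus the y predicted by the line.
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]

# Sum of squared errors and residual standard error (n - 2 degrees of freedom).
sse = sum(e ** 2 for e in residuals)
residual_std_error = math.sqrt(sse / (len(xs) - 2))

print([round(e, 2) for e in residuals])
print(round(residual_std_error, 4))
```

Larger residuals, and therefore a larger residual standard error, mean the points sit further from the line and predictions made with it are less certain.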

Conclusion

Regression analysis helps determine the effect of some variables on another. It is widely used in business analysis to identify the factors that influence a target variable and to predict its future values.

We’ve discussed what regression analysis is and how to calculate the regression line.


Translated from: https://medium.com/swlh/regression-analysis-86e6a8bee0b7
