Spearman's Correlation and Its Implication in Machine Learning

This article is about Spearman's correlation and its implication in machine learning. In my previous article, I discussed Pearson's correlation coefficient, and we wrote code to show how useful it is. You may be wondering why we need Spearman's correlation when we already have Pearson's correlation to measure the relationship between the feature values and the target values. The answer is that Pearson's correlation measures only linear relationships, whereas Spearman's correlation also works for non-linear relationships, as long as they are monotonic (one variable consistently increases or consistently decreases as the other increases).
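To make the difference concrete, here is a minimal sketch on made-up data (the series x and y below are illustrative and are not part of the article's dataset). y always rises with x, but along a curve rather than a straight line, so Spearman's coefficient should come out as 1 while Pearson's coefficient stays below 1.

```python
import pandas as pd

# Hypothetical toy data: y grows monotonically with x, but not linearly.
x = pd.Series(range(1, 11), dtype=float)
y = x ** 3

# Series.corr supports both methods through the `method` argument.
pearson = x.corr(y, method='pearson')
spearman = x.corr(y, method='spearman')

print('Pearson :', round(pearson, 3))
print('Spearman:', round(spearman, 3))
```

Spearman's coefficient only looks at the ordering of the values, so any strictly increasing transformation of y would give the same result.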

Another advantage of Spearman's correlation is that it is computed from ranks rather than from the raw values, so it is well suited to continuous as well as discrete datasets.
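The role of ranks can be sketched as follows, assuming some made-up discrete data with repeated values (the names feature and target are illustrative). Spearman's coefficient is simply Pearson's coefficient computed on the ranks, and pandas gives tied values the average of the ranks they would occupy.

```python
import pandas as pd

# Hypothetical discrete data with repeated (tied) values.
feature = pd.Series([1, 2, 2, 3, 4, 4, 5, 5])
target = pd.Series([10, 20, 20, 20, 30, 40, 40, 50])

# rank() assigns tied values the average of the ranks they span.
feature_ranks = feature.rank()
target_ranks = target.rank()
print(feature_ranks.tolist())

# Pearson's coefficient on the ranks ...
by_ranks = feature_ranks.corr(target_ranks, method='pearson')
# ... matches Spearman's coefficient on the raw values.
direct = feature.corr(target, method='spearman')

print(round(by_ranks, 4), round(direct, 4))
```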

Image source: https://digensia.files.wordpress.com/2012/04/s1.png

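For data without tied ranks, Spearman's coefficient is rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)). Here, d_i is the difference between the two ranks of the i-th observation, d_i = rank(X_i) - rank(Y_i), where X = feature values, Y = target values and n is the number of observations. Note that d_i is a difference of ranks, not of the raw values.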
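As a rough check of the rank-difference formula, the sketch below applies it by hand to a few made-up values (x and y are illustrative, not taken from headbrain4.csv) and compares the result with pandas. The values contain no ties, which is the case the shortcut formula is meant for.

```python
import pandas as pd

# Hypothetical toy data with no repeated values.
x = pd.Series([3100, 3500, 2900, 4000, 3700, 3300])
y = pd.Series([1120, 1300, 1280, 1400, 1350, 1250])

# d_i is the difference between the rank of x_i and the rank of y_i.
d = x.rank() - y.rank()
n = len(x)

# rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))
rho = 1 - 6 * (d ** 2).sum() / (n * (n ** 2 - 1))

print('by formula:', round(rho, 4))
print('by pandas :', round(x.corr(y, method='spearman'), 4))
```

When the data contain ties, the shortcut formula is no longer exact, and computing Pearson's coefficient on the ranks (which is what the pandas call does) is the safer route.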

The dataset used can be downloaded from here: headbrain4.CSV

We have used a continuous dataset here, the same one used for Pearson's correlation, so you will not observe much of a difference between the Pearson and Spearman coefficients. To see a larger difference, download any discrete dataset and run the same code on it.

So now, let us see how to use Spearman's correlation in a machine learning program written in Python:

# -*- coding: utf-8 -*-
"""
Created on Sun Jul 29 22:21:12 2018
@author: Raunak Goswami
"""
import pandas as pd
import matplotlib.pyplot as plt

# reading the data
# this script and headbrain4.csv must be stored in the same
# folder or directory
data = pd.read_csv('headbrain4.csv')
# this will show the first five records of the whole data
print(data.head())
# w holds the feature values for Gender
w = data.iloc[:, 0:1].values
# y holds the feature values for Age Range
y = data.iloc[:, 1:2].values
# x holds the feature values for head size
x = data.iloc[:, 2:3].values
# z holds the target values, i.e. brain weight
z = data.iloc[:, 3:4].values

# round to 2 decimal places: round() without a second argument
# returns the nearest integer and would hide the actual coefficient
print(round(data['Gender'].corr(data['Brain Weight(grams)'], method='spearman'), 2))
plt.scatter(w, z, c='red')
plt.title('scatter plot for Spearman correlation between gender and brain weight')
plt.xlabel('Gender')
plt.ylabel('brain weight')
plt.show()

print(round(data['Age Range'].corr(data['Brain Weight(grams)'], method='spearman'), 2))
# plot y (age range) here, not x (head size)
plt.scatter(y, z, c='red')
plt.title('scatter plot for Spearman correlation between age range and brain weight')
plt.xlabel('age range')
plt.ylabel('brain weight')
plt.show()

print(round(data['Head Size(cm^3)'].corr(data['Brain Weight(grams)'], method='spearman'), 2))
plt.scatter(x, z, c='red')
plt.title('scatter plot for Spearman correlation between head size and brain weight')
plt.xlabel('head size')
plt.ylabel('brain weight')
plt.show()

data.info()
# Pearson's coefficient (the default method), printed for comparison
print(data['Head Size(cm^3)'].corr(data['Brain Weight(grams)']))
# Spearman's coefficient for every pair of columns
k1 = data.corr(method='spearman')
print("The table for all possible values of Spearman's coefficients is as follows")
print(k1)

After you run the code in the Spyder IDE that ships with the Anaconda distribution, go to the Variable Explorer, find the variable named k1 and double-click it to see its values. You'll see something like this:

k1 dataframe

Here, 1 signifies a perfect positive correlation, 0 signifies no correlation and -1 signifies a perfect negative correlation.
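These three reference values can be reproduced on small made-up series (all names and numbers below are illustrative): one that always rises with x, one that always falls, and one that rises and then falls symmetrically, so that it has no overall monotonic trend.

```python
import pandas as pd

x = pd.Series([1, 2, 3, 4, 5, 6])

increasing = x ** 2                            # always rises with x
decreasing = 100 - x ** 2                      # always falls as x rises
up_then_down = pd.Series([1, 2, 3, 3, 2, 1])   # rises, then falls

rho_up = x.corr(increasing, method='spearman')
rho_down = x.corr(decreasing, method='spearman')
rho_none = x.corr(up_then_down, method='spearman')

print(round(rho_up, 2), round(rho_down, 2), round(rho_none, 2))
```

Values between these extremes indicate a weaker monotonic tendency in the corresponding direction.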

If you look carefully, you will see that the correlation between brain weight and head size is positive and comes out as 1 when rounded to the nearest integer, which means that brain weight tends to increase as head size increases. If you remember, we got a similar value with Pearson's correlation.
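In a machine learning workflow, a table like k1 is often used to see which features move most closely with the target. The sketch below shows one way to do that. It assumes a small made-up DataFrame that only borrows the column names of headbrain4.csv; the numbers are invented for illustration and are not the real data.

```python
import pandas as pd

# Hypothetical stand-in for headbrain4.csv: same column names, made-up values.
data = pd.DataFrame({
    'Gender': [1, 1, 2, 2, 1, 2, 1, 2],
    'Age Range': [1, 2, 1, 2, 2, 1, 1, 2],
    'Head Size(cm^3)': [4512, 3738, 3300, 3204, 4100, 3450, 3900, 3000],
    'Brain Weight(grams)': [1530, 1297, 1250, 1160, 1400, 1300, 1380, 1100],
})

target = 'Brain Weight(grams)'
k1 = data.corr(method='spearman')

# Take the target's column, drop its correlation with itself, and sort
# the features by the absolute value of their coefficient, strongest first.
ranking = k1[target].drop(target).abs().sort_values(ascending=False)
print(ranking)
```

Sorting by absolute value treats strong negative and strong positive correlations alike, since both can be informative for a model.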

Now go to the IPython console, where you will see some self-explanatory scatter plots. If you have any trouble understanding them, have a look at my previous article about Pearson's correlation and its implication in machine learning.

That's all for today, guys. I hope you liked it. If you have any queries, just drop a comment below and I will be happy to help you.

Original article: https://www.includehelp.com/ml-ai/spearmans-correlation-and-its-implication-in-machine-learning.aspx
