Predicting the Future: Learn to Forecast with ARIMA Models

XTS Objects

If you’re not using XTS objects for your forecasting in R, you are likely missing out! The major benefit we’ll explore throughout is that these objects are a lot easier to work with when it comes to modeling, forecasting, and visualization.

Let’s Get to the Details

An XTS object has two components: a date index and a traditional data matrix.

Whether you want to predict churn, sales, demand, or anything else, let’s get to it!

The first thing you’ll need to do is create your date index. We do so with the seq function. It takes your start date, the number of records you have (length), and the time interval (by). For our dataset, that looks like the following.

days <- seq(as.Date("2014-01-01"), length = 668, by = "day")
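As a quick sanity check, a sketch like the following shows what the index covers. It uses only base R; the helper names first_day and last_day are made up for illustration, and length.out is the full name of the argument that length partially matches.

```r
# Build a daily date index: 668 consecutive days starting 2014-01-01
days <- seq(as.Date("2014-01-01"), length.out = 668, by = "day")

# Inspect the first and last dates and confirm the number of records
first_day <- days[1]
last_day <- days[length(days)]

print(first_day)     # 2014-01-01
print(last_day)      # 2015-10-30
print(length(days))  # 668
```

Knowing the last date up front is handy later, when the data is split by time and the forecast needs its own date index.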

Now that we have our index, we can use it to create our XTS object with the xts function.

Don’t forget to run install.packages('xts') first and then load the library with library(xts)!

Once that’s done, we call xts, passing our data matrix as the first argument and our date index to the order.by option.

sales_xts <- xts(sales, order.by = days)
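The post does not show where sales comes from, so here is a minimal, self-contained sketch that uses simulated values as a stand-in (the numbers and the "sales" column name are assumptions, not the original data). It assumes the xts package is installed and shows the two components of the object plus date-based subsetting.

```r
library(xts)

# Hypothetical stand-in for the post's sales data: 668 simulated daily values
set.seed(42)
sales <- matrix(round(100 + rnorm(668, sd = 10)), ncol = 1,
                dimnames = list(NULL, "sales"))

days <- seq(as.Date("2014-01-01"), length.out = 668, by = "day")
sales_xts <- xts(sales, order.by = days)

# The two components: the date index and the underlying data matrix
print(head(index(sales_xts), 3))
print(head(coredata(sales_xts), 3))

# Date-based subsetting, one of the conveniences of xts
march_2014 <- sales_xts["2014-03"]
print(nrow(march_2014))
```

Being able to slice by a string such as "2014-03" is a large part of why these objects are pleasant to model and plot with.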

Let’s Create a Forecast with ARIMA

ARIMA stands for autoregressive integrated moving average, a very popular technique for time series forecasting. We could spend hours talking about ARIMA alone, but for this post we’re going to give a high-level explanation and then jump directly into the application.

AR: Autoregressive

This is where we predict outcomes using lags, that is, values from previous periods (previous days in our daily dataset). It may be that the outcome for a given period has some dependency on previous values.
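To make the idea of lags concrete, here is a small sketch in base R. It simulates a series in which each value depends on the previous one with a coefficient of 0.7 (a made-up number), then fits a model with one autoregressive term and checks that the estimate lands close to that value.

```r
set.seed(1)

# Simulate 500 observations from an AR(1) process with coefficient 0.7 (hypothetical)
y <- arima.sim(model = list(ar = 0.7), n = 500)

# Fit an AR(1) model: order = c(p, d, q) = c(1, 0, 0)
fit_ar <- arima(y, order = c(1, 0, 0))

# The estimated lag-1 coefficient should land near the simulated 0.7
ar_coef <- unname(coef(fit_ar)["ar1"])
print(ar_coef)
```

The first entry of order, p, is the number of lagged values the model is allowed to use.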

I: Integrated

When it comes to time series forecasting, an implicit assumption is that our model depends on time in some capacity. This seems pretty obvious, as we probably wouldn’t make our model time-based otherwise ;). With that assumption out of the way, we need to understand how strongly our model depends on time. Core to this is the idea of stationarity: a series is stationary when its statistical properties, such as its mean and variance, do not change over time.

Going deeper, for a stationary series the historical average tends to be a good predictor of future outcomes, but there are certainly times when that’s not true. Can you think of any situations where the historical mean would not be the best predictor?

  • How about predicting sales for December? Seasonal trends
  • How about sales for a hyper-growth SaaS company? Consistent upward trends

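This is where differencing comes in! Differencing replaces each value with its change from an earlier value, which removes the effects of trend and seasonality so that the remaining series is closer to stationary. The “integrated” part of ARIMA refers to the number of times the series is differenced.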
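A short sketch of differencing with base R’s diff function, using two made-up series: one with a steady upward trend and one with a repeating weekly pattern.

```r
# Hypothetical series with a steady upward trend: 10, 13, 16, ...
trend <- 10 + 3 * (0:9)

# First difference: each value minus the previous one
d1 <- diff(trend)
print(d1)  # 3 3 3 3 3 3 3 3 3

# Seasonal differencing: subtract the value from one season earlier
# (here a made-up weekly pattern repeated over three weeks)
weekly <- rep(c(5, 6, 7, 8, 9, 20, 25), times = 3)
d7 <- diff(weekly, lag = 7)
print(d7)  # all zeros: the repeating pattern is removed
```

After differencing, the trend becomes a constant and the weekly pattern disappears, which is the kind of series whose historical mean is a sensible predictor.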

MA: Moving Average

The moving average component deals with the error of your model: it predicts the current value using the forecast errors from previous periods.
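As a rough illustration in base R, the sketch below simulates a series in which each value carries over 0.6 of the previous period’s random error (a made-up coefficient) and then fits a model with one moving average term.

```r
set.seed(2)

# Simulate 500 observations from an MA(1) process with coefficient 0.6 (hypothetical)
y <- arima.sim(model = list(ma = 0.6), n = 500)

# Fit an MA(1) model: order = c(p, d, q) = c(0, 0, 1)
fit_ma <- arima(y, order = c(0, 0, 1))

# The estimated coefficient on the previous error should land near 0.6
ma_coef <- unname(coef(fit_ma)["ma1"])
print(ma_coef)
```

The last entry of order, q, is the number of past errors the model is allowed to use.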

Let’s Get Modeling!

Train/Validation Split

First things first, let’s break out our data into a training dataset and what we’ll call our validation dataset.

What makes this different from other validation approaches, like cross-validation, is that here we split by time: the training set holds everything up to a given point in time, and the validation set holds everything after it.

train <- sales_xts[index(sales_xts) <= "2015-07-01"] 
validation <- sales_xts[index(sales_xts) > "2015-07-01"]
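It is worth checking how many observations land on each side of the split, because the forecast horizon used later should match the size of the validation set. A sketch with simulated data standing in for the real sales (the values and the split_date name are assumptions), assuming the xts package is installed:

```r
library(xts)

# Simulated stand-in for the sales series on the same 668-day index
set.seed(42)
days <- seq(as.Date("2014-01-01"), length.out = 668, by = "day")
sales_xts <- xts(rnorm(668, mean = 100, sd = 10), order.by = days)

# Split by time: train up to and including the cutoff, validate on the rest
split_date <- as.Date("2015-07-01")
train <- sales_xts[index(sales_xts) <= split_date]
validation <- sales_xts[index(sales_xts) > split_date]

print(nrow(train))               # 547
print(nrow(validation))          # 121
print(range(index(validation)))  # 2015-07-02 to 2015-10-30
```

With this index, the validation set covers 121 days, starting the day after the cutoff.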

Time to Build a Model

The auto.arima function, from the forecast package, incorporates the ideas we just covered to search for a well-fitting ARIMA model. I will detail the more hands-on approach in another post, but below I’ll walk through generating an auto.arima model and using it to forecast.

library(forecast)
model <- auto.arima(train)
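To see what auto.arima settled on, you can print the model and pull out the chosen orders. The sketch below runs on a simulated random walk with drift rather than the post’s sales data (an assumption made so the example is self-contained) and assumes the forecast package is installed.

```r
library(forecast)

# Hypothetical training series: a random walk with drift, so it is not stationary
set.seed(3)
y <- ts(cumsum(rnorm(300, mean = 0.5)))

# Let auto.arima search over candidate orders
model <- auto.arima(y)
print(model)

# arimaorder() returns the selected orders: p (AR), d (differencing), q (MA)
ord <- arimaorder(model)
print(ord)
```

The three numbers map directly onto the AR, I, and MA pieces described earlier.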

Now let’s generate a forecast. The same way we did before, we’ll create a date index and then create an xts object from the forecast values.

From here, you can plot the validation data and then draw the forecast on top of the plot.

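# Forecast one value for each of the 121 validation days
sales_forecast <- forecast(model, h = 121)
# The forecast starts the day after the training data ends (2015-07-01)
forecast_dates <- seq(as.Date("2015-07-02"), length = 121, by = "day")
forecast_xts <- xts(as.numeric(sales_forecast$mean), order.by = forecast_dates)
plot(validation, main = 'Forecast Comparison')
lines(forecast_xts, col = "blue")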
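Alongside the plot, it can help to put a number on how far the forecast is from the validation data. The sketch below computes two common error measures in base R; the actual and predicted vectors are made-up values standing in for the validation data and the forecast’s mean values.

```r
# Hypothetical values standing in for the validation data and the forecast means
actual    <- c(100, 110, 105, 120)
predicted <- c(102, 108, 109, 114)

# Forecast errors: -2, 2, -4, 6
errors <- actual - predicted

# Mean absolute error: (2 + 2 + 4 + 6) / 4 = 3.5
mae <- mean(abs(errors))

# Root mean squared error: sqrt((4 + 4 + 16 + 36) / 4) = sqrt(15)
rmse <- sqrt(mean(errors^2))

print(mae)
print(rmse)
```

RMSE penalizes large misses more heavily than MAE, so comparing the two gives a hint about whether the error comes from a few bad days or is spread evenly.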

Conclusion

I hope this was a helpful introduction to ARIMA forecasting. Be sure to let me know what was helpful and any additional detail you’d like to learn about.

If you found this helpful, be sure to check out some of my other posts on datasciencelessons.com. Happy Data Science-ing!

Translated from: https://towardsdatascience.com/predicting-the-future-learn-to-forecast-with-arima-models-879853c46a4d
