Applying Temporal Memory Networks to Crypto Prediction

I’m not going to string you along until the end, dear reader, and say “Didn’t achieve anything groundbreaking but thanks for reading ;)”.

This network isn’t exactly a get-rich-quick scheme yet — when I make one, I probably won’t blog about it.

It did, however, shed some interesting light on the 2018 Bitcoin crash. The car we’ve built runs, even if it often runs into walls and off cliffs. From here, it’s a question of tuning.

-

Last week I went through a good example of Hierarchical Temporal Memory algorithms predicting power consumption based only on the date+time & previous consumption. It went pretty well, very low mean-squared-error. HTM tech, unsurprisingly, does best when there’s a temporal element to data — some cause and effect patterns to “remember”.

So I wondered: “What’s some other hot topic with temporal data hanging around for any random lad to download?” Bitcoin.

Now hold on, is this a surmountable challenge in the first place? Surely nobody can predict the stock market, otherwise everyone would be doing it. But then that raises the question: why is there an entire industry of people moving money around at the proper time, and how do they stay afloat, if not through some degree of prediction?

I’m fairly certain that crypto prices, at least, can be “learned” to some degree, because:

  1. If anyone had a magic algorithm to reliably predict the prices, they’d keep it to themselves (or lease it out proprietarily)
  2. There’s a company that already does this

Intelletic is an interesting real-world example. Their listed ‘cortical algorithms’ are based on HTM (neocortex) tech, and they list “Price Prediction Alerts” as a main product for investors to make use of.

These are quite fascinating promises, and not too hard to believe, either. As we saw last week, HTM models learn & predict on the fly — no need to train 80/20 splits on historical data. In a landscape where high-frequency trading algorithms have started to compete with day traders, integrating near-present data is certainly a formidable advantage.

Can we quickly whip up an HTM network that predicts Bitcoin prices? Sure. Does it make great predictions?

(Image caption: you’d have better luck with astrology)

Woe unto anyone who listens to a model that took 10 minutes to train on a laptop.

I would describe this model as cute. Maybe “scrappy”.

I’ll dig into what makes the model bad & what makes it viable at all. My previous articles on HTM tech will get you up to speed with what I’m writing about, but the best and easiest way to understand any of this is the HTM School videos.

I found a dataset online covering the past 3 years of Coinbase’s BTC trading, including opening & closing prices, high & low, and volume in USD & BTC for each hour since July 2017. Cleaned it up a little in pandas, standard stuff.

What’s interesting is that we’ve got 6 scalars & 1 DateTime to encode into an SDR:

# Encoders from htm.core: one for the timestamp, one reused for all six scalars
from htm.encoders.date import DateEncoder
from htm.encoders.rdse import RDSE, RDSE_Parameters

dateEncoder = DateEncoder(
    timeOfDay=(30, 1),
    weekend=21)

scalarEncoderParams = RDSE_Parameters()  # random distributed scalar encoder
scalarEncoderParams.size = 800
scalarEncoderParams.sparsity = 0.02
scalarEncoderParams.resolution = 0.88
scalarEncoder = RDSE(scalarEncoderParams)  # create the encoder

# since we're using the scalarEncoder 6 times
encodingWidth = dateEncoder.size + scalarEncoder.size * 6

The dateEncoder is set up much the same as in the power consumption prediction: encoding the year isn’t really going to add semantic meaning when there are only 3 years to work with. These are the interesting questions to ask when you make SDRs: is the info I’m including worth it? Why?

HTM.core has a good amount of Nupic’s Python 2.7 code translated to Python 3, but lacks their MultiEncoder — a neat wrapper that combines several encoders together for easy individual tuning. So I used the same ScalarEncoder for all 6 floats like a complete barbarian. A better plan would be to create 6 different encoders, one for each variable.
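
As a rough illustration of the "one encoder per variable" idea, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: `ToyScalarEncoder` is a toy stand-in and not htm.core's RDSE, the column names are hypothetical, and the resolutions are placeholders rather than tuned values. The point is only that each variable gets its own resolution while sharing the same size (800 bits) and sparsity (16 active bits, i.e. 2%).

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class ToyScalarEncoder:
    """Toy stand-in for a scalar encoder; NOT htm.core's RDSE implementation."""
    size: int          # total number of bits in the encoding
    active_bits: int   # number of bits switched on per value
    resolution: float  # values closer together than this share a bucket

    def encode(self, value: float) -> list[int]:
        # Neighbouring buckets share most of their active bits,
        # so nearby values get overlapping encodings.
        bucket = int(value / self.resolution)
        start = bucket % (self.size - self.active_bits + 1)
        return list(range(start, start + self.active_bits))


# Hypothetical column names and placeholder resolutions (assumptions, not tuned).
RESOLUTIONS = {
    "open": 1.0, "high": 1.0, "low": 1.0, "close": 1.0,
    "volume_btc": 5.0, "volume_usd": 5000.0,
}

encoders = {
    name: ToyScalarEncoder(size=800, active_bits=16, resolution=res)
    for name, res in RESOLUTIONS.items()
}


def encode_row(row: dict[str, float]) -> dict[str, list[int]]:
    """Encode each variable with its own encoder; returns active bit indices."""
    return {name: enc.encode(row[name]) for name, enc in encoders.items()}
```

With real htm.core encoders the same structure would apply: build one `RDSE_Parameters` per column and keep the encoders in a dict, so each resolution can be tuned on its own.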

I also increased the size of the SpatialPooler and Temporal Memory by ~30%, and boosted the TM’s number of cells per column from 13 to 20. My reasoning is that a good predictive model should look relatively further back when we’re dealing with one-hour timestamps, so longer columns make a difference. 20 hours might not be enough, in retrospect, considering many standard stock algorithms use 7–10 day rolling averages and other longer metrics.
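
For a sense of scale, here is a small sketch of a rolling average over hourly data. The function name `rolling_mean` and the constants are my own for illustration. The only point is the arithmetic: a 7-day window on one-hour samples spans 7 × 24 = 168 values, far more than 20.

```python
from collections import deque


def rolling_mean(values, window):
    """Return the mean of each full `window`-sized run of consecutive values."""
    buf = deque(maxlen=window)
    total = 0.0
    out = []
    for v in values:
        if len(buf) == window:
            total -= buf[0]      # value about to fall out of the window
        buf.append(v)            # deque drops the oldest item automatically
        total += v
        if len(buf) == window:
            out.append(total / window)
    return out


HOURS_PER_DAY = 24
window_7d = 7 * HOURS_PER_DAY    # 168 hourly samples
window_10d = 10 * HOURS_PER_DAY  # 240 hourly samples
```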

This changed my training time from 7 minutes to 10, by the way. Seems like there’s a lot of room to add computationally heavy parameters, but I don’t think that’s quite the problem here.

You get out what you put in

Here’s a neat visual example of SDR concatenation, shamelessly stolen from episode 6 of HTM School:

https://www.youtube.com/watch?v=PTYlge2K1G8

A DateTime encoder creates a mini-SDR for each piece of information contained in the timestamp, and combines them to form a larger encoding. As long as the order of concatenation is preserved (first X bits are day_of_week, next Y bits are weekend, etc.) you can combine any encoded datatypes into a coherent SDR to feed to the SpatialPooler.

Here’s the first half of the training loop. We start by designating the output value to be predicted, and then encode the data, running 6 values through the scalarEncoder.

The neat part is creating encoding=SDR(encodingWidth).concatenate([list_of_SDRs]). We only have to pre-define encodingWidth as the sum of each variable’s SDR length (or size), create an SDR with that size, then use .concatenate() to fill the empty SDR with bits.
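
To make the bookkeeping concrete, here is a minimal sketch of what concatenation amounts to, written in plain Python rather than with htm.core. It assumes each SDR is represented as a (size, list of active bit indices) pair; the function name `concatenate` and that representation are my own choices for illustration, not the library's internals.

```python
def concatenate(parts):
    """Join sparse encodings end to end.

    parts: list of (size, active_indices) pairs, in a fixed order.
    Returns (total_size, active_indices) for the combined encoding.
    """
    offset = 0
    combined = []
    for size, active in parts:
        # Shift each part's active bits past everything that came before it.
        combined.extend(offset + i for i in active)
        offset += size
    return offset, combined


# A 10-bit "date" encoding followed by an 8-bit "scalar" encoding.
date_part = (10, [1, 2])
scalar_part = (8, [0, 7])
total_size, active = concatenate([date_part, scalar_part])
# total_size is 18 and active is [1, 2, 10, 17]
```

The total width is just the sum of the part sizes, which is why encodingWidth has to be computed before the empty SDR is created, and why the order of the parts must never change between time steps.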

Misalignment

After training on 27,000 hours of price data, our accuracy didn’t seem too bad:

Root Mean Squared Error for 1-step, 5-step predictions:
{1: 2.8976319767388525, 5: 2.8978454708090684}
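
For reference, here is a minimal sketch of how a root mean squared error like this can be computed, plus one common sanity check: the error of a naive forecast that assumes the price in `steps` hours equals the price now. The function names, the `warmup` parameter, and the assumption that `predicted[i]` is the forecast for `actual[i]` are mine, not taken from the original training script.

```python
import math


def rmse(actual, predicted, warmup=0):
    """Root mean squared error, optionally skipping the first `warmup` samples.

    Assumes predicted[i] is the forecast for actual[i] (already aligned).
    """
    pairs = list(zip(actual, predicted))[warmup:]
    return math.sqrt(sum((a - p) ** 2 for a, p in pairs) / len(pairs))


def persistence_rmse(actual, steps=1):
    """RMSE of the naive forecast: value in `steps` hours equals the value now."""
    return rmse(actual[steps:], actual[:-steps])
```

If a model's RMSE is not clearly better than `persistence_rmse` on the same series, the low number says more about how slowly the series moves from hour to hour than about the model.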

Considering the values being predicted range anywhere from ~4,000 to 10,000, this seemed a little too good to be true. I graphed it to find out:

(Image caption: spooky)

It takes ~2000 hours to start outputting any decent predictions — you can see 1-step and 5-step blue/green lines jumping up and down in the very beginning.

What’s quite interesting are the huge anomaly predictions around the 12,000–15,000 hour regions. In terms of price spikes, there’s nothing huge going on while it’s predicting anomalous next-hour values.

But after the first of the two big anomaly spikes, there’s a relatively sharp crash, from $7,000 to a bottom of under $3,000.

This is the machine learning equivalent of walking your dog in the woods at night, when suddenly it starts whining and growling, looking off into the darkness. You don’t sense anything and keep walking. Out of the corner of your eye you see two bright red eyes staring at you, so you run like hell and wish you’d listened to the dog sooner.

We’re predicting closing_price for each hour, but the model’s also learning from the high, low, currency volume, and the recent patterns and relationships of each. It picked up on some relatively strange activity — perhaps those stretches had anomalous fluctuations in high-low span or rapid micro-deviations in price.

It also predicted anomalous activity for a while before the prices started to recover, although it’s a little less clear-cut. It could be said that sudden drops in price tend to change people’s speculative trading behavior, so anomalies would be likely occurrences. In that case, I wonder what caused the pre-crash anomaly readings.

What We’re Not Looking At

We got some interesting ideas from the large-scale graph. But if the goal is to predict prices for trading, we need to look at the day-to-day use case. Here’s May 2018:

The 5-step and 1-step predictions don’t follow the input that well, to be honest. They’re normally within a couple hundred dollars, but that’s not a very reassuring metric when you’re g̶a̶m̶b̶l̶i̶n̶g̶ investing. There are far too many deviations for my adorable model to compete with Intelletic’s price predictions. But why?

The biggest weakness of this model is that it’s trying to put together a puzzle without all of the pieces. We only encoded Bitcoin price data, but the price of the most popular cryptocurrency is a result of many diverse factors — public confidence, private investors, whatever clickbait article was published, whether the president/Musk tweeted about crypto, etc.

A decent metric to look at might be the prices of other cryptocurrencies, however. Many coins tend to swing in accordance with each other.

What is certain is that there’s much more room to encode more complex data into SDR input. Increasing the input size and model complexity by ~30% changed runtime from 7 to 10 minutes. This is largely due to the ease of bitwise comparisons between SDRs & the synaptic configuration of Temporal Memory systems; the exploding-runtime architecture complexity problem doesn’t apply in quite the same way.

So what have we learned?

It’s near impossible to get close to 100% accuracy with predicting markets like these, but some companies manage to get high enough certainty to make significant gains over time. That means the process works, and it’s worth pushing to improve it further.

Don’t underestimate the complexity of publicly-traded commodities, for a start. Train on more than 3 years of data, certainly.

But when in doubt, find more data. If your model is already at a certain point of complexity & you’re not getting the results you want, you’re not looking at the whole picture (data), not asking the right questions, or both.

Translated from: https://medium.com/swlh/applying-temporal-memory-networks-to-crypto-prediction-24f924c3a014
