Get the Most Out of Expensive Analyses

By Noor Malik


Let’s say you write a query in Deephaven which performs a lengthy and expensive analysis, resulting in a live table. For example, in a previous project, I wrote a query which pulled data from an RSS feed to create a live table of earnings call transcripts, and an expensive Sentiment Analysis machine learning model was used to predict overall sentiments.


After performing the analysis, you want to use the resulting live table in several other queries. For example, I wanted to use my live table of sentiment predictions in another query which verified whether the sentiment predictions matched the direction of the companies’ stocks. Luckily, Deephaven provides the ability to share tables between queries with Preemptive Tables.


With Preemptive Tables, the query processor automatically pushes a consistent snapshot of all data from a table on the server to subscribed clients at regular intervals. The publisher specifies the Preemptive Table's refresh rate, which is how often the table is sent over the network to subscribers. Each client query sets a timeout threshold, which is the maximum time to wait for a connection to the publisher query to be established before giving up.
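To make the publish and subscribe roles concrete, here is a minimal Python sketch of the idea. It is a toy model and not the Deephaven API: the class ToyPreemptivePublisher, the function connect, and the simulated millisecond clock are all hypothetical names invented for illustration.

```python
import copy


class ToyPreemptivePublisher:
    """Toy stand-in for a publishing query. Hypothetical, not the Deephaven API."""

    def __init__(self, rows, refresh_ms):
        self._rows = rows              # the "live table", mutated elsewhere
        self.refresh_ms = refresh_ms   # how often snapshots are pushed
        self._subscribers = []
        self._last_push_ms = None

    def subscribe(self, callback):
        """Register a client callback that receives each snapshot."""
        self._subscribers.append(callback)

    def tick(self, now_ms):
        """Push a snapshot if a full refresh interval has elapsed.

        Returns True when a snapshot was pushed on this tick.
        """
        due = (self._last_push_ms is None
               or now_ms - self._last_push_ms >= self.refresh_ms)
        if due:
            # Deep copy so every subscriber sees one consistent snapshot,
            # unaffected by later changes to the live rows.
            snapshot = copy.deepcopy(self._rows)
            for callback in self._subscribers:
                callback(snapshot)
            self._last_push_ms = now_ms
        return due


def connect(find_publisher, timeout_ms, poll_ms=1000):
    """Poll for a publisher on a simulated clock until found or timed out."""
    waited_ms = 0
    while waited_ms <= timeout_ms:
        publisher = find_publisher(waited_ms)
        if publisher is not None:
            return publisher
        waited_ms += poll_ms
    raise TimeoutError(f"no publisher after {timeout_ms} ms")
```

In this sketch the refresh rate belongs to the publisher and the timeout belongs to the client, mirroring the split described above. A real deployment would rely on the server's own scheduling and networking instead of manual tick calls.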


Any table on the Deephaven server can easily be published as a Preemptive Table. In my “EarningsCallSentimentAnalysis” query, I produced a table called callPredictions that I wanted to share as a Preemptive Table with a 2-minute refresh rate. I did so as follows:


callPredictionsPre = callPredictions.preemptiveUpdatesTable(2*60*1000)

[Image: My callPredictions table]

My other query, which needed to use my callPredictions table, created a client connection with a timeout threshold of 3 minutes and subscribed to the table as follows:


With Preemptive Tables, I was able to use the Sym column of the callPredictions table to look up past and present stock prices and join the directions of movement onto callPredictions in a column called Direction. I then created a boolean column called CorrectPrediction, which would show true if a company’s predicted earnings call sentiment matched their stock direction, and false otherwise.


Note that companies without values in the Direction and CorrectPrediction columns did not have stock data available.
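The Direction and CorrectPrediction logic can be sketched in plain Python over lists of dictionaries. This is an illustration and not the query I ran: the Sentiment column name, the "positive"/"negative" and "up"/"down"/"flat" labels, and the shape of the prices lookup are assumptions, while Sym, Direction, and CorrectPrediction come from the description above.

```python
def direction(past_price, present_price):
    """Return "up", "down", or "flat"; None when price data is missing."""
    if past_price is None or present_price is None:
        return None
    if present_price > past_price:
        return "up"
    if present_price < past_price:
        return "down"
    return "flat"


def add_correct_prediction(predictions, prices):
    """Join stock direction onto each prediction row.

    predictions: list of dicts with "Sym" and "Sentiment" (assumed names).
    prices: dict mapping Sym -> (past_price, present_price) (assumed shape).
    """
    # Assumption: a positive sentiment should pair with a rising stock
    # and a negative sentiment with a falling one.
    expected = {"positive": "up", "negative": "down"}
    joined = []
    for row in predictions:
        past, present = prices.get(row["Sym"], (None, None))
        moved = direction(past, present)
        # Leave CorrectPrediction empty when no stock data is available.
        correct = None if moved is None else expected.get(row["Sentiment"]) == moved
        joined.append({**row, "Direction": moved, "CorrectPrediction": correct})
    return joined
```

Rows for symbols missing from the prices lookup keep empty Direction and CorrectPrediction values, which matches the note about companies without stock data.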


This simple and easy-to-use method of table sharing helped me add another dimension to my Earnings Call Sentiment Analysis project, and allowed me to take my analyses further without having to perform the same lengthy computations again to re-use them for another purpose.


Translated from: https://medium.com/swlh/get-the-most-out-of-expensive-analyses-fa95f0193d18
