By Noor Malik
Let’s say you write a query in Deephaven that performs a lengthy and expensive analysis and produces a live table. For example, in a previous project, I wrote a query that pulled data from an RSS feed to create a live table of earnings call transcripts, then ran an expensive sentiment analysis machine learning model over it to predict the overall sentiment of each call.
After performing the analysis, you want to use the resulting live table in several other queries. For example, I wanted to use my live table of sentiment predictions in another query which verified whether the sentiment predictions matched the direction of the companies’ stocks. Luckily, Deephaven provides the ability to share tables between queries with Preemptive Tables.
With Preemptive Tables, the query processor automatically pushes a consistent snapshot of all data from a table on the server to subscribed clients at regular intervals. The publisher specifies the refresh rate of the Preemptive Table, that is, how often the table is sent over the network to subscribers. Client queries set a timeout threshold: the maximum amount of time to wait for a connection to the publisher query to be established before the attempt times out.
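To make the push model concrete, here is a toy sketch in plain Python. It is not the Deephaven API: `SnapshotPublisher` and `wait_for_snapshot` are hypothetical names, the "table" is just a list of dicts, and waiting for the first snapshot stands in for waiting on the connection to the publisher. It only illustrates the two knobs described above, a publisher-side refresh interval and a client-side timeout.

```python
import copy
import queue
import threading


class SnapshotPublisher:
    """Toy stand-in for a publishing query: pushes full snapshots on a timer."""

    def __init__(self, rows, refresh_seconds):
        self._rows = rows                      # the "live table": a list of dicts
        self._lock = threading.Lock()          # guards rows so each snapshot is consistent
        self._refresh_seconds = refresh_seconds
        self._subscribers = []
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def subscribe(self):
        """Register a subscriber and return the queue its snapshots arrive on."""
        inbox = queue.Queue()
        with self._lock:
            self._subscribers.append(inbox)
        return inbox

    def append(self, row):
        """Simulate the live table ticking with a new row."""
        with self._lock:
            self._rows.append(row)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        # Event.wait returns False when the interval elapses, True once stop() is called.
        while not self._stop.wait(self._refresh_seconds):
            with self._lock:
                snapshot = copy.deepcopy(self._rows)
                targets = list(self._subscribers)
            for inbox in targets:
                inbox.put(snapshot)


def wait_for_snapshot(inbox, timeout_seconds):
    """Return the next snapshot, or raise TimeoutError if none arrives in time."""
    try:
        return inbox.get(timeout=timeout_seconds)
    except queue.Empty:
        raise TimeoutError("no snapshot received before the timeout") from None
```

A subscriber would call `publisher.subscribe()` once and then `wait_for_snapshot(inbox, timeout_seconds)` in a loop. Each snapshot is a complete copy of the table, not a delta, which mirrors the "consistent snapshot of all data" behaviour described above.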
Any table on the Deephaven server can easily be published as a Preemptive Table. In my “EarningsCallSentimentAnalysis” query, I produced a table called callPredictions that I wanted to share as a Preemptive Table with a 2-minute refresh rate. I did so as follows:
callPredictionsPre = callPredictions.preemptiveUpdatesTable(2*60*1000)
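The argument 2*60*1000 reads as a 2-minute interval expressed in milliseconds, which is how I interpret it given the refresh rate stated above. A small helper (the name `minutes_to_millis` is hypothetical, not part of Deephaven) makes that unit conversion explicit:

```python
def minutes_to_millis(minutes):
    """Convert minutes to whole milliseconds: minutes * 60 s/min * 1000 ms/s."""
    return int(minutes * 60 * 1000)


# Assumed unit: milliseconds, inferred from the 2-minute refresh rate in the text.
REFRESH_MS = minutes_to_millis(2)
```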
My other query, which needed to use my callPredictions table, created a client connection with a timeout threshold of 3 minutes and subscribed to the table as follows:
With Preemptive Tables, I was able to use the Sym column of the callPredictions table to look up past and present stock prices and join the direction of each stock’s movement onto callPredictions in a column called Direction. I then created a boolean column called CorrectPrediction, which shows true if a company’s predicted earnings call sentiment matched its stock direction, and false otherwise.
Note that companies without values in the Direction and CorrectPrediction columns did not have stock data available.
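The join and the missing-value behaviour described above can be sketched in plain Python. This is an illustration only, not the query I ran in Deephaven: the function names, the "positive"/"negative" sentiment labels, the "up"/"down"/"flat" direction labels, and the rule that a positive sentiment should match an upward move are all assumptions made for the example.

```python
def price_direction(past_price, present_price):
    """Classify the move between two prices as 'up', 'down', or 'flat'."""
    if present_price > past_price:
        return "up"
    if present_price < past_price:
        return "down"
    return "flat"


def add_direction_columns(call_predictions, prices):
    """Join Direction and CorrectPrediction onto each prediction row.

    call_predictions: list of dicts with 'Sym' and 'Sentiment' keys.
    prices: dict mapping Sym -> (past_price, present_price).
    Rows whose Sym has no price data get None in both new columns.
    """
    # Assumed mapping from sentiment label to the direction it predicts.
    expected = {"positive": "up", "negative": "down"}
    result = []
    for row in call_predictions:
        out = dict(row)  # copy so the input rows are left untouched
        quote = prices.get(row["Sym"])
        if quote is None:
            out["Direction"] = None
            out["CorrectPrediction"] = None
        else:
            direction = price_direction(*quote)
            out["Direction"] = direction
            out["CorrectPrediction"] = expected.get(row["Sentiment"]) == direction
        result.append(out)
    return result
```

Calling `add_direction_columns(rows, prices)` with a symbol that is absent from `prices` leaves both new columns empty for that row, which corresponds to the companies without available stock data.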
This simple and easy-to-use method of table sharing helped me add another dimension to my Earnings Call Sentiment Analysis project, and allowed me to take my analyses further without having to perform the same lengthy computations again to re-use them for another purpose.
Source: https://medium.com/swlh/get-the-most-out-of-expensive-analyses-fa95f0193d18