Get the Most Out of Expensive Analyses

Sharing Query Results Through Preemptive Tables

Deephaven Data Labs
The Startup
3 min readAug 21, 2020

--

By Noor Malik

Photo by Scott Graham on Unsplash

Let’s say you write a query in Deephaven which performs a lengthy and expensive analysis, resulting in a live table. For example, in a previous project, I wrote a query which pulled data from an RSS feed to create a live table of earnings call transcripts, and an expensive Sentiment Analysis machine learning model was used to predict overall sentiments.

After performing the analysis, you want to use the resulting live table in several other queries. For example, I wanted to use my live table of sentiment predictions in another query which verified whether the sentiment predictions matched the direction of the companies’ stocks. Luckily, Deephaven provides the ability to share tables between queries with Preemptive Tables.

With Preemptive Tables, the query processor automatically pushes a consistent snapshot of all data from a table on the server to subscribed clients at regular intervals. The publisher specifies the refresh rate of the Preemptive Table, the frequency at which the table is sent over the network to subscribers, and client queries set a timeout threshold, the maximum amount of time to wait for a connection to the publisher query to be established before the connection times out.

Any table on the Deephaven server can easily be published as a Preemptive table. In my “EarningsCallSentimentAnalysis” query, I produced a table called callPredictions that I wanted to share as a Preemptive Table with a 2-minute refresh rate. I did so as follows:

My callPredictions table

My other query, which needed to use my callPredictions table, created a client connection with a timeout threshold of 3 minutes and subscribed to the table as follows:

With Preemptive Tables, I was able to use the Sym column of the callPredictions table to look up past and present stock prices and join the directions of movement onto callPredictions in a column called Direction. I then created a boolean column called CorrectPrediction, which would show true if a company’s predicted earnings call sentiment matched their stock direction, and false otherwise.

Note that companies without values in the Direction and CorrectPrediction columns did not have stock data available.

My callPredictions table after updates in my other query

This simple and easy-to-use method of table sharing helped me add another dimension to my Earnings Call Sentiment Analysis project, and allowed me to take my analyses further without having to perform the same lengthy computations again to re-use them for another purpose.

--

--

Deephaven Data Labs
The Startup

Deephaven is a high-performance time-series query engine. Its full suite of API’s and intuitive UI make data analysis easy. Check out deephaven.io