Using Deephaven Tables to Back Large Tensorflow Datasets

By Matthew Runyon

Photo by NASA on Unsplash

Deephaven can be used to store and manipulate large amounts of data quickly and efficiently. Some datasets can exceed hundreds of gigabytes, which is more memory than a typical server has. This can cause problems when analyzing your data with an external library such as TensorFlow.

In this article, we will cover how to use generators to feed a Deephaven table that is too big to fit in memory into TensorFlow.

What is a Generator?

A generator is a function that yields values one at a time and may produce an infinite sequence. One benefit of generators is they…
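As a quick illustration of the idea, below is a minimal sketch of a chunked generator wrapped in a tf.data.Dataset so that TensorFlow pulls batches lazily instead of loading everything into memory. The chunked_rows helper and its batch size are hypothetical placeholders for the Deephaven-backed generator the article develops, not Deephaven API.

```python
import numpy as np
import tensorflow as tf

def chunked_rows(num_rows, batch_size=1024):
    """Yield one batch of rows at a time so only a single batch is ever in memory."""
    for start in range(0, num_rows, batch_size):
        # In the article, this is where a slice of the Deephaven table would be
        # fetched and converted; random data stands in for it here.
        rows = min(batch_size, num_rows - start)
        yield np.random.rand(rows, 4).astype(np.float32)

# Wrap the generator so TensorFlow pulls batches lazily during training.
dataset = tf.data.Dataset.from_generator(
    lambda: chunked_rows(num_rows=1_000_000),
    output_signature=tf.TensorSpec(shape=(None, 4), dtype=tf.float32),
)

for batch in dataset.take(2):
    print(batch.shape)
```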


How to place orders with an external OMS and record trades

By Matthew Runyon

Photo by Uwe Conrad on Unsplash

This article is the second part of a two-part series on how to integrate Deephaven with an external order management system (OMS) to automate trading. In Part 1, we covered how to set up a persistent query in Deephaven that generates orders in one table and records trades in another.

In this part, we will cover how to utilize Deephaven’s OpenAPI with the persistent query from Part 1 to execute our orders with an external OMS and record the final trade result. A diagram showing the flow of events in the system is shown below.


How to generate orders and record trades

By Matthew Runyon

Photo by Photos Hobby on Unsplash

Deephaven is a powerful data analytics platform with the ability to easily analyze real-time data, such as stock market prices. One potential use of Deephaven is determining stocks to buy and sell in real-time. Order management systems (OMS) can then be used to automatically place these trade orders.

In this series, we will cover how to use Deephaven to generate, place, and track the result of trade orders with an OMS. Part 1 will cover how to set up a persistent query in Deephaven to generate trade orders and track their results. …


How to easily convert your Deephaven data into arrays, dataframes, and tensors

By Matthew Runyon

Photo by Joel Filipe on Unsplash

Deephaven is an efficient platform for storing and manipulating large amounts of data. Sometimes, you may want to use your massive Deephaven dataset with libraries such as scikit-learn or TensorFlow to take advantage of artificial intelligence. These libraries require specific data structures, and Deephaven tables are not a structure they natively support. However, Deephaven tables can easily be converted into the structures they do support, enabling these useful integrations. This article will show you how to convert data stored in Deephaven tables into other popular data structures.

Python Data Structures

The main data structures we will look at are:

  • Deephaven…
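As a preview of the kind of conversion the article walks through, here is a minimal sketch using the current Deephaven Community helpers deephaven.pandas.to_pandas and deephaven.numpy.to_numpy (an assumption; the article may use different conversion functions). It is meant to run inside a Deephaven Python session.

```python
from deephaven import empty_table
from deephaven.pandas import to_pandas
from deephaven.numpy import to_numpy

# A small example table standing in for a real dataset.
table = empty_table(10).update(["X = i", "Y = i * 2"])

df = to_pandas(table)   # pandas DataFrame
arr = to_numpy(table)   # NumPy ndarray
# One more step gets a tensor, e.g. tf.constant(arr) or torch.from_numpy(arr),
# depending on which machine learning library you are using.
```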

Exploring how the gut microbiome changes during an antibiotics course

By Noor Malik

Photo by Anastasia Dulgier on Unsplash

The gut microbiome is a complex place: thousands of microbial species are constantly interacting with each other, changing the composition of the system and influencing the health of the body. Microbial species diversity is a sign of a healthy microbiome: when diseases or other perturbations occur, diversity in the gut microbiome is known to drastically decrease.

To supplement my previous project, where I identified the gut microbiome enterotypes of 33 individuals, I wanted to analyze a time-series for a microbiome undergoing a perturbation or dysbiosis. …


A Powerful Workflow for Data Analysis and Visualization

By Alex Peters

Results from my COVID-19 model using Deephaven data, plotted in Jupyter

Though Deephaven has developed a powerful data interrogation (REPL / notebook) experience to support vital workflows and to facilitate launching apps on the platform, our users gain many additional capabilities and comfortable usage patterns by pairing Jupyter Notebook and Lab with Deephaven kernels.

In particular, the Deephaven-Python integration has focused on providing data scientists (and others similarly inclined) a smorgasbord of options:

  1. Delivering code to Deephaven from Jupyter — thereby using Deephaven as a server with Jupyter as a client (sketched briefly after this list).
  2. Supporting coding patterns that combine Python natively with Deephaven’s table operations.
  3. Integrating 2-way transformations…
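To make option 1 concrete, here is a minimal sketch of Jupyter acting as a client to a Deephaven server, using the Community pydeephaven client as a stand-in (an assumption; the host, port, and table name are placeholders).

```python
from pydeephaven import Session

# Connect from the notebook to a running Deephaven server.
session = Session(host="localhost", port=10000)

# Send code to the server; the table it creates lives server-side.
session.run_script(
    'from deephaven import empty_table\n'
    'result = empty_table(5).update(["X = i"])'
)

# Fetch a handle to the server-side table back into the notebook.
result = session.open_table("result")
print(result.schema)

session.close()
```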

Using Deephaven to identify and classify microbiome types

By Noor Malik

Photo by CDC on Unsplash

Introduction

Our bodies consist of roughly 10 trillion human cells, but also roughly 100 trillion microorganisms making up our “microbiome.” Collectively, these microbes regulate our bodies through thousands of different biotic functions. The microbiome is so important for our bodies to run smoothly that scientists often consider it a supporting organ.

Everyone has a unique network of microbiota comprising their microbiome, but in 2011, researchers discovered that regardless of nationality, gender, or any other anthropometric measure, adults fall into one of three distinct gut microbiome groups, called Enterotypes. These three densely-populated regions of the high-dimensional gut microbiome feature…


Sharing Query Results Through Preemptive Tables

By Noor Malik

Photo by Scott Graham on Unsplash

Let’s say you write a query in Deephaven that performs a lengthy and expensive analysis, resulting in a live table. For example, in a previous project, I wrote a query that pulled data from an RSS feed to create a live table of earnings call transcripts and used an expensive sentiment analysis machine learning model to predict overall sentiments.

After performing the analysis, you want to use the resulting live table in several other queries. For example, I wanted to use my live table of sentiment predictions in another query which verified whether the sentiment predictions…


Comparing Our Model’s Results with Reality

By Alex Peters

Photo by engin akyurt on Unsplash

In a series of previous posts, we took a lot of care to design and implement a disease-spreading model for COVID-19 in Deephaven using PyMC3. Now, we will explore the results of our model and compare its conclusions with what we actually observe in the ongoing COVID-19 pandemic.

INTRODUCTION

Our model for COVID-19 was based on a classic epidemiological formulation called the SIR model. This model is a system of differential equations governed by two parameters: 𝛃, the average transmission rate, and 𝜸, the average removal rate (by recovery or fatality). We extended this model to…
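For reference, in its standard form (with S, I, and R expressed as fractions of the population, before the extensions described in the post), the SIR system is:

$$
\frac{dS}{dt} = -\beta S I, \qquad
\frac{dI}{dt} = \beta S I - \gamma I, \qquad
\frac{dR}{dt} = \gamma I
$$

where S, I, and R are the susceptible, infected, and removed fractions, so S + I + R = 1.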


Analyzing Students’ Exam Performance Using Deephaven’s R Integration

By Levi Petty

Photo by Ben Mullins on Unsplash

This project uses a public, synthesized exam scores dataset from Kaggle to analyze average scores in Math, Reading, and Writing, relative to the student's parents' level of education and whether the student took a preparation course before the exam. I used Deephaven's R integration, which gave me access to R's rich plotting libraries. It also lets readers follow the code and plots step by step, giving a more detailed view of the analysis.

My Process

Configuration Variables

We begin by defining some configuration variables:

  • home: Your home directory
  • system: Deephaven system to connect to (configured in the launcher)
  • keyfile: Key file used to authenticate when connecting to…

Deephaven Data Labs

Deephaven is a high-performance time-series database, complemented by a full suite of APIs and an intuitive user experience. Check out deephaven.io
