Plotting Financial Data in Kotlin with Kandy

AndreiKingsley
12 min readApr 9, 2024

--

Hi! I want to show a simple financial data workflow — download, analysis, and visualizations — using Kotlin for Data Analysis tools.

This post is also available as a notebook (GitHub Gist/Datalore).

Tools

For working with datasets (loading and processing), I use Kotlin DataFrame. It is a library designed for working with structured in-memory data, such as tabular or JSON. It offers convenient storage, manipulation, and data analysis with a convenient, typesafe, readable API. With features for data initialization and operations like filtering, sorting, and integration, Kotlin DataFrame is a powerful tool for data analytics.
I also use the Kandy — Kotlin plotting library, designed specifically for full compatibility with Kotlin DataFrame. It brings many types of plots (including statistical) with rich customization options via a powerful Kotlin DSL.
The best way to run all of this is Kotlin Notebook. It works out of the box, has native rendering of Kandy plots and DataFrame tables, and has IntelliJ IDEA support. It can also be run in Jupyter notebooks with a Kotlin kernel and on Datalore.

Data

Load and Info

I downloaded the last six months of NASDAQ Composite Index (COMP)historical data as a CSV file (available on the official website). This process applies to any kind of financial data related to stock, foreign exchange, or cryptocurrency markets. You can download such data from the NASDAQ, YAHOO Finance, or other sources. The dataset I used can be accessed here. Note that with a different dataset, the plots will be different.

Create a dataframe with data from this file and take a look at its head:

// read CSV file into DataFrame
val dfRaw = DataFrame.readCSV("nasdaq_comp_6m.csv")
dfRaw.head()
         Date Close/Last     Open     High      Low
0 03/20/2024 16369,41 16185,76 16377,44 16127,48
1 03/19/2024 16166,79 16031,93 16175,59 15951,86
2 03/18/2024 16103,45 16154,92 16247,59 16094,17
3 03/15/2024 15973,17 16043,58 16055,33 15925,91
4 03/14/2024 16128,53 16209,19 16245,32 16039,68

Check out the dataset summary using the .describe() method, which shows information about DataFrame columns (including their name, type, null values, and simple descriptive statistics):

dfRaw.describe()
         name   type count unique nulls        top freq         mean         std        min     median        max
0 Date String 125 125 0 03/20/2024 1 null null 01/02/2024 10/02/2023 12/29/2023
1 Close/Last Double 125 125 0 16369 1 14620,494000 1082,262611 12596 14762 16369
2 Open Double 125 125 0 16186 1 14606,284640 1079,916315 12719 14641 16322
3 High Double 125 125 0 16377 1 14691,599680 1073,478378 12772 14847 16450
4 Low Double 125 125 0 16127 1 14523,346240 1081,460503 12544 14577 16199

As you can see, our data is a simple DataFrame of five columns: Date, Close/Last, Open, High, Low.

Each row corresponds to one trading day (note that there are no weekends); During the trading day, the price (for financial assets — stocks, currencies, etc.) or market index (in our case) changes. The values at the beginning and end of the day, as well as the minimum and maximum for the day, are used for analytics.

  • Date corresponds to the date. It's in String format and should be converted to a more convenient datetime type. After that, we can sort rows by date to easily calculate metrics based on sequential data.
  • Open corresponds to the opening value at the beginning of the trading day.
  • High corresponds to the maximum value during the day.
  • Low corresponds to the minimum value during the day.
  • Close/Last corresponds to the closing value – the value at the end of the trading day. It is a commonly used benchmark used to analyze value changes over time. We'll simplify this column name to just "Close" for ease of use.

I also prefer to work with columns whose names are in camel case, so let’s rename them.

Also note that there are no nulls in the dataset, so we don’t need to process them in any way.

Processing

Before we get to plotting, let’s process the data a bit:

  1. Convert Date column to LocalDate type (see kotlinx.datetime types).
  2. Sort rows by Date (ascending).
  3. Rename Close/Last into just Close.
  4. Change all column names to camel case.

The autogenerated dataframe column extension properties inside the contexts of the transform functions are used here and further as they allow avoiding misspelling column names and add type safety.

val df = dfRaw
// convert `Date` column to `LocalDate` type by given pattern
.convert { Date }.toLocalDate("MM/dd/yyyy")
// sort rows by `Date` - ascending by default
.sortBy { Date }
// rename `Close/Last` into "Close"
.rename { `Close-Last` }.into("Close")
// rename columns to camel case
.renameToCamelCase()
df.head()
         date    close     open     high      low
0 2023-09-21 13223,98 13328,06 13362,23 13222,56
1 2023-09-22 13211,81 13287,17 13353,22 13200,64
2 2023-09-25 13271,32 13172,54 13277,83 13132,00
3 2023-09-26 13063,61 13180,96 13199,13 13033,40
4 2023-09-27 13092,85 13115,36 13156,37 12963,16

Let’s make sure we got it right:

df.describe()
    name      type count unique nulls        top freq         mean         std        min     median        max
0 date LocalDate 125 125 0 2023-09-21 1 null null 2023-09-21 2023-12-19 2024-03-20
1 close Double 125 125 0 13224 1 14620,494000 1082,262611 12596 14762 16369
2 open Double 125 125 0 13328 1 14606,284640 1079,916315 12719 14641 16322
3 high Double 125 125 0 13362 1 14691,599680 1073,478378 12772 14847 16450
4 low Double 125 125 0 13223 1 14523,346240 1081,460503 12544 14577 16199

Ok, let’s build some plots!

Plots

Closing Price Line Plot

Let’s start with something simple — looking at the changes in the index over time. As described above, the closing price signifies the final value at which a stock changes hands during the day. It is the standard reference point for investors to monitor the stock’s performance across various periods. Let’s visualize close value changes over a given period with a line plot.

// create a plot using DataFrame.plot {} extension
df.plot {
// add line layer
line {
// `date` corresponds to `x`
x(date) // use extension properties again
// `close` corresponds to `y`
y(close)
}
}
Closing Price Line Plot

Now we can clearly see the growth with minor drawdowns. However, in recent days the growth has gradually slowed, transitioning to a plateau.

However, our data still isn’t very visually appealing. Let’s change that by customizing the appearance of the line (increase width and change color) and the plot layout (increase the size and add a title):

df.plot {
line {
x(date)
y(close)
// set the line width
width = 2.0
// set the line color
color = Color.BLUE
}
// layout parameters
layout {
// increase the plot width
size = 800 to 400
// set the plot title
title = "NASDAQ Composite Index (COMP) Closing price"
}
}
Closing Price Plot Customized

Now it looks so much better!

Closing Price Moving Average

The moving average (MA) is a standard technique that smooths a series of data points for analysis, making it easier to identify the direction of a trend. Let’s count the moving average for 10 days and compare it with the MA for 1 day (in this case just close).

// set size of window; it also can be 5 days, 20 or 40.
val window = 10
// create a new dataframe with a new column with MA values
val dfWithMA10 = df.add("ma$window") {
// new column builder; returning value is a value for MA for current given row.
// if row index is less than window, return null
if (index() < window) {
null
} else {
// take a window (previous `window` rows incl. this one) and count their `Close` values mean
relative(-(window - 1)..0).close.mean()
}
}
dfWithMA10.head()
         date    close     open     high      low ma10
0 2023-09-21 13223,98 13328,06 13362,23 13222,56 null
1 2023-09-22 13211,81 13287,17 13353,22 13200,64 null
2 2023-09-25 13271,32 13172,54 13277,83 13132,00 null
3 2023-09-26 13063,61 13180,96 13199,13 13033,40 null
4 2023-09-27 13092,85 13115,36 13156,37 12963,16 null

Now gather columns with close (since we have daily data, close contains the moving average for 1 day) and ma10 into one column, movingAverage(with moving averages in 1-day and 10-day intervals), with a key column interval. This will allow us to divide the dataset into two groups by this key and draw two lines of different colors (depending on the interval value) with a legend for color. Unfortunately, at the moment, series plotting is not available in Kandy, so we can't provide several lines as we would in other plotting libraries.

val dfWithMA = dfWithMA10.gather {
// rename `close` into `1 day` and `ma10` into `10 days`
// these names will be values in `interval` column after `gather`
close.named("1 day") and (ma10 named "10 days")
}.into("interval", "movingAverage")
dfWithMA.head(5)
         date     open     high      low interval movingAverage
0 2023-09-21 13328,06 13362,23 13222,56 1 day 13223,98
1 2023-09-21 13328,06 13362,23 13222,56 10 days null
2 2023-09-22 13287,17 13353,22 13200,64 1 day 13211,81
3 2023-09-22 13287,17 13353,22 13200,64 10 days null
4 2023-09-25 13172,54 13277,83 13132,00 1 day 13271,32

Now we have a column movingAverage containing the value of the moving average, while the interval column contains the size of the interval over which it is calculated.

Let’s make another line plot where we compare MA with different intervals:

// group dataframe by interval
dfWithMA.groupBy { interval }.plot {
line {
x(date)
// `movingAverage` values corresponds to `y`
y(movingAverage)
width = 2.0
// make color of line depends on `interval`
color(interval)
}
layout {
size = 800 to 400
title = "NASDAQ Composite Index (COMP) moving average"
}
}

Now the trend can be seen more clearly, and all the noise has been filtered out. The window size can be varied to achieve the best result.

Assessment of Volatility

Volatility is a measure of the price fluctuation of an asset on a financial market, which plays a key role in the analysis of risks and opportunities. Understanding volatility is essential for investors to determine potential risks and investment returns. Volatility aids in analyzing market instability and provides important information for portfolio management strategies, especially during periods of financial uncertainty. Let’s calculate the volatility with a window of 10 days and plot it.

// Same as for moving average but count standard deviation instead of mean
val dfWithVolatility = df.add("volatility") {
if (index() < window) {
null
} else {
relative(-(window - 1)..0).close.std()
}
}

dfWithVolatility.head()
         date    close     open     high      low volatility
0 2023-09-21 13223,98 13328,06 13362,23 13222,56 null
1 2023-09-22 13211,81 13287,17 13353,22 13200,64 null
2 2023-09-25 13271,32 13172,54 13277,83 13132,00 null
3 2023-09-26 13063,61 13180,96 13199,13 13033,40 null
4 2023-09-27 13092,85 13115,36 13156,37 12963,16 null
dfWithVolatility.plot { 
line {
x(date)
y(volatility)
color = Color.GREEN
width = 2.5
}
layout {
size = 800 to 400
title = "NASDAQ Composite Index (COMP) volatility (10 days)"
}
}

The following observations can be made from the chart:

  • Distinct peaks indicate periods of increased market instability. Economic news, corporate events, or market shocks could have caused these peaks.
  • The chart shows a trend of increasing volatility around the middle of the observed period (around December), followed by a decreasing trend.
  • Periods with lower values suggest relative market stability at those times.

You can try different windows to calculate the moving average and volatility for the best result!

Candlestick Chart

Candlestick charts (also known as OHLC charts) are a popular way to visualize price movements on financial markets. They consist of individual “candles”, each representing a specific time period, such as a day or an hour. Each candle displays four key pieces of information: the opening value, closing value, highest value, and lowest value within the given time frame, and also indicates if the value has grown during the period.

Now, let’s make a candlestick!

df.plot {
// add a candlestick by given columns
candlestick(date, open, high, low, close)
}

We see significant overplotting. Let’s take only the last 50 days, reduce the candle width, and customize the layout as we did before.

val dfLatest = df.takeLast(50)

dfLatest.plot {
// candlestick() optionally opens a new scope, where it can be configured
candlestick(date, open, high, low, close) {
// set smaller width
width = 0.7
}

layout {
title = "NASDAQ Composite Index (COMP)"
size = 800 to 500
}
}

Looks better and more informative!

Each candle represents a daily summary, with opening and closing index values (box edges) and minimum and maximum index values (whisker ends). Its color indicates whether the value has increased or decreased at the end of the day compared to the beginning (if close is greater than open it's green, otherwise it’s red).

We can analyze the candlesticks to draw several conclusions. For instance, until March 1, there were several greens, i.e., upward candles with large bodies, indicating stable growth. From March 1 until March 19, there is a prevalence of candles with small bodies, signaling a plateau.

Let’s customize candlestick a little bit, for example, change the color and increase transparency:

val increaseColor = Color.hex("#3F8EFC")
val decreaseColor = Color.hex("#FF6A00")

dfLatest.plot {
candlestick(date, open, high, low, close) {
// change parameters of increase candles
increase {
fillColor = increaseColor
borderLine.color = increaseColor
}
// change parameters of decrease candles
decrease {
fillColor = decreaseColor
borderLine.color = decreaseColor
}
// reduce width for all candles
width = 0.7
}

layout {
title = "NASDAQ Composite Index (COMP)"
size = 800 to 500
}
}

You can use any color palette, depending on your preferences and environment. For example, if you are using the plot in a dark theme, more contrasting colors would be appropriate.

An alternative way to show an increase or decrease is to use a filled box for increase and an empty box for decrease. Let’s do this by changing the alphavalue.

dfLatest.plot {
candlestick(date, open, high, low, close) {
// change alpha for increase / decrease candles through dot
// instead of opening new scope
increase.alpha = 1.0
decrease.alpha = 0.2
// set constant width, fill and border line colors for all candles
width = 0.8
fillColor = Color.GREY
borderLine.color = Color.GREY
}

layout {
title = "NASDAQ Composite Index (COMP)"
size = 800 to 500
}
}

Daily changes distribution analysis

Sometimes it’s useful to look at the distribution of daily changes in closing prices.

To do this, let’s calculate them and plot a histogram.

val dailyChanges = df.map {
// count difference with previous `Close`;
// `0.0` for the first row
val diff = diff(0.0) { close }
// сalculate the relative change in percentage
diff / (prev()?.close ?: 1.0) * 100.0
}.drop(1) // drop the first value with a plug
plot {
// add a histogram by given sample
histogram(dailyChanges)
}

Let’s count a sample average, use it to align bins, and add a mark line.

val changesAvg = dailyChanges.average()
changesAvg
0.1772034203139356
plot {
// set average as bins boundary
histogram(dailyChanges, binsAlign = BinsAlign.boundary(changesAvg))
// add a vertical line with fixed `x`
vLine {
xIntercept.constant(changesAvg)
// simple line setting
color = Color.RED
width = 2.0
type = LineType.DASHED
}
}

It can also be useful to add a density (KDE) plot here to see the distribution clearer:

plot {
/// histogram() optionally opens a new scope, where it can be configured
histogram(dailyChanges, binsAlign = BinsAlign.boundary(changesAvg)) {
// instead of count, use empirically estimated density in this point
y(Stat.density)
alpha = 0.7
}
// add a density plot by given sample
densityPlot(dailyChanges)
vLine {
xIntercept.constant(changesAvg)
color = Color.RED
width = 2.0
type = LineType.DASHED
}
}

Based on the chart, we can make the following conclusions:

  1. The distribution of changes resembles a normal distribution.
  2. Changes are significantly skewed to the positive side.
  3. The peak of the histogram is centered, indicating the mean of daily changes, which corresponds to the most probable value.

Summary

Throughout this article, you have learned:

  • Where to find and download financial data.
  • How to read them into a Kotlin DataFrame and perform initial processing.
  • How to visualize this data and explore it with Kandy using various plots.

Thanks for reading! I hope you enjoyed it and learned something new, and I was able to inspire you to experiment! I would be happy to receive feedback and answer any questions you may have.

Links

--

--