Data Science in F#/.NET with VsCode

Andri Rakotomalala
5 min read · Sep 10, 2020

When thinking of data science or machine learning, Python immediately comes to mind. No other production-ready programming language can match its extensive set of libraries (pandas, NumPy, scikit-learn, etc.) paired with proven experimentation tools (Jupyter, Plotly Dash, etc.).

Other ecosystems are trying to catch up on libraries but, when it comes to producing analysis and insight in a short timeframe, almost none offer the kind of tooling that still makes Python the champion of productivity.

On the other hand, there is one area where Python lags behind other languages: performance. Even though most of the data science and machine learning libraries used from Python are written in native languages, a Python program as a whole will never match the performance of pure native code.

In a world where milliseconds (or even microseconds in some fields) matter to deliver information, some projects have to go through the following steps:

  1. Data scientists write a proof of concept in Python/Jupyter
  2. When going to production:
    * -either- software engineers translate the logic to another language
    * -or- data scientists have to expose the logic as a microservice (and therefore need knowledge of API authoring)

Thankfully, .NET now brings the best of both worlds.

In this article, I will show you how to install and use the interactive notebook console for F#. In future articles, I will write about technical implementation details, performance, and libraries.

Installing the F# notebook for VsCode

In order to play with a notebook in Visual Studio Code:

  1. Install the ionide-fsharp extension
  2. Install the .NET 5 SDK: the extension uses F# 5 syntax and is therefore only compatible with .NET 5
  3. Install the F# notebook extension for Visual Studio Code itself
  4. Edit the VS Code settings.json as specified in the extension documentation
  5. Open the notebook panel with the command Ctrl+Alt+P > “F# Notebook+DataScience: Open Panel”
  6. Open an *.fsx file and start coding!

Tip: Alt+Enter will execute the current line

Simple examples

The extension works exactly like F# Interactive (FSI), the interactive F# interpreter, but with an additional panel that displays formatted data.

When one of the Notebook.* helpers is called, a cell is added to the panel. The extension has multiple built-in formatters.

Primitives and markdown

// Ctrl+Alt+P : F# Notebook: Open Panel
Notebook.Text (1+1)
Notebook.Text "Hello world"
Notebook.Markdown """
# Hello, Markdown!
"""
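
Because Notebook.Markdown takes an ordinary string, the Markdown can also be built from data rather than typed by hand. The following is only a minimal sketch: markdownTable is a hypothetical helper, not part of the extension, and it assumes that the panel's Markdown renderer supports pipe-style tables, which I have not verified.

```fsharp
/// Hypothetical helper: builds a two-column Markdown table from (name, value) pairs.
let markdownTable (rows: (string * int) list) =
    // Header row plus the separator row required by pipe-style tables
    let header = [ "| name | value |"; "| --- | --- |" ]
    // One table row per input pair
    let body = rows |> List.map (fun (name, value) -> sprintf "| %s | %d |" name value)
    String.concat "\n" (header @ body)

let table = markdownTable [ "a", 1; "b", 2 ]

// Inside the notebook, the string could then be displayed with the helper
// shown above (assumption: pipe tables are rendered by the panel):
// Notebook.Markdown table
```

Keeping the string construction in a plain function means it can be checked in any F# session, with or without the notebook panel.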

Charts

open XPlot.Plotly
// Ctrl+Alt+P : F# Notebook: Open Panel
let chart =
    Chart.Line
        [ 1, 1
          2, 2 ]
    |> Chart.WithWidth 400
    |> Chart.WithHeight 300
    |> Chart.WithLayout(Layout(title = "my title"))
Notebook.Plotly chart

Maps

// Ctrl+Alt+P : F# Notebook: Open Panel
open XPlot.Plotly
open FSharp.Data
let marginWidth = 50.0
let margin = Margin(l = marginWidth, r = marginWidth, t = marginWidth, b = marginWidth)
type AlcoholConsumption = CsvProvider<"https://raw.githubusercontent.com/plotly/datasets/master/2010_alcohol_consumption_by_country.csv">
let consumption = AlcoholConsumption.Load("https://raw.githubusercontent.com/plotly/datasets/master/2010_alcohol_consumption_by_country.csv")
let locations = consumption.Rows |> Seq.map (fun r -> r.Location)
let z = consumption.Rows |> Seq.map (fun r -> r.Alcohol)
let map =
    Chart.Plot([ Choropleth(locations = locations, locationmode = "country names", z = z, autocolorscale = true) ])
    |> Chart.WithLayout(Layout(title = "Alcohol consumption", width = 700.0, margin = margin, geo = Geo(projection = Projection(``type`` = "mercator"))))
// display chart
Notebook.Plotly map

Dataframes

// Ctrl+Alt+P : F# Notebook: Open Panel
#r "nuget: Microsoft.Data.Analysis"
open Microsoft.Data.Analysis
// `consumption` is the CSV data loaded in the Maps example
let locations, alcohol =
    consumption.Rows
    |> Seq.map (fun row -> row.Location, row.Alcohol)
    |> List.ofSeq
    |> List.unzip
let df =
    new DataFrame(
        new StringDataFrameColumn("location", locations),
        new PrimitiveDataFrameColumn<decimal>("consumption", alcohol)
    )
Notebook.DataFrame df

LaTeX expressions

Notebook.Markdown @"This is cool $$x = {-b \pm \sqrt{b^2-4ac} \over 2a}.$$ isn't it"

Custom printers

You can also add your own printers that will display the data using a customized format.

open Notebook
fsi.AddPrinter(fun (data : YourType) ->
    ... // Format to string
    |> HTML // or SVG or Markdown or Text
    |> printerNotebook
)
let x = new YourType() // this will automatically print x in the notebook panel
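
To make the "format to string" step more concrete, here is a minimal sketch of a formatter for a made-up type. Measurement, escapeHtml and toHtml are hypothetical names introduced only for illustration; they are not part of the extension. The sketch covers only the string-building part and leaves out the notebook-specific wrapping (HTML, printerNotebook) shown above.

```fsharp
// Hypothetical type standing in for YourType; not part of the extension.
type Measurement = { Label: string; Value: float }

/// Escapes the characters that are significant in HTML text.
let escapeHtml (s: string) =
    s.Replace("&", "&amp;").Replace("<", "&lt;").Replace(">", "&gt;")

/// Renders a measurement as a one-row HTML table (the "format to string" step).
let toHtml (m: Measurement) =
    sprintf "<table><tr><th>%s</th><td>%.2f</td></tr></table>" (escapeHtml m.Label) m.Value

// In a plain F# Interactive session, the same function can be registered with
// the standard printer hook, which expects a function returning a string:
// fsi.AddPrinter(fun (m: Measurement) -> toHtml m)
```

Escaping the label first avoids broken markup when the data itself contains characters such as < or &.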

Conclusion

This extension is not the only one that offers an interactive environment for F#. Other projects offer similar functionality, notably the Jupyter kernel for C#/F#.

But none of them offer the same ease of installation and level of integration with Visual Studio Code (code completion, code lenses, integration with other F# extensions for formatting or code quality, etc.).

That’s all for this article. In the next article, I will take a quick tour of machine learning with F#. If you have any questions or just want to chat with me, feel free to leave a comment below or contact me on social media.

Note: at the time of writing, F# 5 is still in preview and has a very nasty bug that freezes autocompletion.

Note (Oct 2020): it seems that Microsoft has finally decided to address the F# notebook experience! Use Microsoft’s .NET Interactive Preview 3 instead; it’s better.

Andri Rakotomalala

a software engineer who likes to experiment with different techs