Write your own Excel in 100 lines of F#
Tomas Petricek is a computer scientist and open-source developer. He is a Visiting Researcher at the Alan Turing Institute working on tools for open data-driven storytelling. Check out his upcoming course, Fast Track to F# with Tomas Petricek & Phil Trelford on December 6 at Skills Matter, London.
I’ve been teaching F# for over seven years now, both in the public F# FastTrack course that we run at SkillsMatter in London and in various custom trainings for private companies. Every time I teach the F# FastTrack course, I modify the material in one way or another. I wrote about some of this interesting history last year in an fsharpWorks article. The course now has a stable half-day introduction to the language and a stable focus on the ideas behind functional-first programming, but there are always new examples and applications that illustrate this style of programming.
For the upcoming December course in London, I added a number of demos and hands-on tasks built using Fable, partly because running F# in a browser is an easy way to illustrate many concepts and partly because Fable has some amazing functional-first libraries.
If you are interested in learning F# and attending our course, the next F# FastTrack takes place on 6–7 December in London at SkillsMatter. We also offer custom on-site trainings. Get in touch at @tomaspetricek or email firstname.lastname@example.org for a 10% discount for the course.
One of the new samples I want to show, which I also live coded at NDC 2018, is building a simple web-based Excel-like spreadsheet application. The spreadsheet demonstrates many great F# features such as domain modeling with types, the power of compositionality and also how a functional-first approach can be amazingly powerful for building user interfaces.
What is a spreadsheet?
You can click on any cell to edit it. To confirm your edit, just click on any other cell. You can enter numbers such as 1 (in cell B1) or formulas such as =B1+B2 (in cell B3). Formulas support parentheses and the four standard numerical operators. When you make an edit, the spreadsheet automatically updates. If you make a syntax error, reference an empty cell or create a recursive reference, the spreadsheet will show #ERR.
Defining the domain model
Following the typical F# type-driven development style, the first thing we need to think about is the domain model. Our types should capture what we work with in a spreadsheet application. In our case, we have positions such as C10, expressions such as =A1+3 and the sheet itself, which has user input in some of the cells. To model these, we define the types Position, Expr and Sheet.
Position is simply a pair of a column name and a row number. An expression is more interesting, because it is recursive. For example, A1+3 is an application of a binary operator to two sub-expressions: A1, which is a reference, and 3, which is a numerical constant. In F#, we capture this nicely using a discriminated union. In the Binary case, the left and right sub-expressions are themselves values of the Expr type, so our Expr type is recursive.
Sheet is a map from positions to raw user inputs. We could also store parsed expressions or even evaluated results, but we always need the original input so that the user can edit it. To make things simple, we'll just store the original input and parse it each time we need to evaluate the value of a cell. To do the parsing and evaluation, we'll later define two functions:
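The definitions described above can be sketched roughly as follows. This is a reconstruction from the prose rather than the original listing: storing the operator as a char is an assumption, and the signatures of parse and evaluate appear only as comments because the functions are written later.

```fsharp
// A position is a column letter and a row number, e.g. ('C', 10) for C10
type Position = char * int

// An expression is a number, a reference to another cell, or a binary
// operator applied to two sub-expressions (this makes the type recursive)
type Expr =
  | Number of int
  | Reference of Position
  | Binary of Expr * char * Expr   // operator kept as a char (assumption)

// The sheet maps positions to the raw text entered by the user
type Sheet = Map<Position, string>

// Rough shape of the two functions defined later:
//   parse    : string -> Expr option
//   evaluate : Sheet -> Expr -> int option

// The formula A1+3 written as a value of the Expr type
let sample = Binary(Reference('A', 1), '+', Number 3)
```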
We will talk about these later when we discuss the logic behind our spreadsheet, but writing the types down early is useful. Given these types, we can already see how everything fits together. Given a position, we can do a lookup into Sheet to find the entered text, then we can parse it using parse to get an Expr and, finally, pass the expression to evaluate to get the resulting value. We also see that both parse and evaluate might fail. The first one will fail if the input is not a valid formula and the second might fail if you reference an empty cell.
Now, all we have to do is to keep writing the rest of Excel until the type checker is happy!
Creating user interface using Elmish
I’m going to start by discussing the user interface and then get back to implementing the parsing and evaluation logic. For creating user interfaces, Fable comes with a great library called Elmish. Elmish implements a functional-first user interface architecture popularized by the Elm language, which is also known as model view update.
The idea of the architecture is extremely simple. You just need the following two types and two functions:
The two types and two functions define the user interface as follows:
- State stores all the user interface state that you need in order to render it.
- Event is a union of different events that can happen when the user interacts with the UI.
- update is a function that takes an original state and an event and produces a new modified state.
- view takes the state and generates an HTML document; it also takes a function Event -> unit which can be used in event handlers of the HTML document to trigger an event.
Conceptually, you can think of it like this: the application starts with an initial state and renders a page. When some action happens and an event is triggered, it updates the state using update and re-renders the page using view. The key trick that makes this work is that Elmish does not replace the whole DOM, but diffs the new document with the last one and only updates DOM elements that have changed.
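To make the loop concrete, here is a minimal sketch that does not depend on Elmish at all. It uses a hypothetical counter instead of the spreadsheet, a plain string stands in for the HTML document, and the event loop is simulated by folding over a list of events. None of these names come from the article or the library.

```fsharp
// Hypothetical example: a counter, with a plain string standing in for HTML
type State = { Count : int }

type Event =
  | Increment
  | Reset

// Takes the original state and an event, produces the new state
let update (state: State) (event: Event) : State =
  match event with
  | Increment -> { state with Count = state.Count + 1 }
  | Reset -> { state with Count = 0 }

// Takes the state and a trigger function. A real view would attach
// `trigger` to event handlers; this stand-in only produces text.
let view (state: State) (trigger: Event -> unit) : string =
  sprintf "<p>Count: %d</p>" state.Count

// Simulate the loop: apply each event in turn, then render the final state
let finalState =
  List.fold update { Count = 0 } [ Increment; Increment; Reset; Increment ]
let page = view finalState ignore
```

Because update is a pure function from state and event to state, the whole interaction history is just a fold, which is also what makes this style easy to test.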
What state and events are there in our spreadsheet? As with the whole spreadsheet application, the first step in implementing the user interface is to define a few types:
In the state, we keep a list of row and column keys (this typically starts from A1, but we do not require that), the currently selected cell (this can be None if no cell is selected) and, finally, the cells of the spreadsheet. There are two events that can happen. The UpdateValue event happens when you change the text in the current cell; the StartEdit event happens when you click on some other cell to start editing it.
Updating the spreadsheet after event
Writing the update function is quite easy - as with the main spreadsheet logic, we just need to write code until the type checker is happy!
In Elmish, the update function is a little bit more complicated than I said above. In addition to returning the new state, we can also return a list of commands. The commands are used to tell the system that it should start some action after updating the state. This can be things such as starting an HTTP web request to fetch some information from the server. In our case, we do not need any commands, so we just return an empty list of commands.
The implementation uses the with construct, which creates a clone of the state record and updates some of its fields. In the case of StartEdit, we set the active cell to the newly selected one. In the case of UpdateValue, we first add the new value to the sheet (the Map.add function replaces the existing value if there is one already) and then set the Cells of the spreadsheet.
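A sketch of the state, the events and the update function, written so that it runs without Elmish. The field names Rows and Cols, the ranges in the initial state and the use of an empty unit list in place of the Elmish command type are all assumptions made for this illustration. The message comes first in the argument list, which is the order Elmish expects.

```fsharp
type Position = char * int
type Sheet = Map<Position, string>

type Event =
  | UpdateValue of Position * string
  | StartEdit of Position

type State =
  { Rows : int list
    Cols : char list
    Active : Position option
    Cells : Sheet }

// Returns the new state together with a list of commands. The empty list
// is a stand-in for the Elmish command type (assumption, avoids the dependency).
let update (msg: Event) (state: State) : State * unit list =
  match msg with
  | StartEdit pos ->
      { state with Active = Some pos }, []
  | UpdateValue(pos, value) ->
      // Map.add replaces any value already stored at pos
      let newCells = Map.add pos value state.Cells
      { state with Cells = newCells }, []

// Hypothetical initial state and two simulated user actions
let initial =
  { Rows = [ 1 .. 15 ]; Cols = [ 'A' .. 'K' ]; Active = None; Cells = Map.empty }
let s1, _ = update (StartEdit('B', 1)) initial
let s2, _ = update (UpdateValue(('B', 1), "1")) s1
```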
Rendering the spreadsheet
To construct the HTML document, Elmish comes with a lightweight wrapper built on top of React (although you can use other virtual DOM libraries too). The wrapper defines typed functions for creating HTML elements and specifying their attributes.
We’ll first implement the main view function which generates the spreadsheet grid and then discuss the renderCell helper which renders an individual cell.
Here, we’re using F# list comprehensions to generate the HTML document. For example, lines 4–7 generate the header of the table. We create a tr element with no attributes (the first argument) containing a couple of th elements (the second argument). We're using yield to generate the elements - first, we create the empty th element in the top-left corner and then we iterate over all the columns and produce a header for each of them. The col variable is a character, so we first turn it into a string using string before turning it into HTML content using the str function provided by Elmish.
The nice thing about writing your HTML rendering in this way is that it is composable. We do not have to put everything inside one massive function. Here, we call renderCell (line 12) to render the contents of a cell.
Rendering spreadsheet cell
There are two different ways in which we render a cell. For the selected cell, we need to render an editor with an input box containing the original entered text. For all other cells, we need to parse the formula, evaluate it and display the result. The renderCell function chooses the branch and, in the latter case, handles the evaluation:
We test whether the cell that is being rendered is the active one using the state.Active = Some pos condition. Rather than comparing two Position values, we compare Position option values and do not have to worry about the case when state.Active is None.
If the current cell is active, we take the entered value or an empty string and pass it to renderEditor (defined next). If not, then we try to get the input - if there is no input, we call renderView with Some "" to render a valid but empty cell. Otherwise, we use a sequence of parse and evaluate to get the result. We will look at both of these functions below, when discussing how the spreadsheet logic is implemented. Both parse and evaluate may fail, so we use the option type to compose them. Option.bind calls evaluate only when parse succeeds; otherwise it propagates the None result. We also use Option.map to transform the optional result of type int into an optional string which we then pass to renderView.
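The composition for a non-active cell can be sketched in isolation. In this illustration the hypothetical displayText helper takes parse and evaluate as parameters only so that the snippet stands alone (the article calls them directly), and the toy stand-ins passed to it below accept nothing but integer literals.

```fsharp
type Position = char * int
type Sheet = Map<Position, string>

// Computes the text to display for a non-active cell; None signals an error.
// `parse` and `evaluate` are parameters only so that this sketch stands alone.
let displayText (parse: string -> 'e option)
                (evaluate: Sheet -> 'e -> int option)
                (cells: Sheet) (pos: Position) : string option =
  match Map.tryFind pos cells with
  | None -> Some ""                       // no input: valid but empty cell
  | Some input ->
      parse input
      |> Option.bind (evaluate cells)     // runs only if parsing succeeded
      |> Option.map string                // int option -> string option

// Toy stand-ins: "parse" accepts only integer literals, "evaluate" returns them
let toyParse (s: string) =
  match System.Int32.TryParse s with
  | true, n -> Some n
  | _ -> None
let toyEvaluate (_: Sheet) (n: int) = Some n

let cells : Sheet = Map.ofList [ ('A', 1), "42"; ('A', 2), "oops" ]
let shown =
  [ ('A', 1); ('A', 2); ('A', 3) ]
  |> List.map (displayText toyParse toyEvaluate cells)
```

Here A1 holds a valid number, A2 holds text the toy parser rejects and A3 is empty, so the three results are a value, an error and an empty cell respectively.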
So far, we have not created any handlers that would trigger events when something happens in the user interface. We’re finally going to do this in renderView and renderEditor, which are both otherwise quite straightforward:
In renderView, we create a red background and use the #ERR string if there is no value to display (indicating an error). We also add an OnClick handler. When you click on the cell, we want to trigger the StartEdit event in order to move the editor to the current cell. To do this, we specify the OnClick attribute and, when a click happens, trigger the event using the trigger function which we got as an input argument for the view function (and which we first passed to renderCell and then to renderView).
The renderEditor function is similar. We specify the OnInput handler and, whenever the text in the input changes, trigger the UpdateValue event to update the value and recalculate everything in the spreadsheet. We also specify the AutoFocus attribute, which ensures that the element is active immediately after it is created (when you click on a cell).
Putting it all together
Now we have all four components we need to run our user interface. We have the State and Event type definitions and we have the update and view functions. To put everything together, we need to define the initial state, specify the ID of the HTML element in which the application should be rendered and start it.
The initial state defines the ranges of available rows and columns and specifies that there are no values in any of the cells (the demo embedded above specifies the initial cells for computing factorial and Fibonacci here). Then we use mkProgram to compose all the components together, specify React as our execution engine and start the Elmish application!
Implementing spreadsheet logic
So far, we defined the domain model which specifies what a spreadsheet is using F# types and we implemented the user interface using Elmish. The only thing we skipped so far is the spreadsheet logic — that is, parsing of formulas and evaluation. Completing these two is going to be easier than you might expect!
Evaluating spreadsheet formulas
First, let’s have a look at how to evaluate formulas. In the beginning, we defined the Expr type as a discriminated union with three cases: Number, Binary and Reference. To evaluate an expression, we need to write a recursive function that uses pattern matching and appropriately handles each case. We'll start with a simple version that does not handle errors and does not check for recursive formulas:
The function takes the spreadsheet cells as the first argument, because it may need to look up values of cells referenced by the current expression. It also takes the expression expr and pattern matches on it. Handling Number is easy - we just return the number. Binary is a bit more interesting, because we need to call evaluate recursively to evaluate the left and right sub-expressions. Once we have them, we use a simple dictionary to map the operator to a function (written using standard F# operators) and run the function.
Finally, when handling a Reference, we first get the input at the given cell, parse it and then (again) recursively call evaluate. This can fail in many ways - the cell might be empty or the parser could fail. We improve this in the next version of our evaluator where the function returns int option rather than int. The missing value None indicates that something went wrong.
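A minimal sketch of this first, unsafe evaluator. The parse function here is only a stand-in (an assumption) that understands integer literals and single references such as =A1, so that the snippet runs on its own; the real parser comes later in the article.

```fsharp
type Position = char * int
type Expr =
  | Number of int
  | Reference of Position
  | Binary of Expr * char * Expr

// Stand-in for the real parser (assumption): accepts integer literals and
// single references such as "=A1", and throws on anything else
let parse (input: string) : Expr =
  if input.StartsWith("=") then Reference(input.[1], int (input.Substring(2)))
  else Number(int input)

// No error handling: throws on empty cells and unknown operators,
// and never terminates on recursive references
let rec evaluate (cells: Map<Position, string>) expr =
  match expr with
  | Number num -> num
  | Binary(l, op, r) ->
      let ops = dict [ '+', (+); '-', (-); '*', ( * ); '/', (/) ]
      ops.[op] (evaluate cells l) (evaluate cells r)
  | Reference pos ->
      evaluate cells (parse (Map.find pos cells))

// A2 refers to A1, so A2+3 evaluates via two lookups
let cells = Map.ofList [ ('A', 1), "2"; ('A', 2), "=A1" ]
let result = evaluate cells (Binary(Reference('A', 2), '+', Number 3))
```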
In case of Number, we now return Some num. In this case, evaluation cannot fail. In case of Binary, both recursive calls can fail and we get two option values. To handle this, we use Option.bind and Option.map - both of these will call the specified function only when the previous operation succeeded; otherwise, they immediately return None indicating a failure. If both the left and the right sub-expressions can be evaluated, we can then apply the binary numerical operator to their results. Handling of Reference is similar - we sequence a number of operations that may fail using Option.bind.
Another interesting feature we added in this version is checking for recursive references. To do this, the evaluate function now takes the visited parameter, which is a set of cells that were accessed during the evaluation. We add cells to the set using Set.add pos visited on line 18. When we find a reference to a cell that we already visited (line 12), we immediately return None, because this would lead to an infinite loop.
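The safer evaluator can be sketched along the same lines. As before, parse is a stand-in (an assumption) that handles only integer literals and =A1-style references. The line numbers mentioned in the text refer to the original listing, not to this sketch.

```fsharp
type Position = char * int
type Expr =
  | Number of int
  | Reference of Position
  | Binary of Expr * char * Expr

// Stand-in parser (assumption): integer literals and "=A1"-style references
let parse (input: string) : Expr option =
  let isInt (s: string) =
    s <> "" && s |> Seq.forall (fun c -> System.Char.IsDigit c)
  let isRef =
    input.Length >= 3 && input.[0] = '=' &&
    System.Char.IsLetter(input.[1]) && isInt (input.Substring(2))
  if isInt input then Some(Number(int input))
  elif isRef then Some(Reference(input.[1], int (input.Substring(2))))
  else None

// Returns None on empty cells, parse errors and recursive references.
// `visited` holds the cells already entered during this evaluation.
let rec evaluate visited (cells: Map<Position, string>) expr =
  match expr with
  | Number num -> Some num
  | Binary(l, op, r) ->
      let ops = dict [ '+', (+); '-', (-); '*', ( * ); '/', (/) ]
      evaluate visited cells l |> Option.bind (fun l ->
        evaluate visited cells r |> Option.map (fun r ->
          ops.[op] l r))
  | Reference pos when Set.contains pos visited ->
      None
  | Reference pos ->
      Map.tryFind pos cells |> Option.bind (fun value ->
        parse value |> Option.bind (fun parsed ->
          evaluate (Set.add pos visited) cells parsed))

// B1 refers to itself, C1 refers to the empty cell D1
let cells =
  Map.ofList [ ('A', 1), "1"; ('A', 2), "=A1"; ('B', 1), "=B1"; ('C', 1), "=D1" ]
let ok = evaluate Set.empty cells (Binary(Reference('A', 2), '+', Number 2))
let cyclic = evaluate Set.empty cells (Reference('B', 1))
let missing = evaluate Set.empty cells (Reference('C', 1))
```

Note that this sketch still throws on division by zero or an operator missing from the dictionary; only the failure modes discussed in the text are turned into None.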
Finally, the last part of the logic that we need to implement is the parsing of formulas entered by the user into values of our Expr type. For this, we're going to use a very simple parser combinator library (which you can find in the full source code). There are four key concepts in the library:
- Parser<char, 'T> represents a parser that takes a list of characters as the input. It returns None if the parser cannot parse the input. Otherwise, the parser parses a value and returns it together with the rest of the input. The fact that parsers do not have to consume the entire input makes it easy to compose them.
- <*> is a binary operator that takes two parsers; it runs the first parser first, getting a value of type 'T1, and then runs the second parser on the rest of the input, getting a value of type 'T2. It succeeds only if both parsers succeed and then it returns a pair with both values.
- <|> is a binary operator that also takes two parsers, but they both have to recognise values of the same type. It tries to run the first parser and, if that fails, tries to run the second one. It succeeds if either of the parsers succeeds and returns whatever the successful parser returned.
- map is a function that transforms the value that a parser produces. Given a parser of type Parser<char, 'T> and a function 'T -> 'R, it returns a parser that runs the original parser and, if that is successful, applies the function to the result.
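One way such a library could be built is sketched below. These definitions are hypothetical and follow only the description in the list above; the author's actual library is in the linked source code and may well differ. The run helper is an assumed name, and char deliberately shadows the built-in conversion function to match the name used in the article.

```fsharp
// A parser takes a list of tokens and, on success, returns the parsed
// value together with the rest of the input
type Parser<'Token, 'T> = Parser of ('Token list -> ('T * 'Token list) option)

// Unwraps and runs a parser (helper name is an assumption)
let run (Parser f) input = f input

// Recognises exactly the given token and returns it
let char tok = Parser(fun input ->
  match input with
  | c :: rest when c = tok -> Some(c, rest)
  | _ -> None)

// Sequencing: both parsers must succeed; returns a pair of results
let (<*>) p1 p2 = Parser(fun input ->
  run p1 input |> Option.bind (fun (a, rest) ->
    run p2 rest |> Option.map (fun (b, rest) -> (a, b), rest)))

// Choice: try the first parser, fall back to the second on failure
let (<|>) p1 p2 = Parser(fun input ->
  match run p1 input with
  | Some res -> Some res
  | None -> run p2 input)

// Transforms the value a parser produces
let map f p = Parser(fun input ->
  run p input |> Option.map (fun (v, rest) -> f v, rest))

// Usage: 'a' followed by 'b', leaving the rest of the input untouched
let ab = char 'a' <*> char 'b'
let parsed = run ab (List.ofSeq "abc")
```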
The following snippet shows how we use these ideas to create simple parsers to recognise operators, references and numbers:
The char function creates a parser that recognises only the given character (and then returns it as the result). Thus, the operator parser recognises the four standard numerical binary operators and accepts no other characters. The reference parser recognises a letter followed by a number. This returns a char * int pair which we turn into the Reference value of Expr using the map function. Parsing a number is even easier - we just run the built-in integer parser and wrap it in Number. Note that the type of reference and number is now the same - Parser<char, Expr>. This means that we can compose them using <|> to create a parser that recognises either of the two expression types.
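The three parsers can be sketched on top of a compact stand-in for the library. Everything in the first half of the snippet (pred, letter, integer and the combinator definitions) is hypothetical scaffolding so that the example runs by itself; only the last four definitions mirror what the text describes, and the name atom is made up for the combined parser.

```fsharp
type Position = char * int
type Expr =
  | Number of int
  | Reference of Position
  | Binary of Expr * char * Expr

// Compact stand-in for the combinator library (hypothetical definitions)
type Parser<'Token, 'T> = Parser of ('Token list -> ('T * 'Token list) option)
let run (Parser f) input = f input
let pred test = Parser(fun input ->
  match input with
  | c :: rest when test c -> Some(c, rest)
  | _ -> None)
let char tok = pred (fun c -> c = tok)
let (<*>) p1 p2 = Parser(fun input ->
  run p1 input |> Option.bind (fun (a, rest) ->
    run p2 rest |> Option.map (fun (b, rest) -> (a, b), rest)))
let (<|>) p1 p2 = Parser(fun input ->
  match run p1 input with
  | Some res -> Some res
  | None -> run p2 input)
let map f p = Parser(fun input ->
  run p input |> Option.map (fun (v, rest) -> f v, rest))
let letter = pred (fun (c: char) -> System.Char.IsLetter c)
// Consumes one or more digits and converts them to an int
let integer = Parser(fun (input: char list) ->
  let digits = input |> List.takeWhile (fun c -> System.Char.IsDigit c)
  if List.isEmpty digits then None
  else Some(int (System.String(List.toArray digits)),
            List.skip (List.length digits) input))

// The parsers described in the text
let operator = char '+' <|> char '-' <|> char '*' <|> char '/'
let reference = (letter <*> integer) |> map Reference
let number = integer |> map Number
let atom = reference <|> number   // hypothetical name for the combination

let b12 = run atom (List.ofSeq "B12")
let fortyTwo = run atom (List.ofSeq "42+1")
```

Since reference and number both produce Expr values, the choice operator can combine them directly; the second example also shows a parser stopping early and handing back the unconsumed +1.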
Finishing the rest of the parsing is a bit more work, because we need to handle parentheses, as in (1+2)*3, and also ignore whitespace, but the concepts are the same:
To deal with recursion, the library allows us to create a parser using slot, use it, and then define what it is later using exprSetter. In our case, we define expr on line 1, use it when defining brack (line 3) and then define it on line 9. This is a recursive reference; exprAux can be binary, which contains term, which can be brack and that, in turn, contains expr.
The only other clever things in the snippet are the <<*> and <*>> operators. Those behave like <*>, but return only the result from the parser on the left or right (wherever the double arrow points). This is useful, because we can write anySpace <*>> expr <<*> anySpace to parse an expression surrounded by whitespace, but get a parser that returns just the result of expr (we do not care what the whitespace was).
Finally, we define a formula, which is = followed by an expression, and an equation (that is, the thing that you can type in the spreadsheet), which is either a formula or a number. The parse function defined on the last line lets us run the main equation parser on a given input. It takes a sequence of characters and produces option<Expr>, which is exactly what we've used earlier in the article.
In total, this article showed you some 125 lines of code. If we did not worry about nice formatting and skipped all the blank lines, we could have written a simple spreadsheet application in some 100 lines of code! Aside from standard Fable libraries, the only thing I did not count is the parser combinator library. I wrote that on my own, but there are similar existing libraries that you could use (though you’d need to find one that works with Fable).
The final spreadsheet application is quite simple, but it does a number of interesting things. It runs in a web browser and you can scroll back to the start of the article to play with it again! On the technical side, it has a user interface where you can select and edit cells, it parses the formulas you enter and it also evaluates them, handling errors and recursive references.
If you enjoyed this post and want to learn more about F# and also Fable, join our F# FastTrack course on 6–7 December in London at SkillsMatter. We’ll cover Fable, Elmish, but also many other F# examples. Get in touch at @tomaspetricek or email email@example.com for a 10% discount for the course, or if you are interested in custom on-site training.
I like this example, because it shows how a number of nice aspects of the F# language and also the F# community can come together to provide a fantastic overall experience. In case of our spreadsheet, this includes:
- The Elm architecture, as implemented by the Elmish library, is a fantastic way to write functional-first user interfaces. All we had to do to implement the spreadsheet user interface was to define types for the state and events and then implement the update and view functions.
- Finally, the example also used compositionality of functional programming in two ways. First, an expression is elegantly expressed by a recursive type Expr which can consist of other Expr values. Second, we composed a parser for spreadsheet formulas from just a few primitives using just two operators, <*> and <|>.
If you want to have a look at the complete source code, you can find it in my elmish-spreadsheet repository on GitHub. The repository is designed as a hands-on exercise where you can start with a template, complete a number of tasks and end up with a spreadsheet, but there is also a completed branch where you can find the finished source code.
Originally published at tomasp.net.