F# Mentorship — Week 1

Willie Tetlow
4 min readFeb 23, 2018

The F# mentorship program is a 6 week community driven event where mentors volunteer to help you learn topics related to the language.

There are many topics you can get guidance on. Like a beginner’s introduction to F#, contributing to open source, or F# for machine learning.

This post is part of a series documenting my experience as a mentee, learning about and contributing to the F# compiler.

Week 1 Summary

  • Introductions & Building The Repo
  • Debugging The Compiler
  • Practice — Parsing

Introductions & Building The Repo

Applications closed on the 5th February and by the 12th I’d found out I was paired with Avi Avni. We organised a time to talk over Slack and for me to fork and build the visualfsharp repo beforehand.

I followed the instructions in the repo’s DEVGUIDE.md which contains information about developing the compiler on Mac, Linux and Windows.

For more information on what tools to use to develop F# on different operating systems, see:

Debugging The Compiler

In our first meeting Avi showed me how to debug the compiler. The instructions are as follows:

Windows

  1. Open CMD in Administrator, move to the base directory of the repo and run the following to build the solution.
$ build.cmd debug

2. Create a file called test.fs in <base-directory>/DEBUG/net40/bin/

3. From CMD move to location of test.fs.

$ cd DEBUG/net40/bin

4. From CMD run test.fs through compiler.

$ fsc.exe  --pause test.fs

5. In Visual Studio set a break-point (the entry point is in fscmain.fs, ln 58).

6. Go to Debug -> Attach to Process.

7. Select fsc.exe process.

8. From CMD press enter until break-point is hit.

Practice — Parsing

We decided to look at parsing to get experience of working with the compiler. Avi set me the task of allowing tuple declarations like int * int * int to be valid when represented as int * * 3.

Key Compiler Phases

F# Key Compiler Phases

I started by reading the F# Compiler Technical Overview. Mainly the Key Compiler Phases section, to get a general idea of what files should be looked at.

The diagram above shows that the parsing stage of compilation comes after lexing.

Lexical analysis generates a token stream from the original text in the source files and the parser uses these tokens to generate an Abstract Syntax Tree (AST).

To add the new tuple type declaration, the parser should be updated to recognise the tokens type(t) * * int(n) as a tuple t (*t) * n — 1.

But where is the parser?

Instead of just finding a file called parse.fs, I wanted to understand when and how the parsing stage was used. I put a break-point at the entry to the compiler (fscmain.fs, ln58), updated the test file to contain a tuple declaration, and started debugging.

test.fs with Tuple Declaration

Stepping through code, I quickly came to the file fsc.fs. The comment in this file confirmed I was getting somewhere… it said parsing!

Comment at the top of fsc.fs

I did the classic ctrl + f and, while searching for parse in fsc.fs, I came across the function CompileOps.ParseInput.

ParseInput in CompileOps.fs

Lines 21 and 24 are of interest as they both use the Parser module. This pointed me towards the file pars.fs where I finally found some parsing code!

pars.fs is created using FsYacc, an F# parser generator. Describing FsYacc and the concepts surrounding parser generators warrants more writing than I think is acceptable to put in this post but here’s the resources I used to learn about them.

Generally, a yacc specification provides rules for identifying token patterns. It converts a token stream (from lexing) into logical components that can be represented as an AST. Generating a parser, from a yacc specification, is far more maintainable than implementing it yourself, especially as the number and complexity of rules increases.

Updating the Parser

The F# compiler’s parser specification is located in pars.fsy. Inside this file I found the following tuple rules.

Tuple FsYacc Specifications in pars.fsy

The first thing that stood out was the type SynType.Tuple. Looking at its definition confirmed I was on the right track.

Tuple SynType Union Case in ast.fs

SynType is a discriminated union of the AST module and the Tuple union case allows us to denote tuples as an ordered list. To facilitate creating a tuple like type(t) * * int(n), the following rule needs to be added to pars.fsy to recognise the pattern as a SynType.Tuple and populate its list with n of the type in the declaration.

New Tuple Rule to add to pars.fsy

The only thing left to do is to try adding the rule to the tupleType specification (lines 24) and see if it works.

Updated tupleType with new rule, pars.fsy

I built the solution, ran my test case, and…it worked! The parser can now handle the new tuple declaration.

As this was to gain experience of working with the codebase, I haven’t added guards against things like the range of ints that can be used but this should demonstrate the process I undertook in identifying and updating part of the compiler.

I think it’s apparent that after the first week I’ve learnt a tremendous amount and none of this would of been possible without Avi’s help. I can’t wait to see what we achieve in the coming weeks.

On to Week 2!

--

--

Willie Tetlow

Senior Frontend Engineer at Shogun, Javascript and Functional Programming.