The F# mentorship program is a 6 week community driven event where mentors volunteer to help you learn topics related to the language.
There are many topics you can get guidance on. Like a beginner’s introduction to F#, contributing to open source, or F# for machine learning.
This post is part of a series documenting my experience as a mentee, learning about and contributing to the F# compiler.
Week 1 Summary
- Introductions & Building The Repo
- Debugging The Compiler
- Practice — Parsing
Introductions & Building The Repo
I followed the instructions in the repo’s DEVGUIDE.md which contains information about developing the compiler on Mac, Linux and Windows.
For more information on what tools to use to develop F# on different operating systems, see:
Debugging The Compiler
In our first meeting Avi showed me how to debug the compiler. The instructions are as follows:
- Open CMD in Administrator, move to the base directory of the repo and run the following to build the solution.
$ build.cmd debug
2. Create a file called test.fs in <base-directory>/DEBUG/net40/bin/
3. From CMD move to location of test.fs.
$ cd DEBUG/net40/bin
4. From CMD run test.fs through compiler.
$ fsc.exe --pause test.fs
5. In Visual Studio set a break-point (the entry point is in fscmain.fs, ln 58).
6. Go to Debug -> Attach to Process.
7. Select fsc.exe process.
8. From CMD press enter until break-point is hit.
Practice — Parsing
We decided to look at parsing to get experience of working with the compiler. Avi set me the task of allowing tuple declarations like
int * int * int to be valid when represented as
int * * 3.
Key Compiler Phases
The diagram above shows that the parsing stage of compilation comes after lexing.
Lexical analysis generates a token stream from the original text in the source files and the parser uses these tokens to generate an Abstract Syntax Tree (AST).
To add the new tuple type declaration, the parser should be updated to recognise the tokens
type(t) * * int(n) as a tuple
t (*t) * n — 1.
But where is the parser?
Instead of just finding a file called parse.fs, I wanted to understand when and how the parsing stage was used. I put a break-point at the entry to the compiler (fscmain.fs, ln58), updated the test file to contain a tuple declaration, and started debugging.
Stepping through code, I quickly came to the file fsc.fs. The comment in this file confirmed I was getting somewhere… it said parsing!
I did the classic
ctrl + f and, while searching for parse in fsc.fs, I came across the function
Lines 21 and 24 are of interest as they both use the Parser module. This pointed me towards the file pars.fs where I finally found some parsing code!
pars.fs is created using FsYacc, an F# parser generator. Describing FsYacc and the concepts surrounding parser generators warrants more writing than I think is acceptable to put in this post but here’s the resources I used to learn about them.
Generally, a yacc specification provides rules for identifying token patterns. It converts a token stream (from lexing) into logical components that can be represented as an AST. Generating a parser, from a yacc specification, is far more maintainable than implementing it yourself, especially as the number and complexity of rules increases.
Updating the Parser
The F# compiler’s parser specification is located in pars.fsy. Inside this file I found the following tuple rules.
The first thing that stood out was the type
SynType.Tuple. Looking at its definition confirmed I was on the right track.
SynType is a discriminated union of the AST module and the
Tuple union case allows us to denote tuples as an ordered list. To facilitate creating a tuple like
type(t) * * int(n), the following rule needs to be added to pars.fsy to recognise the pattern as a
SynType.Tuple and populate its list with
n of the type in the declaration.
The only thing left to do is to try adding the rule to the
tupleType specification (lines 2–4) and see if it works.
I built the solution, ran my test case, and…it worked! The parser can now handle the new tuple declaration.
As this was to gain experience of working with the codebase, I haven’t added guards against things like the range of ints that can be used but this should demonstrate the process I undertook in identifying and updating part of the compiler.
I think it’s apparent that after the first week I’ve learnt a tremendous amount and none of this would of been possible without Avi’s help. I can’t wait to see what we achieve in the coming weeks.