Introduction to functional programming with C#

Naveen Kumar
Apr 22, 2017 · 9 min read

It is no surprise that one of the biggest challenges in the enterprise software development is complexity. Change is inevitable. Especially when a new feature is introduced that adds complexity. This might lead to difficulty in understanding and verifying the software. It doesn’t stop there, it increases development time, introduces new bugs and at some point, it is even impossible to change anything in the software without introducing some unexpected behavior or side effects. This may result in slowing down the project and eventually lead to a project failure.

Imperative programming styles like object oriented programming have capabilities to minimize complexity to a certain level when done right by creating abstractions and hiding complexity. An object of a class has certain behavior that can be reasoned without caring about the complexity of the implementation. Classes when written properly will have high cohesion and low coupling where the code re-usability is increased while complexity is kept reasonable.

Object oriented programming is in my blood stream, I’ve been a C# programmer most of my life and my thought process always tends to use a sequence of statements to determine how to reach a certain goal by designing class hierarchies, focusing on encapsulation, abstraction and polymorphism that tend to change the state of the program which actively modifies memory. There’s always a possibility that any number of threads can read a memory location without synchronization. Consider if at least one of them mutates it that leads to a race condition. Not an ideal condition for a programmer who tries to embrace concurrent programming.

However, it is still possible to write an imperative program by writing a complete thread-safe code that supports concurrency. But one has to still reason about performance, which is difficult in a multi-threaded environment. Even if parallelism improves performance, it is difficult to refactor the existing sequential code into parallel code as the large portion of the codebase have to be modified to use threads explicitly.

What’s the solution?

It is worth to consider “Functional programming”. It is a programming paradigm originated from ideas older than the first computers when two mathematicians introduced a theory called lambda calculus. It provided a theoretical framework that treated computation as an evaluation of mathematical functions by evaluating expressions rather than execution of commands and avoiding changing-state and mutating data.

By doing so, it is much easier to understand and reason about the code and most importantly, it reduces the side effects. In addition to that, performing unit testing is much easier.

Functional Languages:

It is interesting to analyze the programming languages that support functional programming such as Lisp, Clojure, Erlang, OCaml and Haskell which have been used in industrial and commercial applications by a wide variety of organizations. Each of them emphasizes different features and aspects of the functional style. ML is a general purpose functional programming language and F# is the member of ML language family and originated as a functional programming language for the .Net Framework since 2002. It combines the succinct, expressive and compositional style of functional programming with the runtime, libraries, interoperability and object model of .NET

It is worthy to note that many general programming languages are flexible enough to support multiple-paradigms. A good example is C#, which has borrowed a lot of features from ML and Haskell although it is primarily imperative with a heavy dependence of mutable state. For example, LINQ which promotes declarative style and immutability by not modifying the underlying collection it operates on.

As I am already familiar with C#, I favor combining paradigms so that I have control and flexibility over choosing the approach that best suits the problem.

Fundamental principles:

Having stated before, Functional programming is programming with mathematical functions. The idea is, whenever the same arguments are supplied, the mathematical function returns the same results and the function’s signature must convey all information about possible input it accepts and the output it produces. By following two simple principles, namely — Referential transparency and Function honesty.

1.Referential Transparency:

Referential transparency, referred to a function, indicates that you can determine the result of applying that function only by looking at the values of its arguments. It means that the function should operate only on the values that we pass and it shouldn’t refer to the global state. Refer to the below example:

public int CalculateElapsedDays(DateTime from)
{
DateTime now = DateTime.Now;
return (now - from).Days;
}

This function is not referentially transparent. Why? Today it returns different output and tomorrow it will return another. The reason is that, it refers to the global DateTime.Now property.

Can this function can be converted into referentially transparent function? Yes.

How?

By making the function to operate only on the parameters that we passed in.

public static int CalculateElapsedDays(DateTime from, DateTime now) => (now - from).Days;

In the above function, we eliminated the dependency on global state thus making the function referentially transparent.

2.Function honesty:

Function honesty states that a mathematical function should convey all information about the possible input that it takes and the possible output that it produces. Refer to the below example:

public int Divide(int numerator, int denominator)
{
return numerator / denominator;
}

Is this function referentially transparent? Yes.

Does it qualify as a mathematical function? May not be.

Reason: The input arguments states that it takes two integers and return an integer as an output. Is it true in all scenarios? Nope.

What happens when we invoke the function like:

var result = Divide(1,0);

Yes, you guessed it right. It will throw “Divide By Zero” exception. Hence, the function’s signature doesn’t convey enough information about the output of the operation.

How to convert this function into a mathematical function?

Change the type of denominator parameter like below

public static int Divide(int numerator, NonZeroInt denominator)
{
return numerator / denominator.Value;
}

NonZeroInt is a custom type that can hold any integer except zero. Now, we have made this function honest as it conveys all information about the possible input that it takes and the output that it produces.

Despite the simplicity of these principles, functional programming requires a lot of practices that might seem intimidating and overwhelming to many programmers. In this article, I will try to cover some basic aspects to start with which includes “Functions as first class citizens”, “Higher order functions” and “Pure functions”

Functions as first-class citizens:

When functions are first-class citizens or first-class values, they can be used as input or output for any other functions. They can be assigned to variables, stored to collections, just like values of other type. For example:

Func<int, bool> isMod2 = x => x % 2 == 0;
var list = Enumerable.Range(1, 10);
var evenNumbers = list.Where(isMod2);

The above code illustrates that functions are indeed first-class values because you can assign the function ( x => x % 2 == 0) to a variable isMod2 which in-turn passed as an argument to Where (an extension on IEnumberable)

Treating functions as values is necessary in functional programming style as it gives the power to define Higher-order functions.

Higher-order functions (HOF):

HOFs are functions that take one or more functions as arguments or returns a function as a result or both. All other functions are First-order functions.

Let’s consider the same modulo 2 example in which “list.Where” does filtering to determine which number to be included in the final list based on a predicate provided by the caller. The predicate provided here is isMod2 function and theWhereextension method of IEnumberable is the Higher-order Function. Implementation of Where looks like below

public static IEnumerable<T> Where<T>
(this IEnumerable<T> ts, Func<T, bool> predicate)
{
foreach (T t in ts) ❶
if (predicate(t)) ❷
yield return t;
}

❶ The task of iterating over the list is an implementation detail of Where.

❷ The criterion determining which items are included is decided by the caller

The Where HOF applies the given function repeatedly to every element in the collection. HOFs can be designed to conditionally apply the function to the collection as well. I’ll leave that for you to experiment. By now you would have realized how concise and powerful Higher-order functions are.

Pure functions:

Pure functions are mathematical functions that follows the two basic principles that we discussed before — Referential transparency and Function honesty. In addition to that, Pure functions doesn’t cause any side effects. Which means, it doesn’t mutate neither global state nor input arguments. Pure functions are easy to test and reason about. Since the output is dependent only on input, the order of evaluation isn’t important. These characteristics are very vital for a program to be optimized for parallelization, Lazy evaluation and Memorization (Caching).

Consider the below example console application that multiplies a list of numbers by 2 and nicely formats that into multiplication table.

// Extensions.cs
public static class Extensions
{
public static int MultiplyBy2(this int value)❶
{
return value * 2;
}
}
// MultiplicationFormatter.cs
public static class MultpilicationFormatter
{
static int counter;

static string Counter(int val) => $"{++counter} x 2 = {val}"; ❷

public static List<string> Format(List<int> list)
=> list
.Select(Extensions.MultiplyBy2) ❸
.Select(Counter) ❸
.ToList();
}
// Program.cs
using static System.Console;
static void Main(string[] args)
{
var list = MultpilicationFormatter.Format(Enumerable.Range(1, 10).ToList());
foreach (var item in list)
{
WriteLine(item);
}
ReadLine();
}
// Output
1 x 2 = 2
2 x 2 = 4
3 x 2 = 6
4 x 2 = 8
5 x 2 = 10
6 x 2 = 12
7 x 2 = 14
8 x 2 = 16
9 x 2 = 18
10 x 2 = 20

❶ A pure function

❷ An impure function as it mutates the global state

❸ Pure and impure functions can be applied similarly

The above code executes well without any issue as we are operating only on 10 numbers. What if we want to do the same operation for bigger set of data especially when the processing is CPU intense? Would it make sense to do the multiplications in parallel? That means the sequence of data can be processed independently.

Certainly, we can use the power of Parallel LINQ (PLINQ) to get parallelization for almost free. How can we achieve this? Simply by adding AsParallel to the list

public static List<string> Format(List<int> list)
=> list.AsParallel()
.Select(Extensions.MultiplyBy2)
.Select(Counter)
.ToList();

But hold on, pure and impure functions don’t parallelize well together.

What do I mean by that? As you know that Counter function is an impure function. If you have some experience in multi-threading, this will be familiar to you. The parallel version will have multiple threads reading and updating the counter and there’s no locking in place. The program will end up losing some updates which will lead to incorrect counter results like below

1 x 2 = 2
2 x 2 = 4
7 x 2 = 6
8 x 2 = 8
6 x 2 = 10
4 x 2 = 12
9 x 2 = 14
10 x 2 = 16
5 x 2 = 18
3 x 2 = 20

Avoiding state mutation: One possible way to fix this is to avoid state mutation and running counter. Can we think of a solution to generate a list of counters that we need and map them from the given list of items? Let’s see how:

using static System.Linq.ParallelEnumerable;
public static class MultpilicationFormatter
{
public static List<string> Format(List<int> list)
=> list.AsParallel()❶
.Select(Extensions.MultiplyBy2)
.Zip(Range(1,list.Count), (val, counter) => $"{counter} x 2 = {val}")❷
.ToList();
}

Using Zip and Range, we have re-written the MultiplicationFormatter. Zip can be used as an extension method, so you can write the Format method using a more fluent syntax. After this change, Format method is now pure. Making it parallel is just a cherry on top of the cake ❶.This is almost identical to the sequential version except ❶ and ❷

Of course, it is not always as easy as this simple scenario. But the ideas that you have seen till now will deploy you into a better position to tackle such issues related to parallelism and concurrency.


In my next article, I will try to unveil some more aspects in functional programming. Stay tuned and follow me if you are interested! Kindly tap heart below to recommend this article so that others can see it.

Naveen Kumar

Written by

Software Architect | Technologist | Hobbyist Writer | Husband

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade