Functional Programming

Divyesh Bhatt
Published in The ML Classroom
3 min read · May 17, 2020


A subtle introduction

NOTE: This article introduces functional programming techniques in Python. It assumes basic knowledge of functional programming.

Introduction

In our study of data engineering, we have moved from production-grade databases and the optimization of SQL tables to data structures and algorithms, and then used these concepts to write better code. For each of these topics, we learned how it works in isolation, but not how they all fit together as a whole. We are now at the point in our discussion where we can introduce the system that ties everything together.

This piece of the data engineering puzzle is the data pipeline. A data pipeline is a sequence of tasks. Each task takes in an input, and then returns an output that is used in the next task.


def task(input):
    output = do_something(input)
    return output

While trivial, this example shows that a plain function fits the task specification: it takes in an input and returns an output. If a function can act as a task, then perhaps functions can be used to build higher-level concepts as well?
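
As a minimal sketch of that idea, tasks written as plain functions can be chained so that each output feeds the next input. The names strip_whitespace, to_upper, and run_pipeline are hypothetical, chosen only for illustration, and are not part of any library.

```python
def strip_whitespace(lines):
    # Task 1: remove leading and trailing whitespace from every line.
    return [line.strip() for line in lines]


def to_upper(lines):
    # Task 2: upper-case every line produced by the previous task.
    return [line.upper() for line in lines]


def run_pipeline(data, tasks):
    # Feed the output of each task into the next one, in order.
    for task in tasks:
        data = task(data)
    return data


result = run_pipeline(["  hello \n", "world\n"], [strip_whitespace, to_upper])
print(result)  # ['HELLO', 'WORLD']
```

Because every task has the same shape (one input, one output), tasks can be reordered or swapped without touching the others.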

In this article, I will describe a programming paradigm that may be new to you, called functional programming. We will compare it with object-oriented programming (classes, objects, and state), and show how Python lets you switch between the two. Finally, we will finish the discussion by linking functional programming with data pipelines.

Let’s run through an example of how we have been writing our programs so far.

Suppose we wanted to create a line counter class that takes in a file, reads each line, then counts the number of lines. The class could look something like the following:

class LineCounter:
    def __init__(self, filename):
        self.file = open(filename, 'r')
        self.lines = []

    def read(self):
        self.lines = [line for line in self.file]

    def count(self):
        return len(self.lines)

While not the best implementation, it does provide an insight into object-oriented design. Within the class, there are the familiar concepts of methods and properties. The properties set and retrieve the state of the object, and the methods manipulate that state.

For both these concepts to work, the object’s state must change over time. This change of state is evident in the lines property after calling the read() method. As an example, here’s how we would use this class:

# example_file.txt contains 100 lines.
lc = LineCounter('example_file.txt')
print(lc.lines)    # []
print(lc.count())  # 0
# The lc object must read the file to
# set the lines property.
lc.read()
# The `lc.lines` property has been changed.
# This is called changing the state of the lc
# object.
print(lc.lines)    # ['Hello world!\n', ...]
print(lc.count())  # 100

The ever-changing state of an object is both its blessing and its curse. To understand why a changing state can be seen as a negative, we have to introduce an alternative. The alternative is to build the line counter as a series of independent functions.
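
To make the "curse" concrete, here is a small sketch of how hidden state can surprise us. It repeats the LineCounter class so that it runs on its own, and it assumes a three-line temporary file created only for this demonstration.

```python
import os
import tempfile


class LineCounter:
    def __init__(self, filename):
        self.file = open(filename, 'r')
        self.lines = []

    def read(self):
        self.lines = [line for line in self.file]

    def count(self):
        return len(self.lines)


# Create a small three-line file so the example is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('a\nb\nc\n')
    path = tmp.name

lc = LineCounter(path)
before_read = lc.count()   # 0: nothing has been read yet
lc.read()
after_read = lc.count()    # 3: the state now holds the file's lines
lc.read()                  # the file handle is already exhausted...
after_reread = lc.count()  # 0: ...so the second read() wipes the lines

lc.file.close()
os.remove(path)
print(before_read, after_read, after_reread)  # 0 3 0
```

The same call, lc.count(), returns three different answers depending on which methods ran before it. That dependence on call history is what the functional version avoids.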

In the exercise, we’ll be using an example log file containing log lines from a web server.

Instructions:

  1. Write two functions:
     - read(): takes in a filename, reads the file, then returns a list of lines from the file.
     - count(): takes in a list, and returns its length.
  2. Call read() on the example_log.txt file, and assign the return value to an example_lines variable.
  3. Call count() on example_lines, and assign the return value to the lines_count variable.

HINT to help you:

  1. You can use the read() and count() methods of the LineCounter class above to help with the logic for your functions.

……………………………….

ANSWER:

def read(filename):
    with open(filename, 'r') as f:
        return [line for line in f]

def count(lines):
    return len(lines)

example_lines = read('example_log.txt')
lines_count = count(example_lines)
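
One property of this answer is worth checking for yourself. Since read() and count() hold no state, calling them again with the same argument gives the same result, and they can be nested directly. The sketch below assumes a hypothetical two-line temporary file standing in for example_log.txt; its contents are made up for the demonstration.

```python
import os
import tempfile


def read(filename):
    with open(filename, 'r') as f:
        return [line for line in f]


def count(lines):
    return len(lines)


# A hypothetical two-line log file, created only for this demonstration.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('GET /index.html\nGET /about.html\n')
    path = tmp.name

first = count(read(path))
second = count(read(path))  # same input, same output: no hidden state
os.remove(path)
print(first, second)  # 2 2
```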

LET ME KNOW IF YOU GOT THE SAME ANSWER!
