How to follow a file in Python (tail -f in Python)

Saurabh AV
2 min read · Mar 20, 2020


In this blog post, we see how we can create a simple version of tail -f file in Python.

What are we doing?

We want to read a file using Python and keep reading it indefinitely. In other words, we want to ‘follow’ a file. Essentially, we want to emulate what the UNIX command tail -f file does.

`tail -f` is a command widely used when monitoring server logs

We are reading an “infinite stream” of data. Here are a few things to keep in mind:

  • we want to constantly watch the file and yield lines as soon as they are written to it
  • we don’t know in advance how much data will be written
  • log files can be enormous, so re-reading the entire file every time to look for updates is out of the question
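The last point is why a follower usually jumps straight to the end of the file instead of reading it from the start. Here is a minimal sketch of that idea, using a throwaway temporary file as a stand-in for a log (the file contents are made up for illustration):

```python
import os
import tempfile

# Create a throwaway "log" with some existing content (illustrative data).
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("old line 1\nold line 2\n")
    path = tmp.name

with open(path, "r") as f:
    # Jump to the end without reading any of the existing content.
    f.seek(0, os.SEEK_END)
    # Nothing new has been written yet, so readline() returns an empty string.
    nothing_yet = f.readline()

    # Simulate another process appending to the log.
    with open(path, "a") as writer:
        writer.write("new line\n")

    # The same handle now picks up only the freshly appended line.
    new_line = f.readline()

os.remove(path)
print(repr(nothing_yet), repr(new_line))
```

However large the existing file is, the reader only ever touches what is appended after the seek.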

How are we doing this?

We will write a simple Python script that uses Pythonic concepts such as generators.

Disclaimer: in a real-world production scenario, using Python for something like this is probably a bad idea because it does not scale well. We’re doing this just for fun.

Let’s see some code

import os
import time


def follow(thefile):
    '''Generator function that yields new lines as they are appended to a file.'''
    # seek to the end of the file so existing content is skipped
    thefile.seek(0, os.SEEK_END)

    # start infinite loop
    while True:
        # try to read the next line
        line = thefile.readline()
        # sleep briefly if nothing new has been written yet
        if not line:
            time.sleep(0.1)
            continue

        yield line


if __name__ == '__main__':
    with open("run/foo/access-log", "r") as logfile:
        loglines = follow(logfile)
        # iterate over the generator
        for line in loglines:
            # each line already ends with a newline
            print(line, end='')

What’s happening here:

  • we create a follow function that accepts an open file and yields (rather than returns) a sequence of lines
  • we iterate over the generator and keep printing new lines as they are written to the file
  • an infinite loop runs inside the follow generator function, so it keeps waiting for new lines instead of stopping at the end of the file
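Because the loop never ends on its own, it can be handy to try the idea on a throwaway file and take only a fixed number of lines. The sketch below is one way to do that, not the only one: `append_lines` is a hypothetical helper that plays the role of a process writing to the log, the `poll_interval` parameter is an addition for the demo, and `itertools.islice` stops the consumer after three lines.

```python
import itertools
import os
import tempfile
import threading
import time


def follow(thefile, poll_interval=0.1):
    """Yield lines appended to an open file (same idea as follow() above)."""
    thefile.seek(0, os.SEEK_END)
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(poll_interval)
            continue
        yield line


def append_lines(path, count, delay=0.1):
    """Hypothetical helper: append `count` lines to `path`, pausing before each write."""
    with open(path, "a") as f:
        for i in range(count):
            # Pause first, so the follower has time to seek to the end of the file.
            time.sleep(delay)
            f.write(f"request {i}\n")
            f.flush()


# Throwaway file standing in for a real log.
fd, path = tempfile.mkstemp(suffix=".log")
os.close(fd)

with open(path, "r") as logfile:
    writer = threading.Thread(target=append_lines, args=(path, 3))
    writer.start()
    # Take exactly three lines, then stop; a plain for loop would never end.
    received = list(itertools.islice(follow(logfile, poll_interval=0.01), 3))
    writer.join()

os.remove(path)
print(received)
```

Note that the seek to the end of the file only happens on the first iteration step, not when `follow(logfile)` is called, which is why the writer pauses before its first write.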

A bit on generators

Here, follow is a special type of function called a generator. What happens under the hood:

  • when the generator function is called (`loglines = follow(logfile)`), its body does not run yet; a generator object is returned instead (it holds the state of the function)
  • the function body actually runs when we iterate over that generator object
  • each iteration step calls the generator's __next__() method, which runs the function until the next yield, hands back the yielded value, and pauses again
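These steps can be observed directly with a small experiment. In the sketch below, `countdown` and the `events` list are hypothetical names used only to record when the function body actually runs:

```python
events = []


def countdown(n):
    """Hypothetical generator used only to illustrate lazy execution."""
    events.append("body started")
    while n > 0:
        yield n
        n -= 1
    events.append("body finished")


gen = countdown(2)                     # returns a generator object; the body has not run
events.append("generator object created")

first = next(gen)                      # calls gen.__next__(): runs up to the first yield
second = next(gen)                     # resumes after the yield, runs to the next one
rest = list(gen)                       # exhausts the generator; the body runs to the end

print(first, second, rest)
print(events)
```

The "generator object created" entry is recorded before "body started", which shows that calling the function only builds the generator and that iteration is what drives it.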

To sum up: generator functions return generator objects, which produce their values lazily as they are iterated over and are typically consumed in loops.

Conclusion

Infinite streams are tricky, generators are fun and Python is handy!

Here are some great resources on generators:
