Running the Luigi scheduler

By Chaayagirimon
Published in featurepreneur, Jan 6, 2022
Credits: https://luigi.readthedocs.io/

Luigi's --local-scheduler option is meant mainly for development. The central scheduler, on the other hand, lets us visualize tasks and makes sure that two instances of the same task are not running simultaneously.
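To make the "no two instances of the same task" idea concrete, here is a toy model in plain Python. It is only an illustration under simple assumptions, not how Luigi's scheduler is actually implemented; the class ToyCentralScheduler, its methods, and the worker names are all hypothetical.

```python
# Toy model only: this is NOT Luigi's scheduler code.
# All names here (ToyCentralScheduler, request_run, mark_done) are hypothetical.
class ToyCentralScheduler:
    """Tracks which task ids are currently claimed by a worker."""

    def __init__(self):
        self._running = {}  # task_id -> name of the worker running it

    def request_run(self, task_id, worker):
        """Return True if `worker` may run `task_id`, False if it is already claimed."""
        if task_id in self._running:
            return False
        self._running[task_id] = worker
        return True

    def mark_done(self, task_id):
        """Release the task so it no longer counts as running."""
        self._running.pop(task_id, None)


scheduler = ToyCentralScheduler()
first = scheduler.request_run("CountLetters()", worker="worker-1")
second = scheduler.request_run("CountLetters()", worker="worker-2")
print(first, second)  # True False
```

Because every worker asks the same shared process before starting, the second request for an identical task is turned away. A local scheduler lives inside a single process, so it cannot coordinate separate runs in this way.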

First, create a sample program:

import luigi


class GenerateWords(luigi.Task):

    def output(self):
        return luigi.LocalTarget('words.txt')

    def run(self):
        # write a dummy list of words to output file
        words = ['cat', 'tiger', 'wolf']
        with self.output().open('w') as f:
            for word in words:
                f.write('{word}\n'.format(word=word))


class CountLetters(luigi.Task):

    def requires(self):
        return GenerateWords()

    def output(self):
        return luigi.LocalTarget('letter_counts.txt')

    def run(self):
        # read in file as list
        with self.input().open('r') as infile:
            words = infile.read().splitlines()

        # write each word to output file with its corresponding letter count
        with self.output().open('w') as outfile:
            for word in words:
                outfile.write('{word} | {letter_count}\n'.format(
                    word=word,
                    letter_count=len(word)))


if __name__ == '__main__':
    luigi.build([CountLetters()])

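This is a basic program: GenerateWords writes a short list of words to words.txt, and CountLetters reads that file and writes each word with its letter count to letter_counts.txt.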
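For reference, the lines that CountLetters writes can be reproduced without Luigi. The sketch below applies the same format string to the same three words; the helper name letter_count_lines is made up for this illustration and is not part of the program above.

```python
def letter_count_lines(words):
    """Format each word the same way CountLetters.run() does (hypothetical helper)."""
    return ['{word} | {letter_count}'.format(word=word, letter_count=len(word))
            for word in words]


lines = letter_count_lines(['cat', 'tiger', 'wolf'])
print('\n'.join(lines))
# cat | 3
# tiger | 5
# wolf | 4
```

After a successful run, letter_counts.txt should contain these three lines.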

To start the central scheduler, run:

luigid

Then run the program as usual while luigid is still running. Make sure the run command no longer includes --local-scheduler.
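If you are unsure whether luigid is actually up before launching the program, a quick TCP check can help. This is a minimal sketch using only the standard library. It assumes the scheduler listens on localhost port 8082 (the address used in this article), and the function name scheduler_is_reachable is made up for this example. It only confirms that something accepts connections on that port, not that it is really Luigi.

```python
import socket


def scheduler_is_reachable(host="localhost", port=8082, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or the host could not be resolved.
        return False


if __name__ == "__main__":
    if scheduler_is_reachable():
        print("Something is listening on localhost:8082; luigid appears to be up.")
    else:
        print("Nothing is listening on localhost:8082; start luigid first.")
```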

Output screen of the program:

Open the scheduler at http://localhost:8082 in a browser.

You will now be able to see the central scheduler:

You can also view dependency graphs:

As pipelines get more complex, the scheduler becomes very helpful for visualization: it shows the total number of tasks, which tasks are running, which have failed, and the dependencies between tasks.
