Upskill tutorial for Refactoring argparse

Hud Wahab
4 min read · Jul 21, 2023

Day 18: argparse. StrEnum. Auto. JSON.

Hi 👋 I am Hud, a postdoc for engineering data science at the AI Manufacturing Center in Laramie, Wyoming. My funding is running out (AaAaaA !), so while I am actively looking for a new job, instead of doing the 205th coding certificate to prove my worthiness — I thought I’d do design challenges and document how I spend my time upskilling so other engineers can do the same.

Nowadays, certificates are everywhere. Documenting small upskill projects that you can later show off is the best way to get recognition as a professional engineer.

This is day 18 of a 30-day design challenge. Follow along and let me know if you get stuck!

TL;DR tasks

Download the provided code for the challenge.

Task 1: Identify duplication in the CLI code

  • Review the existing code in the main.py file and identify areas where there is duplication in the command-line interface code.
  • Note the common patterns or functionalities that can be extracted to improve the code structure.

Task 2: Extract common functionalities

  • Identify the common functionalities used in the CLI code, such as parsing arguments, handling commands, and displaying outputs.
  • Create separate functions for these common functionalities to reduce duplication.

Task 3: Refactor main function

  • Modify the main function in the main.py file to be a high-level function responsible for handling the command-line interface flow.
  • Delegate specific tasks to the newly created functions, streamlining the main function.

Task 4: Implement multilingual support

  • Analyze the code for the CLI to identify all English text and messages.
  • Replace the hard-coded English text with language-specific messages and translations.

Task 5: Separate language files

  • Create separate language files or modules for each supported language (e.g., English and Dutch).
  • Store the translated messages in these language files for easy maintenance and extension.

Task 6: Implement language selection

  • Add functionality to the CLI to allow the user to select their desired language.
  • Update the code to use the chosen language file for displaying messages and outputs.

Task 7: Test the CLI functionality

  • Run the refactored code and verify that the command-line interface functions correctly.
  • Test the multilingual support by switching between different language options.

Task 8: Document the refactoring process

  • Add comments or documentation to explain the changes made and the reasons behind them.
  • Describe how the duplication was eliminated and how the code was made more maintainable and extensible.
  • Document any additional considerations or modifications made during the refactoring process.

The problem

Every scientist's major problem is a supersized main function.

The challenge's CLI parser has a lot of duplicated code for handling the different weather conditions. A long if-else block checks the value of the condition argument and calls the matching function to retrieve the weather information. This code is not modular and is difficult to maintain and extend: every new weather condition means another branch in the main function. And if you wanted the CLI to display another language (for example, German instead of English), that's a nightmare, because every message is hard-coded!

So how can we improve this argparse code?

Let’s have a look:

The refactored code, on the other hand, solves the duplication problem by moving the argparse setup out of main. It defines a function construct_parser that creates an argument parser with the expected arguments and options. The condition argument is defined with a list of choices that includes all the possible weather conditions, so argparse itself rejects anything else. This makes the code more modular and easier to maintain and extend. If a new weather condition is added, it only needs to be added to the list of choices in the construct_parser function.
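A minimal sketch of what such a construct_parser function could look like. The argument names (city, --condition), the help texts, and the extra "all" choice are assumptions for illustration; only the function name and the idea of a choices list come from the description above.

```python
import argparse

# Assumed list of accepted values; "all" is a hypothetical convenience choice.
CONDITION_CHOICES = ["temperature", "humidity", "wind", "all"]


def construct_parser() -> argparse.ArgumentParser:
    """Build the parser in one place so main() does not have to."""
    parser = argparse.ArgumentParser(description="Get the current weather.")
    parser.add_argument("city", help="Name of the city")
    parser.add_argument(
        "-c",
        "--condition",
        choices=CONDITION_CHOICES,  # argparse rejects any other value
        default="all",
        help="Which weather condition to show",
    )
    return parser
```

With this in place, construct_parser().parse_args(["Laramie", "--condition", "wind"]) yields a namespace whose condition attribute is "wind", and an unsupported value makes argparse exit with a usage error instead of reaching an else branch.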

The fetch_conditions_from_args function uses a dictionary, texts, to map the possible values of arg_condition to the corresponding Condition objects. This lookup replaces the long if-else block: supporting a new weather condition means adding an entry rather than another branch.

You might be wondering how the Condition class is formed. Let's have a look: the code defines an enumeration class Condition that inherits from StrEnum. StrEnum was added to the enum module in Python 3.11. Its members are also strings, so they can be compared with and used like ordinary str values.

The Condition enumeration class has three members: TEMPERATURE, HUMIDITY, and WIND. These members are created using the auto() function. In a StrEnum, auto() assigns each member its own name in lowercase as the value, so Condition.TEMPERATURE has the value "temperature".

Now compare the big fat main function with the refactored solution. We load the keys and values for texts from a separate JSON file, construct the parser, parse the arguments, fetch the requested conditions, and print them. Clean and simple!

What might that JSON look like though?

The values contain format-string templates with placeholders such as {humidity:.1f}, which are filled in later with str.format, as well as lists of the choices accepted for the conditions.
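The original JSON file is not shown here, so the snippet below is only a guess at its shape. The key names and message texts are hypothetical; the {humidity:.1f} placeholder is the one mentioned above.

```python
import json

# Hypothetical excerpt of an English texts file; the real keys may differ.
TEXTS_JSON = """
{
    "condition_choices": ["temperature", "humidity", "wind", "all"],
    "humidity": "The humidity is {humidity:.1f} %."
}
"""

texts = json.loads(TEXTS_JSON)

# The template is an ordinary string until str.format fills the placeholder;
# ".1f" rounds the value to one decimal place.
print(texts["humidity"].format(humidity=41.236))  # The humidity is 41.2 %.
print(texts["condition_choices"])
```

Keeping the number formatting inside the template means a translation can also change units or decimal precision without touching the Python code.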

This way, if we did want to translate this to Icelandic or Arabic, we can simply provide a new texts_iceland.json file and that’s it!
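As a sketch of how the language switch could work, the helper below picks a file by language name. The function load_texts and the texts_<language>.json naming scheme are assumptions, extrapolated from the texts_iceland.json example.

```python
import json
from pathlib import Path


def load_texts(language: str, directory: Path = Path(".")) -> dict:
    """Load texts_<language>.json from the given directory (assumed naming)."""
    path = directory / f"texts_{language}.json"
    with path.open(encoding="utf-8") as file:
        return json.load(file)
```

A call such as load_texts("iceland") would then read texts_iceland.json from the current directory, and the rest of the CLI keeps using the returned dictionary exactly as before. The language name itself could come from one more parser argument.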

Conclusion

Congratulations! You finished Day 18 from the 30-day design challenge.

If you have reached this far, you know how to:

  • Refactor long argparse code
  • Use StrEnum and auto to automatically enumerate objects
  • Separate the CLI texts from the parsing code with a JSON file

Check out the day 19 challenge!

Also, you can access the full 30-day GitHub repository here.

💡 My goal here is to help engineering data scientists upskill in design. I’d like to hear from you! Was this helpful? Anything I can improve? Connect with me on LinkedIn | Medium
