Refactoring Python Code for Machine Learning Projects. Python “Spaghetti Code” Everywhere!

Ernest Bonat, Ph.D.
13 min read · Jun 18, 2018

Update Apr/27/2024

1. Python Top-Down Programming Style
2. Using Procedure Programming (“the Pythonic way”) Instead of Object-Oriented Programming Design and Implementation
3. No Error Handling Implementation
4. Use Outdated Python Code and Syntax from Previous Versions
5. Hardcode of Default Numerical and String Parameters including Machine Learning Hyperparameters Model
6. Code Comments are not Provided at All, Especially Docstring Comments for Module, Function, Class, or Method Definitions
7. Not Following the Python Naming Convention Standards Provided in “PEP 8 – Style Guide for Python Code”
8. Programs Unit Tests are not Implemented at All
9. Developer Program Documentation not Provided
10. Many Python Programmers Don’t Know the Latest Libraries and Frameworks Developed for Python Today
11. Best Practices to Untangle “Spaghetti Code”

A couple of weeks ago I was working on an image processing project with the OpenCV library (CV stands for Computer Vision) using C# and Python. Right away, after a couple of minutes of googling, I found many links to terrible C# and Python code. I was very surprised. There was a time when all my Application Developer friends were saying “Google is my friend”. Now some of them are saying “be careful with Google”. It’s very sad to see how the sharing of programming information has changed in the last few years. Bad “spaghetti code” is all over the place these days, so it can take a long time to find the right piece of code for my projects. Can we say that Google has become a “spaghetti code” collector? I hope not!

I will be using Python code examples to explain my point. Below are the main topics that will be covered:

1. Python Top-Down programming style
2. Using Procedure Programming (“the Pythonic way”) instead of Object-Oriented Programming design and implementation
3. No error handling implementation
4. Use outdated Python code and syntax from previous versions
5. Hardcode of default numerical and string parameters including Machine Learning hyperparameters model
6. Code comments are not provided at all, especially Docstring comments for module, function, class, or method definitions
7. Not following the Python naming convention standards provided in PEP 8 – Style Guide for Python Code
8. Programs unit tests are not implemented at all
9. Developer program documentation not provided
10. Many Python programmers don’t know the latest libraries and frameworks developed for Python today

1. Python Top-Down Programming Style

In the 80’s I wrote computer programs in QBasic using the top-down programming style. At the time there were not as many programming design patterns and frameworks as we have today. With the Object-Oriented Programming (OOP) model, the MVC framework and design patterns, we can write professional production Python programs today. Why do so many Python developers still use the top-down programming style to write their programs? Maybe because Data Science researchers and developers do not understand the Software Development Life Cycle and its impact on future code maintenance. If this is the case, I would recommend that they take some Computer Science courses and tutorials before writing any programs at all!

2. Using Procedure Programming (“the Pythonic way”) Instead of Object-Oriented Programming Design and Implementation

Many Python developers say that Procedural Programming is the right “Pythonic” way of writing Python programs. This is simply an incompetent and false statement, in my opinion. Software design and development has nothing to do with the specific programming language used. The design and development of any computer program is based on known software engineering rules and procedures that we need to follow. When we write computer programs in, for example, Java, C# or C++, we apply the same software engineering guides and rules to each one. I really don’t believe that there should be a specific programming pattern for the Python language named “Pythonic”. I have seen many Python library files with many functions that have different names and parameters and practically the same body code. What a waste of code, and what confusion for everyone! Many of these functions can be easily encapsulated in classes, and the beauty of OOP can be applied to them. This will considerably reduce their size and make future updates, maintenance and, of course, unit tests easier. Every Python developer should know the advantages of OOP implementation and use it all the time!
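As a minimal sketch of the point above, here are two near-identical functions and one class that replaces them. All names (scale_train_pixels, scale_test_pixels, PixelScaler) are hypothetical and chosen for illustration only; they do not come from any real library.

```python
# Hypothetical "before": two functions with different names but the same body.
def scale_train_pixels(values, minimum=0, maximum=255):
    return [(v - minimum) / (maximum - minimum) for v in values]


def scale_test_pixels(values, minimum=0, maximum=255):
    return [(v - minimum) / (maximum - minimum) for v in values]


# Hypothetical "after": one class owns the shared logic and its settings.
class PixelScaler:
    """Min-max scaler for pixel intensities (illustrative only)."""

    def __init__(self, minimum=0, maximum=255):
        if maximum <= minimum:
            raise ValueError("maximum must be greater than minimum")
        self.minimum = minimum
        self.maximum = maximum

    def scale(self, values):
        """Return the values rescaled to the range [0, 1]."""
        span = self.maximum - self.minimum
        return [(v - self.minimum) / span for v in values]


scaler = PixelScaler()
train_scaled = scaler.scale([0, 51, 255])
test_scaled = scaler.scale([255, 0])
print(train_scaled, test_scaled)
```

With the class, the range settings are validated once and a change to the scaling rule happens in a single place.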

Many Data Scientists write Python programs with absolutely no software development knowledge or experience. We know they need to write Python programs for their research. The problem is that they don’t have the required software development knowledge and skills to do it. In this case, they have two options: 1. Take some qualified IT classes about the Software Development Life Cycle (SDLC) and learn how to design and develop programs properly, or 2. Write programs using whatever they know and/or find on Google. In general, the second approach has a very high probability of producing “spaghetti code”. Maybe they want to be part of the following two “very nice” clubs?

1. “Technical Debt”

2. “Scientific Debt”

For some reason, many people think if they write a program and it works, then everything is completely fine. They fail to consider the impact this programming has on future updates and maintenance requirements. Here is one of my best quotes:

“A bad computer programmer writes code to only run the program, a good computer programmer writes code to run the program that can be easily updated later” — Ernest Bonat, Ph.D.

I hope you get this point!

3. No Error Handling Implementation

If any of the implemented methods crashes, the program will crash too. Don’t you want to log your programs’ errors (exceptions) for future auditing and business analysis? The answer to this question should be yes. In 2008, when I was working on a localization project and learning Python at the same time, I spent quite a bit of time figuring out the best solution to catch errors in Python.
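One way to keep a record of exceptions for later auditing is the standard library logging module. This is a minimal sketch, not the approach developed in the rest of this section; the log file location, the logger name "ml_pipeline" and the function safe_divide() are all assumptions made for illustration.

```python
import logging
import os
import tempfile

# Hypothetical log location; a real project would take this from its settings.
LOG_FILE_PATH = os.path.join(tempfile.gettempdir(), "ml_pipeline_errors.log")

logger = logging.getLogger("ml_pipeline")
logger.setLevel(logging.ERROR)
handler = logging.FileHandler(LOG_FILE_PATH, mode="w", encoding="utf-8")
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
logger.addHandler(handler)


def safe_divide(numerator, denominator):
    """Divide two numbers; log the full traceback and return None on failure."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # logger.exception() logs at ERROR level and appends the traceback.
        logger.exception("Division failed for %r / %r", numerator, denominator)
        return None


result = safe_divide(1, 0)
handler.flush()
```

After this runs, the log file holds a time-stamped entry with the message and the traceback of the ZeroDivisionError.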

Let’s look at a common task in a Machine Learning project pipeline: tuning model hyperparameters. Suppose you’re working on an image classification project and you would like to try a Support Vector Machine classifier (SVC()) and an Artificial Neural Network (ANN) with a Multi-layer Perceptron classifier (MLPClassifier()). To tune the hyperparameters we’ll use the GridSearchCV() method from the scikit-learn Machine Learning framework for each one.

from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# X_train and y_train are assumed to hold the prepared training data

# svc model
ml_model = SVC()
hyper_parameter_candidates = {"C": [1e-4, 1e-2, 1, 1e2, 1e4],
                              "gamma": [1e-3, 1e-2, 1, 1e2, 1e3],
                              "class_weight": [None, "balanced"],
                              "kernel": ["linear", "poly", "rbf", "sigmoid"]}
scoring_parameter = "accuracy"
cv_fold = KFold(n_splits=5, shuffle=True, random_state=1)
classifier_model = GridSearchCV(estimator=ml_model,
                                param_grid=hyper_parameter_candidates,
                                scoring=scoring_parameter, cv=cv_fold)
classifier_model.fit(X_train, y_train)

# ann model
ml_model = MLPClassifier()
# (20,) is a one-element tuple: one hidden layer with 20 units
hyper_parameter_candidates = {"hidden_layer_sizes": [(20,), (50,), (100,)],
                              "max_iter": [500, 800, 1000],
                              "activation": ["identity", "logistic", "tanh", "relu"],
                              "solver": ["lbfgs", "sgd", "adam"]}
scoring_parameter = "accuracy"
cv_fold = KFold(n_splits=5, shuffle=True, random_state=1)
classifier_model = GridSearchCV(estimator=ml_model,
                                param_grid=hyper_parameter_candidates,
                                scoring=scoring_parameter, cv=cv_fold)
classifier_model.fit(X_train, y_train)

Based on the code above, I could write a simple function as:

def tune_hyperparameter_model(ml_model, X_train, y_train, hyper_parameter_candidates,
                              scoring_parameter, cv_fold):
    classifier_model = GridSearchCV(estimator=ml_model,
                                    param_grid=hyper_parameter_candidates,
                                    scoring=scoring_parameter, cv=cv_fold)
    classifier_model.fit(X_train, y_train)
    return classifier_model

Let’s add some error handling code:

import sys

def tune_hyperparameter_model(ml_model, X_train, y_train, hyper_parameter_candidates,
                              scoring_parameter, cv_fold):
    # returned as None if the search fails before a model is created
    classifier_model = None
    try:
        classifier_model = GridSearchCV(estimator=ml_model,
                                        param_grid=hyper_parameter_candidates,
                                        scoring=scoring_parameter, cv=cv_fold)
        classifier_model.fit(X_train, y_train)
    except Exception:
        # sys.exc_info()[0] is the type of the exception being handled
        exception_message = sys.exc_info()[0]
        print("An error occurred. {}".format(exception_message))
    return classifier_model

As you can see, the type of any error raised in the function body is stored in the variable exception_message and printed. I don’t want to write this exception block code for every function I write.

exception_message = sys.exc_info()[0]
print("An error occurred. {}".format(exception_message))

With some planning, I should be able to write a better generic function print_exception_message() to print my exception messages and use it for all my functions.

import sys
import time
import traceback

def print_exception_message(message_orientation="horizontal"):
    """
    print full exception message
    :param message_orientation: horizontal or vertical
    :return None
    """
    try:
        exc_type, exc_value, exc_tb = sys.exc_info()
        # the last traceback entry is the frame where the error was raised
        file_name, line_number, procedure_name, line_code = \
            traceback.extract_tb(exc_tb)[-1]
        time_stamp = " [Time Stamp]: " + str(time.strftime("%Y-%m-%d %I:%M:%S %p"))
        file_name = " [File Name]: " + str(file_name)
        procedure_name = " [Procedure Name]: " + str(procedure_name)
        error_message = " [Error Message]: " + str(exc_value)
        error_type = " [Error Type]: " + str(exc_type)
        line_number = " [Line Number]: " + str(line_number)
        line_code = " [Line Code]: " + str(line_code)
        if message_orientation == "horizontal":
            print("An error occurred:{};{};{};{};{};{};{}".format(
                time_stamp, file_name, procedure_name, error_message,
                error_type, line_number, line_code))
        elif message_orientation == "vertical":
            print("An error occurred:\n{}\n{}\n{}\n{}\n{}\n{}\n{}".format(
                time_stamp, file_name, procedure_name, error_message,
                error_type, line_number, line_code))
        else:
            pass
    except Exception:
        exception_message = sys.exc_info()[0]
        print("An error occurred. {}".format(exception_message))

If we implement this function in our code, we’ll use a simple line of code in the exception block only.

def tune_hyperparameter_model(ml_model, X_train, y_train, hyper_parameter_candidates,
                              scoring_parameter, cv_fold):
    # returned as None if the search fails before a model is created
    classifier_model = None
    try:
        classifier_model = GridSearchCV(estimator=ml_model,
                                        param_grid=hyper_parameter_candidates,
                                        scoring=scoring_parameter, cv=cv_fold)
        classifier_model.fit(X_train, y_train)
    except Exception:
        print_exception_message()
    return classifier_model

In our case, for the MLPClassifier() model, we’ll have:

ml_model = MLPClassifier()
# (20,) is a one-element tuple: one hidden layer with 20 units
hyper_parameter_candidates = {"hidden_layer_sizes": [(20,), (50,), (100,)],
                              "max_iter": [500, 800, 1000],
                              "activation": ["identity", "logistic", "tanh", "relu"],
                              "solver": ["lbfgs", "sgd", "adam"]}
scoring_parameter = "accuracy"
cv_fold = KFold(n_splits=5, shuffle=True, random_state=1)
classifier_model = tune_hyperparameter_model(ml_model, X_train, y_train,
                                             hyper_parameter_candidates,
                                             scoring_parameter, cv_fold)

We’ll continue updating the function tune_hyperparameter_model() code in the next topics.
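The wish not to repeat the exception block in every function can also be met with a decorator. The sketch below is one possible alternative, under the assumption that printing the traceback and returning None is acceptable behavior; report_exceptions() and parse_fold_count() are hypothetical names used only for illustration.

```python
import functools
import traceback


def report_exceptions(function):
    """Wrap a function so that any exception is printed and None is returned."""
    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        try:
            return function(*args, **kwargs)
        except Exception:
            # Report the failure once, in one place, with the full traceback.
            print("An error occurred in {}():".format(function.__name__))
            traceback.print_exc()
            return None
    return wrapper


@report_exceptions
def parse_fold_count(text):
    """Convert a text setting such as "5" into an integer fold count."""
    return int(text)


print(parse_fold_count("5"))
print(parse_fold_count("five"))
```

The trade-off is that callers must check for None. In a pipeline where a failure should stop the run, re-raising the exception after reporting it may be the safer choice.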

4. Use Outdated Python Code and Syntax from Previous Versions

Old Python code is everywhere! Lazy Python programmers don’t want to spend the time to find out and learn the new Python code updates in the latest releases. Here is a good simple example:

Old code:

print("Hello Python!" + " - a good programming language!")

New code:

print("{}{}".format("Hello Python!", " - a good programming language!"))

Both statements produce the same results:

Hello Python! - a good programming language!

If you go to an interview for a Python Application Developer position, would you like to show the newest or the oldest Python code? I don’t think you will pass the interview if you use old Python coding. So, spend some time and always read the “What’s New in Python” notes for each release.
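Since Python 3.6 there is a third option, formatted string literals (f-strings). A short sketch comparing the three styles; the variable names are illustrative.

```python
greeting = "Hello Python!"
description = " - a good programming language!"

# Concatenation, str.format() and an f-string all build the same text.
concatenated = greeting + description
formatted = "{}{}".format(greeting, description)
interpolated = f"{greeting}{description}"

print(interpolated)
```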

5. Hardcode of Default Numerical and String Parameters including Machine Learning Hyperparameters Model

Many default numerical and string values have been hardcoded in Python programs everywhere. Why? How many times has a Computer Science college professor told you: “Don’t hardcode anything in any programming language”? The same issue appears in many Machine Learning online papers, where the model hyperparameters have been hardcoded everywhere by lazy Data Scientists. A simple way to fix these issues is to put all the default numerical and string values in a simple configuration file (config.py). For security purposes, any config file can be encrypted as necessary. Let me update our function tune_hyperparameter_model() to handle both the GridSearchCV() and RandomizedSearchCV() methods.

from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

def tune_hyperparameter_model(ml_model, X_train, y_train, hyper_parameter_candidates,
                              scoring_parameter, cv_fold, search_cv_type="grid"):
    # stays None for an unknown search type or a failed search
    classifier_model = None
    try:
        if search_cv_type == "grid":
            classifier_model = GridSearchCV(estimator=ml_model,
                                            param_grid=hyper_parameter_candidates,
                                            scoring=scoring_parameter, cv=cv_fold)
        elif search_cv_type == "randomized":
            classifier_model = RandomizedSearchCV(estimator=ml_model,
                                                  param_distributions=hyper_parameter_candidates,
                                                  scoring=scoring_parameter, cv=cv_fold)
        classifier_model.fit(X_train, y_train)
    except Exception:
        print_exception_message()
    return classifier_model

As you can see the input parameter search_cv_type is optional and equal to “grid”. If we create a config.py file with the following two lines of code:

GRID_SEARCH_CV="grid"
RANDOMIZED_SEARCH_CV="randomized"

We can update our function (import config statement is required).

import config

def tune_hyperparameter_model(ml_model, X_train, y_train, hyper_parameter_candidates,
                              scoring_parameter, cv_fold,
                              search_cv_type=config.GRID_SEARCH_CV):
    # stays None for an unknown search type or a failed search
    classifier_model = None
    try:
        if search_cv_type == config.GRID_SEARCH_CV:
            classifier_model = GridSearchCV(estimator=ml_model,
                                            param_grid=hyper_parameter_candidates,
                                            scoring=scoring_parameter, cv=cv_fold)
        elif search_cv_type == config.RANDOMIZED_SEARCH_CV:
            classifier_model = RandomizedSearchCV(estimator=ml_model,
                                                  param_distributions=hyper_parameter_candidates,
                                                  scoring=scoring_parameter, cv=cv_fold)
        classifier_model.fit(X_train, y_train)
    except Exception:
        print_exception_message()
    return classifier_model

Now we have a nice generic function that can be used everywhere, and we can change its logic by changing the config.py file only. Nothing has been hardcoded here! If you interviewed with me for a Python Developer position and you showed me code with hardcoded numerical and string values, I’d send you home to watch the Disney channel in a minute! No questions and no apologies!
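The same idea extends to the hyperparameter candidates themselves. The sketch below assumes a hypothetical HYPER_PARAMETER_CANDIDATES dictionary that would live in config.py (it is inlined here so the example runs on its own) and a hypothetical lookup helper get_hyper_parameter_candidates(). The candidate values repeat the ones used earlier in this post.

```python
# These settings would normally live in config.py; they are inlined here
# so the sketch is self-contained. All names are hypothetical.
SVC_MODEL = "svc"
MLP_MODEL = "mlp"
HYPER_PARAMETER_CANDIDATES = {
    SVC_MODEL: {
        "C": [1e-4, 1e-2, 1, 1e2, 1e4],
        "gamma": [1e-3, 1e-2, 1, 1e2, 1e3],
        "class_weight": [None, "balanced"],
        "kernel": ["linear", "poly", "rbf", "sigmoid"],
    },
    MLP_MODEL: {
        "hidden_layer_sizes": [(20,), (50,), (100,)],
        "max_iter": [500, 800, 1000],
        "activation": ["identity", "logistic", "tanh", "relu"],
        "solver": ["lbfgs", "sgd", "adam"],
    },
}


def get_hyper_parameter_candidates(model_name):
    """Return the candidate grid for a model name, or fail with a clear error."""
    try:
        return HYPER_PARAMETER_CANDIDATES[model_name]
    except KeyError:
        known = ", ".join(sorted(HYPER_PARAMETER_CANDIDATES))
        raise ValueError(
            "Unknown model {!r}; expected one of: {}".format(model_name, known)
        ) from None


candidates = get_hyper_parameter_candidates(MLP_MODEL)
print(candidates["solver"])
```

With this layout, trying a new value for C or adding a solver is a one-line change in the configuration, and the tuning code stays untouched.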

6. Code Comments are not Provided at All, Especially Docstring Comments for Module, Function, Class, or Method Definitions

Missing code comments are the most common issue I see in applications today. It is very weird to me when a program has no comments at all, or only a few. It’s as if the developer writes code only for themselves and does not care about other developers who may need to modify it in the future. What do we do with a Python developer who doesn’t write comments in their programs? Fire them right away? Excellent question!

The Python programming language has a specific standard way of writing comments for classes and functions. It is called a docstring: “A docstring is a string literal that occurs as the first statement in a module, function, class, or method definition. Such a docstring becomes the __doc__ special attribute of that object”.
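A small sketch of what that definition means in practice. The function count_columns() is a hypothetical example; __doc__ and inspect.getdoc() are standard Python.

```python
import inspect


def count_columns(table):
    """
    count the columns of a table given as a list of rows
    :param table: list of equally sized rows
    :return: number of columns (0 for an empty table)
    """
    return len(table[0]) if table else 0


# The docstring is stored on the function object itself.
raw_docstring = count_columns.__doc__
# inspect.getdoc() returns the same text with the indentation cleaned up.
clean_docstring = inspect.getdoc(count_columns)
print(clean_docstring)
```

The built-in help(count_columns) displays the same text, which is why a docstring is more useful than an ordinary comment for anyone calling the function.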

Let’s add the docstring to our function tune_hyperparameter_model().

def tune_hyperparameter_model(ml_model, X_train, y_train, hyper_parameter_candidates,
                              scoring_parameter, cv_fold,
                              search_cv_type=config.GRID_SEARCH_CV):
    """
    apply grid search cv and randomized search cv algorithms to
    find optimal hyperparameters model
    :param ml_model: defined machine learning model
    :param X_train: feature training data
    :param y_train: target (label) training data
    :param hyper_parameter_candidates: dictionary of
    hyperparameter candidates
    :param scoring_parameter: parameter that controls what metric
    to apply to the evaluated model
    :param cv_fold: cross-validation splitter (for example KFold)
    or number of cv folds
    :param search_cv_type: type of search cv (gridsearchcv or
    randomizedsearchcv)
    :return classifier_model: defined classifier model
    (None if the search failed)
    """
    classifier_model = None
    try:
        if search_cv_type == config.GRID_SEARCH_CV:
            classifier_model = GridSearchCV(estimator=ml_model,
                                            param_grid=hyper_parameter_candidates,
                                            scoring=scoring_parameter, cv=cv_fold)
        elif search_cv_type == config.RANDOMIZED_SEARCH_CV:
            classifier_model = RandomizedSearchCV(estimator=ml_model,
                                                  param_distributions=hyper_parameter_candidates,
                                                  scoring=scoring_parameter, cv=cv_fold)
        classifier_model.fit(X_train, y_train)
    except Exception:
        print_exception_message()
    return classifier_model

Finally, we have a real production Python function. We should be able to include this function in a base (super) class to be reused in any Machine Learning project.

7. Not Following the Python Naming Convention Standards Provided in “PEP 8 – Style Guide for Python Code”

Every programming language has predefined naming convention rules. It’s very important to learn and apply these rules for team development and maintenance of applications. The Python naming convention standard is provided in PEP 8 – Style Guide for Python Code. Please, Python developers, apply it all the time!
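A short sketch of the main PEP 8 naming conventions: upper case with underscores for constants, CapWords for classes, and lower case with underscores for functions, methods and variables. The class AccuracyReport and its members are hypothetical names used only to show the styles.

```python
# Constant: upper case with underscores.
THRESHOLD_ACCURACY_SCORE = 86


# Class: CapWords.
class AccuracyReport:
    """Hold an accuracy score and compare it with a threshold."""

    def __init__(self, accuracy_score_value):
        # Instance attribute: lower case with underscores.
        self.accuracy_score_value = accuracy_score_value

    # Method and function names: lower case with underscores.
    def passes_threshold(self, threshold=THRESHOLD_ACCURACY_SCORE):
        return self.accuracy_score_value >= threshold


report = AccuracyReport(86.33)
print(report.passes_threshold())
```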

8. Programs Unit Tests are not Implemented at All

I have no idea who said that Machine Learning projects don’t need unit tests. Every computer program needs to have unit tests. I remember working as a Consulting Software Engineer in many companies that did not allow me to deploy any production application until all the unit tests were implemented and passed. I mean all of them. Is this clear? Yes! We need to implement unit tests in Machine Learning projects as well. One of the most important tests in Machine Learning classification projects is the calculation of the Accuracy Classification Score. Suppose I want the Accuracy Score on my test data to reach a required Threshold Accuracy Score value. If the calculated Accuracy Score is greater than or equal to the Threshold Accuracy Score value, then I’ll use the results to make the required business classification decisions. If not, I may need to do the following: retrain my previous model, try other classification models, collect more data if possible, speak to a domain expert to get more context information, etc.

Let’s look at the simple unit test code shown below. This test was done using the Fashion-MNIST image dataset and a trained ANN model file, fashion_mnist_ann_classification.pkl.

import unittest
import os
import config
import pandas as pd
import pickle
from fashion_mnist_ann import calculate_accuracy_score


class ANNTest(unittest.TestCase):
    """
    ann unit test class
    """
    def testAccuracyScore(self):
        """
        accuracy classification score unit test
        """
        # get data folder path
        data_folder_path = config.DATA_FOLDER_PATH
        # define fashion mnist test pandas dataframe
        fashion_mnist_test = config.FASHION_MNIST_TEST
        df_fashion_mnist_test = pd.read_csv(
            os.path.join(data_folder_path, fashion_mnist_test), header=None)
        # get number of columns
        df_fashion_mnist_test_columns = df_fashion_mnist_test.shape[1]
        # select y test label
        target_column_number = config.TARGET_COLUMN_NUMBER
        y_test = df_fashion_mnist_test.iloc[:, 0:target_column_number]
        # flat y test label
        y_test_flattened = y_test.values.ravel()
        # select X test features
        X_test = df_fashion_mnist_test.iloc[
            :, target_column_number:df_fashion_mnist_test_columns]
        # normalize X test features with min-max scaling
        X_test = (X_test.astype("float32") - config.XMIN) / \
            (config.XMAX - config.XMIN)
        # open fashion mnist model pkl file (closed automatically)
        with open(config.FASHION_MNIST_MODEL_FILE, "rb") as mlp_classifier_model_pkl:
            mlp_classifier_model_file = pickle.load(mlp_classifier_model_pkl)
        # get y predict test
        y_predict_test = mlp_classifier_model_file.predict(X_test)
        # calculate accuracy classification score
        accuracy_score_value = calculate_accuracy_score(y_test_flattened,
                                                        y_predict_test)
        # test for accuracy score value greater than or equal to
        # threshold accuracy score
        self.assertGreaterEqual(accuracy_score_value,
                                config.THRESHOLD_ACCURACY_SCORE,
                                "Test Accuracy Score Failed.")


if __name__ == "__main__":
    unittest.main()

As you can see I have commented every line of code for you to understand very clearly how this unit test runs. In real production projects, this code will be encapsulated in a main derived class file to run train, validation and test data. In my future blog papers, I’ll be covering a simple OOP approach to develop and deploy Machine Learning models.

The final config.py file provides the necessary settings to run this test.

GRID_SEARCH_CV="grid"
RANDOMIZED_SEARCH_CV="randomized"
XMIN=0
XMAX=255
DATA_FOLDER_PATH= r"your data folder path\Fashion MNIST images"
FASHION_MNIST_TRAIN="fashion_mnist_train.csv"
FASHION_MNIST_TEST="fashion_mnist_test.csv"
TARGET_COLUMN_NUMBER=1
THRESHOLD_ACCURACY_SCORE=86
FASHION_MNIST_MODEL_FILE="fashion_mnist_ann_classification.pkl"

The function calculate_accuracy_score() calculates the Accuracy Classification Score.

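from sklearn.metrics import accuracy_score

def calculate_accuracy_score(label_true, label_predict):
    """
    calculate accuracy classification score
    :param label_true: label true values
    :param label_predict: label predicted values
    :return: accuracy classification score as a percentage
    (None if the calculation failed)
    """
    accuracy_score_value = None
    try:
        # accuracy_score() returns a fraction, so convert it to a percentage
        accuracy_score_value = accuracy_score(label_true, label_predict) * 100
        # keep two decimal places
        accuracy_score_value = float("{0:0.2f}".format(accuracy_score_value))
    except Exception:
        print_exception_message()
    return accuracy_score_value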

Let’s run some tests. In the config.py file the Threshold Accuracy Score is equal to 86%. When we run the test, it passes because the calculated Accuracy Score is greater than or equal to 86%. If the Threshold Accuracy Score is increased to 90%, the test fails with the following message:

AssertionError: 86.33 not greater than or equal to 90 : Test Accuracy Score Failed.
Ran 1 test in 1.415s
FAILED (failures=1)

It’s clear at this point that our calculated Accuracy Score is 86.33%. It’s up to the Data Analytics team to decide whether this score is sufficient for real business data classification. I would also recommend the blog paper “A Data Scientist’s Guide to Writing Unit Tests”. I found it unique and very interesting!
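The accuracy calculation itself can also be tested on a handful of hand-checked labels, without a dataset or a model file. This is a sketch under stated assumptions: percent_accuracy() is a hypothetical pure-Python stand-in written for this example, not the scikit-learn accuracy_score() used above.

```python
import unittest


def percent_accuracy(label_true, label_predict):
    """Return the percentage of matching labels, rounded to two decimals."""
    if len(label_true) != len(label_predict):
        raise ValueError("label lists must have the same length")
    if not label_true:
        raise ValueError("label lists must not be empty")
    matches = sum(1 for t, p in zip(label_true, label_predict) if t == p)
    return round(100.0 * matches / len(label_true), 2)


class PercentAccuracyTest(unittest.TestCase):
    def test_three_of_four_correct(self):
        # 3 matching labels out of 4 is 75 percent
        self.assertEqual(percent_accuracy([0, 1, 2, 3], [0, 1, 2, 9]), 75.0)

    def test_length_mismatch_raises(self):
        with self.assertRaises(ValueError):
            percent_accuracy([0, 1], [0])


# Run the tests programmatically so the sketch also works inside a notebook.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(PercentAccuracyTest)
test_result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Small deterministic tests like these run in milliseconds and fail only when the code changes, while the threshold test above can also fail because the data or the trained model changed.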

9. Developer Program Documentation not Provided

Why don’t Python programmers write developer program documentation? Maybe they don’t care, maybe there’s no time for it, or maybe nobody asks them to do it. This is one of the worst habits of any Application Developer today. If you submit a program without developer documentation, I’d never take you for lunch. Don’t smile, it’s very serious, dude!

10. Many Python Programmers Don’t Know the Latest Libraries and Frameworks Developed for Python Today

Many Python developers don’t read programming articles, blogs, eBooks, etc. They don’t participate in any Python city meetup groups, seminars, conferences, etc. If you are a Python Software Engineer and you need to write a program, don’t you need to know the latest Python libraries and frameworks available for specific development tasks? Why are you writing old Python code? Here is my suggestion to you: “Stop watching TV at night and instead read about and practice the latest Python programming technologies”.

“Be a good Python Software Engineer, not a good “Pythonic” Software Engineer” — Ernest Bonat, Ph.D.

11. Best Practices to Untangle “Spaghetti Code”

  1. Modularization: Break your code into smaller, manageable modules or functions. Each module should have a specific purpose, making it easier to understand and maintain.
  2. Comments and Documentation: Use clear and concise comments to explain complex sections of code. Documentation strings (docstrings) are also helpful for describing the purpose of functions and classes.
  3. Naming Conventions: Use descriptive and meaningful names for variables, functions, and classes. This makes it easier to understand the purpose of each component without diving deep into the code.
  4. Code Reviews: Have peers review your code. Fresh eyes can often spot areas that need improvement or simplification.
  5. Refactoring: Take the time to refactor your code periodically. This involves restructuring it to make it more organized and readable without changing its external behavior.
  6. Design Patterns: Familiarize yourself with common design patterns that promote clean, maintainable code. These patterns provide proven solutions to recurring problems in software development.
  7. Testing: Implement unit tests to verify that each component of your code works as expected. This helps catch errors early and ensures that changes don’t introduce new bugs.

I hope you all got my points in the blog paper. Agree or disagree? — Let me know, I’m happy to hear from you!

Ernest Bonat, Ph.D.

I’m a Senior Data Scientist and Engineer consultant. I work on Machine Learning application projects for Life Sciences using Python and Python Data Ecosystem.