Good Refactor vs Bad Refactor

Bijit Ghosh
8 min read · Aug 18, 2024

Master the art of refactoring by transforming your code into a cleaner, more efficient powerhouse while avoiding the pitfalls that can turn improvements into headaches. Refactor smartly with tests, type safety, and collaborative reviews to elevate your codebase without breaking a sweat.

Refactoring is an essential practice in software development that involves restructuring existing code without changing its external behavior. When done right, refactoring can improve code quality, readability, and maintainability. However, when executed poorly, it can introduce bugs, decrease performance, and create more problems than it solves. In this blog post, we’ll explore the differences between good and bad refactoring practices, provide examples with code, and share tips and tools to help you refactor more effectively.

Understanding Refactoring

Before we dive into the specifics of good and bad refactoring, let’s establish a clear understanding of what refactoring is and why it’s important.

Refactoring is a disciplined technique for restructuring an existing body of code: it improves the internal structure of a software system without altering its external behavior.

The main goals of refactoring are:

  1. Improving code readability
  2. Reducing complexity
  3. Enhancing maintainability
  4. Facilitating future changes and extensions

Now, let’s explore the characteristics of good and bad refactoring practices.

Good Refactoring Practices

1. Incremental Changes

Good refactoring involves making small, incremental changes rather than large-scale rewrites. This approach allows you to maintain control over the refactoring process and easily identify and fix any issues that may arise.

Example:

Let’s say we have a function that calculates the total price of items in a shopping cart:

def calculate_total(items):
    total = 0
    for item in items:
        total += item['price'] * item['quantity']
    return total

A good refactoring approach would be to extract the price calculation into a separate function:

def calculate_item_total(item):
    return item['price'] * item['quantity']

def calculate_total(items):
    return sum(calculate_item_total(item) for item in items)

This change is small, easy to understand, and maintains the original functionality while improving readability and modularity.

2. Preserving Behavior

Good refactoring ensures that the external behavior of the code remains unchanged. This means that the refactored code should produce the same output for the same input as the original code.

Example:

Consider a function that validates an email address:

def validate_email(email):
    if '@' in email and '.' in email:
        return True
    else:
        return False

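A good refactoring simplifies the implementation while returning the same result for every input:

def validate_email(email):
    return '@' in email and '.' in email

This refactoring removes the redundant if/else without changing the function’s interface or results. Swapping in a stricter check, such as the regular expression r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$', may be a worthwhile improvement, but it rejects inputs the original accepted (for example 'a.b@c'). That makes it a behavior change rather than a refactoring, and it belongs in a separate, deliberate change.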
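One way to check that a rewrite preserves behavior is to run the old and new implementations side by side on the same inputs. The sketch below is illustrative only: the function names and sample inputs are assumptions chosen for this example, and find_differences is a hypothetical helper, not a library function. It compares the original check with a simplified version of the same logic and with a stricter regular-expression check.

```python
import re


def validate_email_original(email):
    # The original check from the example above.
    if '@' in email and '.' in email:
        return True
    else:
        return False


def validate_email_simplified(email):
    # Same logic with the redundant if/else removed.
    return '@' in email and '.' in email


def validate_email_regex(email):
    # A stricter check, included to illustrate a behavior change.
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    return bool(re.match(pattern, email))


def find_differences(old, new, inputs):
    """Return the inputs for which old and new disagree."""
    return [value for value in inputs if old(value) != new(value)]


# Hypothetical sample inputs, including a few awkward edge cases.
SAMPLES = ["user@example.com", "a.b@c", "no-at-sign.com", "@.", ""]

simplified_diff = find_differences(validate_email_original, validate_email_simplified, SAMPLES)
regex_diff = find_differences(validate_email_original, validate_email_regex, SAMPLES)
```

On these samples the simplified version agrees with the original everywhere, while the regular-expression version rejects 'a.b@c' and '@.', which the original accepted. A difference like this signals a behavior change, not a refactoring.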

3. Improving Code Quality

Good refactoring should lead to improved code quality, including better readability, reduced duplication, and enhanced maintainability.

Example:

Consider a class with duplicated code for different types of notifications:

class NotificationService:
    def send_email_notification(self, user, message):
        print(f"Sending email to {user.email}: {message}")
        # Email sending logic

    def send_sms_notification(self, user, message):
        print(f"Sending SMS to {user.phone}: {message}")
        # SMS sending logic

    def send_push_notification(self, user, message):
        print(f"Sending push notification to {user.device_id}: {message}")
        # Push notification logic

A good refactoring would extract the common behavior and use polymorphism:

class NotificationService:
    def send_notification(self, user, message, channel):
        channel.send(user, message)

class EmailNotification:
    def send(self, user, message):
        print(f"Sending email to {user.email}: {message}")
        # Email sending logic

class SMSNotification:
    def send(self, user, message):
        print(f"Sending SMS to {user.phone}: {message}")
        # SMS sending logic

class PushNotification:
    def send(self, user, message):
        print(f"Sending push notification to {user.device_id}: {message}")
        # Push notification logic

This refactoring reduces duplication and improves extensibility by allowing easy addition of new notification types.
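To make the extensibility point concrete, here is a minimal, self-contained sketch. The User dataclass, the slack_handle field, and the SlackNotification channel are hypothetical names introduced only for this example, and each send method returns the text it would deliver so the result is easy to inspect.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class User:
    # Hypothetical user record; the field names are assumptions for this sketch.
    email: str
    slack_handle: str


class NotificationChannel(Protocol):
    """Anything with a matching send() method can act as a channel."""

    def send(self, user: User, message: str) -> str: ...


class EmailNotification:
    def send(self, user: User, message: str) -> str:
        return f"Sending email to {user.email}: {message}"


class SlackNotification:
    # A channel added later; NotificationService itself does not change.
    def send(self, user: User, message: str) -> str:
        return f"Sending Slack message to {user.slack_handle}: {message}"


class NotificationService:
    def send_notification(self, user: User, message: str, channel: NotificationChannel) -> str:
        return channel.send(user, message)


service = NotificationService()
user = User(email="ada@example.com", slack_handle="@ada")
email_result = service.send_notification(user, "Build passed", EmailNotification())
slack_result = service.send_notification(user, "Build passed", SlackNotification())
```

Adding the Slack channel required one new class and no edits to NotificationService, which is the practical payoff of the polymorphic design.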

4. Maintaining Performance

Good refactoring should not significantly degrade performance. In some cases, it may even improve performance by optimizing algorithms or data structures.

Example:

Consider a function that finds the maximum value in a list:

def find_max(numbers):
    max_value = numbers[0]
    for num in numbers[1:]:
        if num > max_value:
            max_value = num
    return max_value

A good refactoring might improve performance by using Python’s built-in max function:

def find_max(numbers):
    return max(numbers)

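This refactoring returns the same result for every non-empty list and is typically faster, since the built-in avoids copying the list with numbers[1:] and, in CPython, runs its loop in C. One edge case does differ: an empty list now raises ValueError instead of IndexError.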

Bad Refactoring Practices

Now that we’ve explored good refactoring practices, let’s look at some examples of bad refactoring and why they should be avoided.

1. Overengineering

Bad refactoring often involves introducing unnecessary complexity or abstraction layers that don’t provide significant benefits.

Example:

Consider a simple function that adds two numbers:

def add(a, b):
    return a + b

An overengineered refactoring might look like this:

class MathOperation:
    def __init__(self, operation_type):
        self.operation_type = operation_type

    def execute(self, a, b):
        if self.operation_type == 'addition':
            return self.add(a, b)
        # Other operations...

    def add(self, a, b):
        return a + b

def perform_addition(a, b):
    math_op = MathOperation('addition')
    return math_op.execute(a, b)

This refactoring introduces unnecessary complexity for a simple operation, making the code harder to understand and maintain.

2. Breaking Existing Functionality

Bad refactoring can introduce bugs or change the expected behavior of the code, leading to unexpected results and potentially breaking dependent systems.

Example:

Consider a function that calculates the average of a list of numbers:

def calculate_average(numbers):
    return sum(numbers) / len(numbers)

A bad refactoring might inadvertently change the behavior:

def calculate_average(numbers):
    return sum(numbers) // len(numbers)  # Integer division instead of float division

This refactoring changes the function’s behavior by using integer division instead of float division, potentially leading to incorrect results.
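A small regression test is usually enough to catch this kind of change. The sketch below keeps both versions under hypothetical names (calculate_average_original and calculate_average_refactored) so they can be compared on an input whose average is not a whole number.

```python
def calculate_average_original(numbers):
    return sum(numbers) / len(numbers)


def calculate_average_refactored(numbers):
    # The faulty refactoring: floor division drops the fractional part.
    return sum(numbers) // len(numbers)


sample = [1, 2]
expected = 1.5

original_result = calculate_average_original(sample)
refactored_result = calculate_average_refactored(sample)

# A test asserting that the refactored result equals `expected` would fail,
# exposing the regression before it reaches dependent systems.
regression_detected = refactored_result != expected
```

Inputs such as [2, 4] would hide the problem because their average is a whole number, so it pays to pick test data that exercises the fractional case.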

3. Ignoring Performance Impact

Bad refactoring may significantly degrade performance without providing substantial benefits in terms of code quality or maintainability.

Example:

Consider a function that checks if a number is prime:

def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

A bad refactoring might prioritize readability over performance:

def is_prime(n):
    return n > 1 and all(n % i != 0 for i in range(2, n))

While this refactoring may appear more concise, it significantly reduces performance for large numbers by checking all numbers up to n instead of stopping at the square root of n.
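Conciseness and the square-root bound are not mutually exclusive. The sketch below, under assumed names, shows a one-line version that keeps the bound by using math.isqrt, checks that it agrees with the original on small inputs, and counts how many candidate divisors each strategy would try for a prime input (the worst case, since the loop never exits early).

```python
import math


def is_prime_original(n):
    if n < 2:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True


def is_prime_concise(n):
    # Concise, but still stops at the integer square root of n.
    return n > 1 and all(n % i != 0 for i in range(2, math.isqrt(n) + 1))


def candidates_up_to_sqrt(n):
    """Candidate divisors tried for a prime n when stopping at sqrt(n)."""
    return len(range(2, math.isqrt(n) + 1))


def candidates_up_to_n(n):
    """Candidate divisors tried for a prime n when scanning all the way to n."""
    return len(range(2, n))


# Both versions agree on a range of small inputs.
versions_agree = all(is_prime_original(n) == is_prime_concise(n) for n in range(200))

# 10007 is prime, so neither loop exits early.
sqrt_candidates = candidates_up_to_sqrt(10007)
linear_candidates = candidates_up_to_n(10007)
```

For 10007 the bounded version tries 99 candidates while the unbounded one tries 10005, and the gap widens as n grows.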

4. Lack of Testing

Refactoring without proper testing can lead to undetected bugs and regressions. Bad refactoring often involves making changes without adequate test coverage or without running existing tests.

Example:

Consider refactoring a function that calculates the factorial of a number:

def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)

A bad refactoring might introduce a bug without proper testing:

def factorial(n):
    result = 1
    for i in range(1, n):  # Bug: should be range(1, n + 1)
        result *= i
    return result

Without proper testing, this bug might go unnoticed: the loop stops one step short, so the function returns (n - 1)! instead of n! for every n of 2 or more.
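Checking the iterative version against a trusted reference over a small range is enough to expose the problem. In this sketch the names factorial_buggy, factorial_fixed, and mismatches are hypothetical, and Python’s math.factorial serves as the reference.

```python
import math


def factorial_buggy(n):
    result = 1
    for i in range(1, n):  # stops one step short
        result *= i
    return result


def factorial_fixed(n):
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result


def mismatches(candidate, upper=10):
    """Return the inputs in 0..upper where candidate disagrees with math.factorial."""
    return [n for n in range(upper + 1) if candidate(n) != math.factorial(n)]


buggy_mismatches = mismatches(factorial_buggy)
fixed_mismatches = mismatches(factorial_fixed)
```

The buggy version happens to be right for 0 and 1, so a test suite that only covers those two inputs would still pass; covering a range of values is what reveals the off-by-one error.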

Tips and Tools for Better Refactoring

To avoid bad refactoring practices and improve your refactoring skills, consider the following tips and tools:

1. Write Tests Before Refactoring

Before making any changes, ensure you have a comprehensive suite of tests that cover the functionality you’re about to refactor. This will help you catch any regressions or unintended changes in behavior.

Example:

import unittest

def calculate_total(items):
    return sum(item['price'] * item['quantity'] for item in items)

class TestCalculateTotal(unittest.TestCase):
    def test_empty_cart(self):
        self.assertEqual(calculate_total([]), 0)

    def test_single_item(self):
        items = [{'price': 10, 'quantity': 2}]
        self.assertEqual(calculate_total(items), 20)

    def test_multiple_items(self):
        items = [
            {'price': 10, 'quantity': 2},
            {'price': 5, 'quantity': 3},
            {'price': 15, 'quantity': 1}
        ]
        self.assertEqual(calculate_total(items), 50)

if __name__ == '__main__':
    unittest.main()

By writing these tests before refactoring, you can ensure that your changes don’t break existing functionality.

2. Use Type-Safe Languages or Type Hints

Using type-safe languages or adding type hints can help catch potential errors early in the development process and make refactoring safer.

Example (using Python with type hints):

from typing import List, Dict

def calculate_total(items: List[Dict[str, float]]) -> float:
    return sum(item['price'] * item['quantity'] for item in items)

These type hints make it clear what kind of data the function expects and returns, reducing the chances of type-related errors during refactoring.
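If you want the hints to describe the item shape more precisely, a TypedDict is one option. The sketch below assumes a hypothetical CartItem type with a float price and an integer quantity; with it, a static type checker (mypy is one example) can flag a misspelled key or a missing field before the code runs.

```python
from typing import List, TypedDict


class CartItem(TypedDict):
    # Hypothetical item shape for this sketch.
    price: float
    quantity: int


def calculate_total(items: List[CartItem]) -> float:
    return sum(item['price'] * item['quantity'] for item in items)


cart: List[CartItem] = [
    {'price': 10.0, 'quantity': 2},
    {'price': 2.5, 'quantity': 4},
]
total = calculate_total(cart)
```

At runtime a TypedDict is still a plain dict, so this change adds checking for tools and readers without altering how the function behaves.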

3. Leverage Code Review

Always have your refactoring changes reviewed by other developers. Fresh eyes can spot potential issues or suggest improvements you might have missed.

4. Use Refactoring Tools

Many modern IDEs and code editors offer built-in refactoring tools that can automate common refactoring tasks, reducing the risk of manual errors.

Some popular refactoring tools include:

  • PyCharm’s refactoring tools for Python
  • Visual Studio Code’s refactoring extensions
  • ReSharper for .NET development
  • Eclipse’s refactoring tools for Java

5. Apply the Boy Scout Rule

Follow the Boy Scout Rule: “Always leave the code better than you found it.” Make small improvements as you work on the codebase, gradually refactoring over time.

6. Document Your Refactoring Decisions

Keep track of your refactoring decisions and the reasoning behind them. This can help other developers understand your choices and make it easier to maintain the code in the future.

Example:

from typing import Dict, List

# Refactored on 2023-08-18
# Reason: Extracted price calculation logic to improve readability and reusability
def calculate_item_total(item: Dict[str, float]) -> float:
    """
    Calculate the total price for a single item.

    This function was extracted from the calculate_total function to improve
    modularity and allow for easier unit testing of the price calculation logic.
    """
    return item['price'] * item['quantity']

def calculate_total(items: List[Dict[str, float]]) -> float:
    """
    Calculate the total price for all items in the cart.

    This function now uses the calculate_item_total function to improve readability
    and maintainability. The explicit loop was replaced with sum() over a generator
    expression, which avoids building an intermediate list.
    """
    return sum(calculate_item_total(item) for item in items)

7. Refactor Incrementally

Instead of attempting large-scale rewrites, focus on incremental improvements. This approach reduces risk and allows for easier rollback if issues arise.

8. Monitor Performance

Keep an eye on performance metrics before and after refactoring. Use profiling tools to identify any performance regressions introduced by your changes.
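For small functions, Python’s timeit module is often enough for a before-and-after comparison. The sketch below times a hand-written maximum loop against the built-in max; the function names, the time_call helper, and the data size are assumptions for this example, and the actual numbers will vary by machine.

```python
import timeit


def find_max_loop(numbers):
    max_value = numbers[0]
    for num in numbers[1:]:
        if num > max_value:
            max_value = num
    return max_value


def find_max_builtin(numbers):
    return max(numbers)


def time_call(func, argument, repeats=5, number=20):
    """Return the best time in seconds for `number` calls, over `repeats` runs."""
    return min(timeit.repeat(lambda: func(argument), repeat=repeats, number=number))


data = list(range(10_000))

# Confirm identical results before comparing speed.
same_result = find_max_loop(data) == find_max_builtin(data)

loop_seconds = time_call(find_max_loop, data)
builtin_seconds = time_call(find_max_builtin, data)
print(f"loop: {loop_seconds:.6f}s  built-in: {builtin_seconds:.6f}s")
```

Taking the minimum over several runs reduces noise from other processes. For larger systems, a profiler such as the standard library’s cProfile gives a fuller picture of where time goes.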

A Personal Refactoring Experience

Let me share a personal experience that highlights the importance of careful refactoring. I once witnessed a developer attempt to consolidate all API functions into a single, generic function. The intention was to reduce code duplication and create a more streamlined API layer. Here’s a simplified version of what they did:

Before refactoring:

import requests

API_BASE_URL = "https://api.example.com"  # placeholder base URL

def get_user_data(user_id):
    response = requests.get(f"{API_BASE_URL}/users/{user_id}", timeout=5)
    return response.json()

def create_order(order_data):
    response = requests.post(f"{API_BASE_URL}/orders", json=order_data, timeout=10)
    return response.json()

def update_product(product_id, product_data):
    response = requests.put(f"{API_BASE_URL}/products/{product_id}", json=product_data, timeout=15)
    return response.json()

After refactoring:

def make_api_request(method, endpoint, data=None):
    url = f"{API_BASE_URL}/{endpoint}"
    response = requests.request(method, url, json=data, timeout=10)
    return response.json()

def get_user_data(user_id):
    return make_api_request("GET", f"users/{user_id}")

def create_order(order_data):
    return make_api_request("POST", "orders", data=order_data)

def update_product(product_id, product_data):
    return make_api_request("PUT", f"products/{product_id}", data=product_data)

While this refactoring seemed to simplify the code, it introduced several issues:

  1. All API calls now used the same timeout value, which wasn’t appropriate for all endpoints.
  2. The consolidated function didn’t account for different memory requirements of various API calls.
  3. Error handling became more generic, making it harder to deal with endpoint-specific issues.

After deployment, these issues caused numerous problems, with some API calls timing out and others failing due to insufficient memory allocation.

A better approach would have been to pass through the options and create a more flexible API function:

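def make_api_request(method, endpoint, data=None, timeout=10, **kwargs):
    # Extra keyword arguments are forwarded, so they must be ones that
    # requests.request() accepts (for example headers or params).
    url = f"{API_BASE_URL}/{endpoint}"
    response = requests.request(method, url, json=data, timeout=timeout, **kwargs)
    return response.json()

# Timeouts match the per-endpoint values used before the refactoring.
def get_user_data(user_id):
    return make_api_request("GET", f"users/{user_id}", timeout=5)

def create_order(order_data):
    return make_api_request("POST", "orders", data=order_data, timeout=10)

def update_product(product_id, product_data):
    # requests.request() has no max_retries argument; retries are configured
    # on a Session through an HTTPAdapter instead.
    return make_api_request("PUT", f"products/{product_id}", data=product_data, timeout=15)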

This refactoring maintains the benefits of consolidation while allowing for endpoint-specific configurations. It demonstrates the importance of considering the unique requirements of different parts of your system when refactoring.
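A test that pins each endpoint’s timeout would have caught the original regression before deployment. The sketch below is one way to write it, under stated assumptions: the wrapper takes a session argument (expected to behave like requests.Session) so a fake can be substituted, API_BASE_URL is a placeholder, and the timeouts are the 5 and 15 second values from the code before the refactoring.

```python
from unittest.mock import Mock

API_BASE_URL = "https://api.example.com"  # placeholder, not a real service


def make_api_request(session, method, endpoint, data=None, timeout=10, **kwargs):
    # `session` is expected to behave like requests.Session; passing it in
    # makes it easy to substitute a fake in tests.
    url = f"{API_BASE_URL}/{endpoint}"
    response = session.request(method, url, json=data, timeout=timeout, **kwargs)
    return response.json()


def get_user_data(session, user_id):
    return make_api_request(session, "GET", f"users/{user_id}", timeout=5)


def update_product(session, product_id, product_data):
    return make_api_request(
        session, "PUT", f"products/{product_id}", data=product_data, timeout=15
    )


# A Mock stands in for the HTTP session, so no network access is needed.
fake_session = Mock()
fake_session.request.return_value.json.return_value = {"id": 42}

user = get_user_data(fake_session, 42)
user_call = fake_session.request.call_args  # arguments of the most recent call

update_product(fake_session, 7, {"name": "Widget"})
product_call = fake_session.request.call_args
```

Asserting on user_call.kwargs["timeout"] and product_call.kwargs["timeout"] turns the per-endpoint timeouts into an explicit, tested part of the contract, so a later consolidation cannot silently change them.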

Conclusion

Refactoring is a powerful technique for improving code quality and maintainability, but it must be approached with care and consideration. Good refactoring practices focus on making incremental, behavior-preserving changes that enhance code quality without introducing unnecessary complexity or performance issues. Bad refactoring, on the other hand, can lead to bugs, decreased performance, and increased technical debt.

By following the tips and best practices outlined in this post, you can improve your refactoring skills and avoid common pitfalls. Remember to:

  1. Write tests before refactoring
  2. Use type-safe languages or type hints
  3. Leverage code reviews
  4. Utilize refactoring tools
  5. Apply the Boy Scout Rule
  6. Document your refactoring decisions
  7. Refactor incrementally
  8. Monitor performance

Refactoring is an ongoing process that requires practice and patience. As you gain experience, you’ll develop a better intuition for when and how to refactor effectively. Always keep in mind that the goal of refactoring is to improve the overall quality and maintainability of your codebase.

Bijit Ghosh

CTO | Senior Engineering Leader focused on Cloud Native | AI/ML | DevSecOps