21 Techniques to Write Better Code

Bruce H. Cottman, Ph.D.
Apr 9 · 10 min read

I have pulled together these techniques from 35 years of programming in multiple languages on multiple projects. There is before and after Python code examples that apply each technique. I also show how to personalize Pycharm for these techniques for your projects.

Image for post
Image for post
PyCharm Local file History feature, Animation by Rachel Cottman

Photonai incorporates Scikit-Learn and other machine learning (ML) or deep learning (DL) frameworks with one unifying paradigm. Photonai adopts Scikit-Learn’s Estimator, and Transformer class method architecture.

Photonai adds code that reduces manual coding and error by transforming pre- and post-learner algorithms into elements with their argument signature. Examples of elements are several choices of data cleaners, scalers, imputers, class balancers, cross-validators, hyper-parameter tuners, and ensembles.

By applying these twenty-one techniques, we start the adding of clustering to the Photonai package. I give a before and after Python code example to show the transformation of Photonai using the technique.

Technique #1:

  • Create a new version a+1.0.0, if there is a major amount of architecture change, a major amount of code changed and, a major amount of new behavior added.
  • Create a new sub-version a.b+1.0, if there is a minor architecture change, a minor amount of code changed and, a minor amount of new behavior added.
  • Create a new sub-sub-version a.b.c+1, if adding a minor amount of code, and/or, there are bug fixes.

I use PyCharm to git add all changed and new files of the project Photonai 1.1.0.

Image for post
Image for post
PyCharm git add for all changed files in project Photonai 1.1.0., Animation by Rachel Cottman

Note: You can git add a single file in the same manner as a project. Left-click on the file name instead of the project directory name.

The changes that are shown here result in a new sub-version 1.1.0 from 1.0.0. I use git.

git checkout -b 1.1.0
git status

terminal output ->>

On branch 1.1.0Changes to be committed:(use "git reset HEAD <file>..." to unstage)
modified: photonai/base/registry/element_dictionary.py
modified: photonai/base/registry/registry.py
new file: photonai/custom_elements/CustomElements.json
new file: photonai/driver.py
modified: photonai/processing/metrics.py
modified: photonai/tests/processing_tests/metrics_test.py
new file: photonai/tests_ext/CustomElements/CustomElements.json
new file: photonai/tests_ext/base/test_hyperpipe.py
new file: photonai/tests_ext/base/test_register.py
new file: photonai/tests_ext/processing/test_metrics.py
new file: photonai/util.py

Note: The above is a list of all files changed or added for project Photonai 1.1.0

Technique #2: Document locally any major addition or change in code.

class Scorer(object):
"""
Transforms a string literal into a callable instance of a
particular metric
BHC 1.1.0 - support cluster scoring by add clustering
scores and type
- added METRIC_METADATA dictionary
- added ML_TYPES = ["Classification", "Regression",
"Clustering"]
- added METRIC_<>ID Postion index of metric
metadata list
- added SCORE_TYPES
- added SCORE_SIGN
"""

Technique #3: Create long and descriptive names for constants, variables, functions and, class methods.

Technique #4: Apply PEP-8 naming conventions.

In the below example, a global constant ELEMENT_TYPE is uppercase and, variable machine_learning_type is lowercase.

Globals:ELEMENT_TYPE -> ML_TYPE
ELEMENT_DICTIONARY -> METRIC_METADATA
variables:machine_learning_type -> element_type

Note: PyCharm can run an external tool for formatting. I use black, which formats a .py file or the entire project into PEP-8 compliant format. The result is that all files are formatted the same. The follow-on result is an increased readability score.

Image for post
Image for post
PyCharm External Tool: black applied to all .py files of the project Photonai 1.1.0., Animation by Rachel Cottman

Note: PyCharm can change names wherever used in a package. I understand that the word refactor may be considered bad by some. However, changing a name throughout the project is really useful. Please ignore how PyCharm categorizes its features. Use rename from the beginning so that you don't have to refactor your code!

Image for post
Image for post
PyCharm git rename feature, Animation by Rachel Cottman

Note: PyCharm find usage feature, all the files in your project that reference, a constant, variable, function, class, or class method.

Image for post
Image for post
PyCharm Find Usages feature, Animation by Rachel Cottman

Python is a dynamically typed language. Python 3.5 and higher version releases enable type hints (PEP 484). I emphasize the word hints because type hints do not affect the Python interpreter. As far as you are concerned, the Python interpreter ignores type-hints.

Type hints(note: not strong type checking) make it possible to accomplish bug hunting, finding security holes and static type checking after the first pass of coding and unit testing.

Technique #5: Add type hinting to every function or class method.

def is_machine_learning_type(ml_type: str) -> bool:

Note: The docstring no longer needs each arg and the return datatype to be documented. The signature becomes self-documenting.

Note: The readability as well as information of post-code-checking tools has increased.

@dataclass was an added feature of Python 3.7. The main driving force was to eliminate boilerplate associated with the state in a def class definition.

@dataclass decorates a def class definition and automatically generates the five double dunder methods __init__() , __repr__() , __str__,__eq__(), and , __hash__().

@dataclass virtually eliminates repetitive boilerplate code required to define a basic class. Notice that all these five double dunder methods work directly with the class’s encapsulated state.

Technique #6: Use Python 3.7+ @dataclass.

class PhotonRegistry:
"""
Helper class to manage the PHOTON Element Register
...
"""
base_PHOTON_REGISTRIES = ['PhotonCore', 'PhotonNeuro']
PHOTON_REGISTRIES = ['PhotonCore', 'PhotonNeuro']
def __init__(self, custom_elements_folder: str = None):
if custom_elements_folder:
self._load_custom_folder(custom_elements_folder)
else:
self.custom_elements = None
self.custom_elements_folder = None
self.custom_elements_file = None

After @dataclassdecorater=>

@dataclass
class PhotonRegistry:
"""
Helper class to manage the PHOTON Element Register
...
"""

custom_elements_folder: str = None
custom_elements: str = None
custom_elements_file: str = None
base_PHOTON_REGISTRIES: ClassVar[List[str]] =/
["PhotonCore", "PhotonNeuro"]
PHOTON_REGISTRIES: ClassVar[List[str]] =/
["PhotonCore", "PhotonNeuro"]
def __post_init__(self):
if self.custom_elements_folder:
self._load_custom_folder(self.custom_elements_folder)

Note: Readability has increased substantially by using @dataclass with type hinting.

Technique #7: Transform irrelevant comments to relevant comments.

Version 1.0.0 (original)

# register new object
PhotonRegister.save("ABC1", "namespace.filename.ABC1", "Transformer")

Version 1.1.0

# register new element
PhotonRegister.
register("ABC1", "namespace.filename.ABC1", "Transformer")

Technique #8: Keep only architectural #todo comments.

Note: PyCharm has find all #Todo for the project Photonai 1.1.0.

Image for post
Image for post
PyCharm find all Todo of the project Photonai 1.1.0., Animation by Rachel Cottman

Technique #9: Delete old code that is commented

Version 1.1.0 (before commented out 1.1.0 code)

def __post_init__(self):
if self.custom_elements_folder:
self._load_custom_folder(self.custom_elements_folder)
# base_PHOTON_REGISTRIES = ["PhotonCore", "PhotonNeuro"]
# PHOTON_REGISTRIES = ["PhotonCore", "PhotonNeuro"]
# def __init__(self, custom_elements_folder: str = None):
# if custom_elements_folder:
# self._load_custom_folder()
# else:
# self.custom_elements = None
# self.custom_elements_folder = None
# self.custom_elements_file = None

Version 1.1.0 (after removal of commented out 1.1.0 code)

def __post_init__(self):
if self.custom_elements_folder:
self._load_custom_folder(self.custom_elements_folder)

Note: LOC (Lines-Of-Code) complexity is reduced by elimination of commented out old code. Readability score is increased.

Technique #10: Create function or method to encapsulate Value checking. The example below, creates a method to check if a value is in a global constant.

@staticmethod
def is_machine_learning_type(ml_type: str) -> bool:
"""
:raises
if not known machine_learning_type
:param machine_learning_type
:return: True
"""
if ml_type in Scorer.ML_TYPES:
return True
else:
logger.error(
"Specify valid ml_type to choose best config: {}".format(ml_type)
)
raise NameError(becomes

Technique #11: Create Package Error Type: For package Photoai, we created error type Photonai.

Technique #12: Create Raise Function with Package Error Type: For error type Photonai, we created function raise_PhotonaiError.

import loggingclass PhotoaiError(Exception):
pass
def raise_PhotoaiError(msg):
logger.error(msg)
raise PhotoaiError(msg)

Method is_machine_learning_type becomes:

@staticmethod
def is_machine_learning_type(ml_type: str) -> bool:
"""
Parameters:
-----------
machine_learning_type
Returns
-----------
True
Raises
-----------
if not known machine_learning_type
"""
if ml_type in Scorer.ML_TYPES:
return True
raise_PhotoaiError(
"Specify valid ml_type:{} of [{}]".format(ml_type,
Scorer.ML_TYPES))

Note:raise_PhotoaiError combines the log and runtime error. The case of NOT true halts and displays the stack. As a runtime error, it behaves as we would expect if it passes an unknown value. Notice the improvement in readability and maintenance costs.

Comments are one part of the code. They have a maintenance cost associated with any change in behavior.

Technique #13: Transform behaviorally wrong comments into behaviorally correct comments to lower maintenance cost and bug generation.

Technique #14: Add comments to increase Readability of code.

The comment before the change:

@staticmethod
def get_json(photon_package):
"""
Load JSON file in which the elements for the PHOTON submodule are stored.
The JSON files are stored in the framework folder by the name convention 'photon_package.json'Parameters:
-----------
* 'photon_package' [str]:
The name of the photonai submodule
Returns:
--------
JSON file as dict, file path as str
"""

The comment after change:

@staticmethod
def load_json(photon_package: str) -> Any:
"""
load_json Loads JSON file.

The class init PipelineElement('name',...)
stores the element metadata in a json file.

The JSON files are stored in the framework folder
by the name convention 'photon_<package>.json'.
(example:$HOME/PROJECTS/photon/photonai/base/registry/PhotonCore.json)

The file is of format
{ name-1: ['import-pkg-class-path-1', class-path-1)],
name-2: ['import-pkg-class-path-2', class-path-2)],
....}

Parameters
----------
photon_package: The name of the photonai package of element metadata
Returns
-------
[file_content, file_name]
Notes
-------
if JSON file does not exist, then create blank one.
"""

Note: It is not documented when photon_package file does not exist. If such behavior is proper, it should be documented.

Technique #15: Choose a docstring style and use consistently throughout the project (package, library, …). We use Numpy Style because it is suited for detailed documentation of classes, methods, functions, and parameters.

Note: PyCharm docstring format is set from the top ribbon sequence PyCharm|Preferences|Tools|Python Integrated Tools, as shown in the following gif. I am on a Macintosh. It is different on Windows or Linux.

Image for post
Image for post
PyCharm set docstring style for the project Photonai 1.1.0., Animation by Rachel Cottman

Using Numpy Style and correcting for any gaps, the comment becomes:

@staticmethod
def get_json(photon_package: str) -> Any:
"""
get_json Loads JSON file.
The class init PipelineElement('name',...)
stores the element metadata in a json file.
The JSON files are stored in the framework folder
by the name convention 'photon_<package>.json'.
(example:$HOME/PROJECTS/photon/photonai/base/registry/PhotonCore.json)
The file is of format
{ name-1: ['import-pkg-class-path-1', class-path-1)],
name-2: ['import-pkg-class-path-2', class-path-2)],
....}
Parameters
----------
photon_package: The name of the photonai package
of element metadata.
Returns
-------
[file_content, file_name]
Notes
-------
If JSON file does not exist, then create a blank one.
"""

Note: Type hints create less complicated documentation of the signature of the method.

Note: PyCharm enters the Numpy docstring template. Pycharm only inserts return because the function has no parameters. If you right-click, it lists all allowed Numpy docstring keywords. Really cool!

Image for post
Image for post

Technique #16: Choose a version control system and use it. We use Github and Gitlab. We are required to push every day, and local commits are usually several times a day. The frequency of commits is left to the developer to decide.

Note: Pycharm has a local history. Click-right in any file, and Pycharm pulls up file changes since the launch of Pycharm. local history is a very light VCS that has saved many developers.

Image for post
Image for post
PyCharm Local file History feature, Animation by Rachel Cottman

Note: Pycharm can add a file, folder, or an entire package to the VCS. I use git in the following animation.

Image for post
Image for post
PyCharm git add for all changed files in project Photonai 1.1.0., Animation by Rachel Cottman.

Technique #17: The developers of the code develop unit tests.

Note: Setting the tool pytest in Pycharm.

Image for post
Image for post
Setting the tool pytest in Pycharm for the project Photonai 1.1.0., Animation by Rachel Cottman

Technique #18: Unit tests have 80%+ Coverage. We use coverage.

Technique #19: Integration tests must have 100% coverage of all external (API) functions, classes, class attributes, and class methods.

Technique #20: The developers do NOT accomplish acceptance testing.

For example, we selected these unit tests from the 24 unit test in test_metrics.py.

# 22
def test_is_machine_learning_type_bad():
with pytest.raises(PhotoaiError):
assert Scorer.is_machine_learning_type("fred")
# 23
def test_is_machine_learning_type():
assert Scorer.is_machine_learning_type("Clustering")

Note: Once the test tool is selected, PyCharm marks individual runnable tests are marked with a filled arrow in the left bar. Individual testing using pytest:

Image for post
Image for post

Also, with PyCharm, you can run a test file or an entire test folder using pytest, and the results are shown. Reference bottom panel-ribbon. (passed 39/40 test — 32 ms)

Image for post
Image for post
PyCharm test tool specified for project Photonai 1.1.0., Animation by Rachel Cottman

Technique #21: All the package code is reviewed before the push to the master VCS. A developer can review their code using the tool Codacy.

In case you can’t tell, I like PyCharm. What my like-minded software engineers and I appreciate about PyCharm is that it enables a good hunk of our development pipeline is automated without learning a CD/CI package or an MLOps package. The natural clustering of objects in PyCharm results in us not writing any automation code. (well ok, a bash script here and there.)

I am not trying to sell you on using PyCharm. I was using EMACS during the discovery of most of the coding techniques shown here. (If you have never heard of EMACS, you probably missed the AI 1.0 disaster.)

If you use PyCharm, then hopefully, some of the animations will help you with PyCharm setup and usage.

I have given only Python examples, but most of these techniques apply to other languages.

All 21 techniques result in better understandability. You have a better understanding of results, better testing, lower number of bugs, and lower maintenance cost. (Python zen-masters refer to it as readability.)

You can get a more understandable Photonai code for your machine-learning project from a clone-able GitHub repo. I recommend monthly updates, as Photonai is an on-going project.

The Startup

Medium's largest active publication, followed by +716K people. Follow to join our community.

Bruce H. Cottman, Ph.D.

Written by

Physicist, Machine Learning Scientist and constantly improving Software Engineer. I extrapolate the future from emerging technologies.

The Startup

Medium's largest active publication, followed by +716K people. Follow to join our community.

Bruce H. Cottman, Ph.D.

Written by

Physicist, Machine Learning Scientist and constantly improving Software Engineer. I extrapolate the future from emerging technologies.

The Startup

Medium's largest active publication, followed by +716K people. Follow to join our community.

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store