Lovely Python config handling using Pydantic

Anders Wiklund
10 min read · Feb 23, 2023



Introduction

I’m impressed with what you can do with the third party Pydantic package when it comes to flexible configuration handling. I’m going to introduce you to this wonderful world and show you what you can do. I won’t cover everything it can do, but what I cover will hopefully pique an interest in your coding heart, so that you want to try this out for yourself.

Different ways of assigning values to configuration parameters:

  • Directly assigned value, which can be updated by secrets or env files.
  • Values of previously defined parameters can be used to combine new parameter values.
  • Values can be extracted from environment variables defined in the OS.
  • Values can be read from Pydantic secrets, which are files.
  • Values can be read from the .env file and from named files, for example prod.env, when they exist.
  • Values can be read from Docker secrets when you are running in a Docker container.
  • Read values can be validated or changed into something else if needed.

If you then combine this with other techniques, like inheritance, you can accomplish quite interesting things. That is what this article is about.
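As a small appetizer, here is a minimal, self-contained sketch (with made-up parameter names, not part of the example code) of the kind of behavior the list above describes: a default value, a value picked up from an OS environment variable, and a value read from a .env file, all through one settings class.

# Minimal sketch (made-up parameter names) of pydantic-settings behavior:
# defaults, OS environment variables and a .env file all feed the same class.
from pydantic_settings import BaseSettings, SettingsConfigDict


class DemoSettings(BaseSettings):
    model_config = SettingsConfigDict(env_file='.env',
                                      env_file_encoding='utf-8')

    # Used as-is unless overridden by an env variable or the .env file.
    api_timeout: float = 9.05
    # Picked up from the OS environment variable DB_SERVER when defined,
    # or from a DB_SERVER line in the .env file.
    db_server: str = 'localhost:3306'


config = DemoSettings()
print(config.api_timeout, config.db_server)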

To use Pydantic you need to have at least Python v3.7 installed, and you can install the required packages like this:

Pydantic has just released v2 and it's not backwards compatible. No worries, I have updated this example to be compliant with the new version.

pip install pydantic-settings python-dotenv colorama

Example code

The structure of the example code is to have one common setup and adaptations for the different environments: dev, test, stage, and prod. The idea is that every server should belong to one of these environments. This is accomplished by defining an environment variable named ENVIRONMENT on each server and setting it to one of these four values.

The reason is that you might have access to different DB servers in different environments. They could use different ports, connection parameters might differ, etc.
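If you are unsure whether the variable is in place on a given server, a quick check at a Python prompt is enough (the fallback text here is just an illustration, not part of the example code):

# Quick check that the ENVIRONMENT variable is defined for this user/server.
# The expected value is one of: dev, test, stage, prod.
import os

print(os.getenv('ENVIRONMENT', '>>> ENVIRONMENT is not defined <<<'))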

I’m going to show you code snippets and then try to explain what they do. Let’s dive in.

configurator.py

Insert this code at the beginning of the file:

# BUILTIN modules
import os
import sys
import site
from typing import Union, Type, Tuple

# Third party modules
from pydantic import Field, computed_field
from pydantic_settings import (BaseSettings, SettingsConfigDict,
                               PydanticBaseSettingsSource)

# Constants
USER_BASE = site.getuserbase()
""" This is required when programs are frozen."""
MISSING_ENV = '>>> undefined ENV parameter <<<'
""" Error message for missing environment variables. """
MISSING_SECRET = '>>> missing SECRETS file <<<'
""" Error message for missing secrets file. """
SECRETS_DIR = ('/run/secrets'
               if os.path.exists('/.dockerenv')
               else f'{site.USER_BASE}/secrets')
""" This is where your secrets are stored (in Docker or locally). """
PLATFORM = {'linux': 'Linux', 'linux2': 'Linux',
            'win32': 'Windows', 'darwin': 'MacOS'}
""" Known platforms in my end of the world. """
ENVIRONMENT = os.getenv('ENVIRONMENT', MISSING_ENV)
""" Define environment. """

# --------------------------------------------------------------
# This needs to be done before the Base class gets evaluated, and
# to avoid getting five UserWarnings that the path does not exist.
#
# Create the directory if it does not already exist. When running
# inside Docker, skip it (Docker handles that just fine on its own).
#
if not os.path.exists('/.dockerenv'):
    os.makedirs(SECRETS_DIR, exist_ok=True)

The MISSING_* constants provide a nice touch that I will talk more about later.

The SECRETS_DIR constant is Docker-aware when defining the path to where secrets are stored.

The last row, os.makedirs, is a useful command. It will create the complete path structure if it doesn't already exist.
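If you want to see where these paths end up on your own machine, a quick check at a Python prompt tells you (the example output is what I would expect on a local Linux box, not inside Docker):

# Inspect where the user base and the derived secrets directory point to.
import os
import site

print(site.getuserbase())             # e.g. /home/<user>/.local
print(os.path.exists('/.dockerenv'))  # True only inside a Docker container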

Append this to the file:

# ------------------------------------------------------------------------
#
class Common(BaseSettings):
    """ Common configuration parameters shared between all environments.

    Read configuration parameters defined in this class, and from
    ENVIRONMENT variables and from the .env file.

    The source priority is changed (from default) to the following
    order (from highest to lowest):
      - init_settings
      - dotenv_settings
      - env_settings
      - file_secret_settings

    The following environment variables should already be defined:
      - HOSTNAME (on Linux servers only - set by OS)
      - COMPUTERNAME (on Windows servers only - set by OS)
      - ENVIRONMENT (on all servers)

    Path where your <environment>.env file should be placed:
      - linux: /home/<user>/.local
      - darwin: /home/<user>/.local
      - win32: C:\\Users\\<user>\\AppData\\Roaming\\Python

    Path where your secret files should be placed:
      - linux: /home/<user>/.local/secrets
      - darwin: /home/<user>/.local/secrets
      - win32: C:\\Users\\<user>\\AppData\\Roaming\\Python\\secrets
    """
    model_config = SettingsConfigDict(extra='ignore',
                                      secrets_dir=SECRETS_DIR,
                                      env_file_encoding='utf-8',
                                      env_file=f'{USER_BASE}/.env')

    # Constant parameters.
    routingDbPort: int = 8012
    trackingDbPort: int = 8006
    mqServer: str = 'P-W-MQ01'
    apiTimeout: tuple = (9.05, 60)

    # Environment dependent parameters.
    env: str = ENVIRONMENT
    platform: str = PLATFORM.get(sys.platform, 'other')

    # Secrets dependent parameters.
    serviceApiKey: str = Field(MISSING_SECRET, alias='service_api_key')

    @computed_field
    @property
    def hdrData(self) -> dict:
        """ Return updated API header (added serviceApiKey secret).

        :return: Updated API header.
        """
        return {'Content-Type': 'application/json',
                'X-API-Key': f'{self.serviceApiKey}'}

    @computed_field
    @property
    def server(self) -> str:
        """ Return local server name stripped of possible domain part.

        :return: Server name in upper case.
        """
        name = ('COMPUTERNAME' if sys.platform == 'win32' else 'HOSTNAME')
        return os.getenv(name, MISSING_ENV).upper().split('.')[0]

    @classmethod
    def settings_customise_sources(
            cls,
            settings_cls: Type[BaseSettings],
            init_settings: PydanticBaseSettingsSource,
            env_settings: PydanticBaseSettingsSource,
            dotenv_settings: PydanticBaseSettingsSource,
            file_secret_settings: PydanticBaseSettingsSource,
    ) -> Tuple[PydanticBaseSettingsSource, ...]:
        """ Change source priority order (env trumps environment). """
        return (init_settings, dotenv_settings,
                env_settings, file_secret_settings)

This is the Common class that the different environments are going to inherit from. The documentation mentions default paths for some platforms. To find out which one you are on, execute the following commands at a Python prompt:

>>> import sys
>>> sys.platform

The class starts with a model_config declaration (it's a “reserved” word in the Pydantic lingo). Defining the secrets_dir attribute unleashes the handling of secrets.

All you need to do is create a file in the secrets_dir directory with the alias name (case sensitive). My example here is service_api_key. If no file with the same name as the alias exists, the class attribute will use the default value, the MISSING_SECRET constant.
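As an illustration, a secrets file is nothing more than a plain file whose name matches the alias and whose content is the value. A minimal sketch of creating one for the local, non-Docker case (the key value is obviously made up):

# Minimal sketch: create the secrets file for the service_api_key alias.
# The file name must match the alias; the file content becomes the value.
import site
from pathlib import Path

secrets_dir = Path(site.getuserbase()) / 'secrets'   # local (non-Docker) path
secrets_dir.mkdir(parents=True, exist_ok=True)
(secrets_dir / 'service_api_key').write_text('my-made-up-api-key')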

This technique makes it a lot easier during troubleshooting when things are not working. The same technique can be used for identifying missing environment variables.

You can use any primitive or more complex Python type that you need; there are no restrictions.

The computed_field decorator is used for manipulating the hdrData and server class attribute values. You can do a lot with this decorator. I’m only showing you a glimpse of its capabilities. The server attribute removes a possible domain part of the computer name (for example if the server’s name returned from the OS is oden.gmail.com, the returned result will be ODEN). This might be useful in a corporate environment, normally not as much at home.
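To show the decorator in isolation, here is a minimal, self-contained sketch (class and field names are made up for the example); note that computed fields also show up in model_dump(), which is why hdrData and server appear in the test output later:

# Minimal sketch of a computed field: a derived, read-only value that is
# calculated from another field and included in model_dump().
from pydantic import BaseModel, computed_field


class Demo(BaseModel):
    host: str = 'oden.gmail.com'

    @computed_field
    @property
    def server(self) -> str:
        """ Return the name in upper case without the domain part. """
        return self.host.upper().split('.')[0]


demo = Demo()
print(demo.server)        # -> 'ODEN'
print(demo.model_dump())  # -> {'host': 'oden.gmail.com', 'server': 'ODEN'}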

The internal class method settings_customise_sources is used for changing the source priority order, i.e. in which order the different data sources are parsed. I have switched the order of the dotenv_settings and env_settings parameter parsing. The reason for this is explained at the end of this article.
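To make the effect concrete, here is a sketch (with made-up values) of what the changed order means in practice: when the same parameter is defined both as an OS environment variable and in the .env file, the .env value now wins, because dotenv_settings is listed before env_settings.

# Sketch (made-up values): MQSERVER defined both in the OS environment and
# in the .env file. With the customised source order, the .env value wins.
import os

os.environ['MQSERVER'] = 'from-os-environment'
# The .env file contains the line: MQSERVER='from-dotenv-file'

from configurator import config

print(config.mqServer)   # -> 'from-dotenv-file'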

Now that we have the embryo of a very powerful configuration handler, it's time to see what I meant by multiple environments and inheritance a bit earlier.

Append this code to the file:

# ------------------------------------------------------------------------
#
class Dev(Common):
    """ Configuration parameters for DEV environment.

    Values from dev.env supersede previous values when the file exists.
    """
    model_config = SettingsConfigDict(env_file=f'{USER_BASE}/{Common().env}.env')

    mqServer: str = 'localhost'
    dbServer: str = 'localhost:3306'
    apiRoot: str = 'http://localhost'
    mongoUrl: str = Field(MISSING_SECRET, alias=f'mongo_url_{Common().env}')


# ------------------------------------------------------------------------
#
class Test(Common):
    """ Configuration parameters for TEST environment.

    Values from test.env supersede previous values when the file exists.
    """
    model_config = SettingsConfigDict(env_file=f'{USER_BASE}/{Common().env}.env')

    dbServer: str = 't-l-docker01:3306'
    apiRoot: str = 'http://internal_api_test_host'
    mongoUrl: str = Field(MISSING_SECRET, alias=f'mongo_url_{Common().env}')


# ------------------------------------------------------------------------
#
class Stage(Common):
    """ Configuration parameters for STAGE environment.

    Values from stage.env supersede previous values when the file exists.
    """
    model_config = SettingsConfigDict(env_file=f'{USER_BASE}/{Common().env}.env')

    routingDbPort: int = 8013
    trackingDbPort: int = 8007
    dbServer: str = 't-l-docker01:3307'
    apiRoot: str = 'http://internal_api_stage_host'
    mongoUrl: str = Field(MISSING_SECRET, alias=f'mongo_url_{Common().env}')


# ------------------------------------------------------------------------
#
class Prod(Common):
    """ Configuration parameters for PROD environment.

    Values from prod.env supersede previous values when the file exists.
    """
    model_config = SettingsConfigDict(env_file=f'{USER_BASE}/{Common().env}.env')

    dbServer: str = 'ocsemysqlcl:3306'
    apiRoot: str = 'http://internal_api_prod_host'
    mongoUrl: str = Field(MISSING_SECRET, alias=f'mongo_url_{Common().env}')

The code above contains four classes where each symbolizes a specific environment.

Every derived class has an inner model_config declaration. This is used for defining the name of an environment-specific env file that can be used for changing already set config values.

You can use the value of an attribute from the Common class when you create an attribute in a derived class. For example, look at mongoUrl in Dev.
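The alias expression is evaluated once, at class-definition time, so with ENVIRONMENT set to dev it resolves like this (a small illustration of the mechanism, not extra functionality):

# With ENVIRONMENT=dev, the alias expression in Dev resolves to 'mongo_url_dev',
# so the secret is read from <SECRETS_DIR>/mongo_url_dev when that file exists.
from configurator import Common

print(f'mongo_url_{Common().env}')   # -> 'mongo_url_dev'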

How can this be used then? What are the benefits?

For example:

  • The name of the database server might be different in a test environment compared to a production environment (see the dbServer attribute for Test and Prod).
  • They might exist on the same server, but on different ports (see the dbServer attribute for Test and Stage).
  • A service might have a default port on all platforms, except on one where it's different (see the routingDbPort attribute; it exists in Common and Stage only).
  • Passwords might be different for the same service in different environments (can be handled with an environment variable or a secret).
  • You can have Docker secrets that you can access with your config secrets configuration.

Here’s the last part that ties it all together. Append this code to the file:

# ------------------------------------------------------------------------

# Translation table between ENVIRONMENT value and their classes.
_setup = dict(
    dev=Dev,
    test=Test,
    prod=Prod,
    stage=Stage
)

# Validate and instantiate specified environment configuration.
config: Union[Dev, Test, Prod, Stage] = _setup[ENVIRONMENT]()

That the variable _setup starts with an underscore is a widely used Python convention meaning that it should be considered private; developers should not use underscore-prefixed variables, functions, methods, etc. from imported modules.

The instantiation of the config object happens by extracting the ENVIRONMENT environment variable value (you need to have it defined on each platform that you use) and using that value as a key in the _setup dict. Notice the parentheses at the very end of that line; that's the actual instantiation.
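The net effect is that the concrete type of config depends entirely on the ENVIRONMENT variable on the machine that imports it. A quick sketch (assuming ENVIRONMENT=dev) of what that looks like:

# Sketch (assuming ENVIRONMENT=dev): the imported config object is already
# an instantiated Dev instance, ready for dot-referencing.
from configurator import config

print(type(config).__name__)   # -> 'Dev'
print(config.dbServer)         # -> 'localhost:3306'
print(config.routingDbPort)    # -> 8012 (inherited from Common)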

Testing the configuration

This is a small test script that you can use to test your configurations and see what they look like on different platforms.

It uses color to make it clearer when the config is OK, or when it’s not correct.

test_config.py

Insert this code in the file:

# BUILTIN modules
import os
import sys
import json
import argparse

# Third party modules
import colorama


# ---------------------------------------------------------
#
def _display_status(env: str, secret: str, json_dump: str) -> str:
    """ Show valid parameters in green and missing values in RED.

    :param env: Error message for missing environment variables.
    :param secret: Error message for missing secrets files.
    :param json_dump: The config object as a json string.
    """
    lines = []
    color = {env: f'{colorama.Fore.RED}{env}{colorama.Fore.GREEN}',
             secret: f'{colorama.Fore.RED}{secret}{colorama.Fore.GREEN}'}

    for row in json_dump.split('\n'):
        lines.append(
            f'{colorama.Fore.GREEN}{row}{colorama.Style.NORMAL}'
            .replace(secret, color[secret])
            .replace(env, color[env])
        )

    return '\n'.join(lines)


# ---------------------------------------------------------
#
def run():
    """ A utility script to test the configuration with different environments.

    Usage: verify_config [-h] [{dev,test,prod,stage}]

    If no environment is given, the defined operating system environment
    variable "ENVIRONMENT" will be used.

    A check that it's defined in the operating system is also done.
    """
    form = argparse.ArgumentDefaultsHelpFormatter
    description = (
        'A utility script to test the configurator with different '
        'environments. Default value is the current ENVIRONMENT.')
    parser = argparse.ArgumentParser(description=description,
                                     formatter_class=form)
    parser.add_argument(dest='environment', nargs='?',
                        help="Specify environment to use",
                        choices=['dev', 'test', 'prod', 'stage'])
    args = parser.parse_args()

    # Make sure an environment is already defined.
    if not os.getenv('ENVIRONMENT'):
        print("ERROR: Environment variable 'ENVIRONMENT' "
              "is not defined for this user!")
        sys.exit(1)

    # To be able to test different environments, we need
    # to set this BEFORE we import the config module.
    if args.environment:
        os.environ['ENVIRONMENT'] = args.environment

    from configurator import config
    from configurator import MISSING_ENV
    from configurator import MISSING_SECRET

    # Show valid values in green and missing values in RED.
    print(_display_status(
        MISSING_ENV, MISSING_SECRET,
        json.dumps(indent=4, sort_keys=True,
                   obj=config.model_dump())))


# ---------------------------------------------------------

if __name__ == "__main__":
    run()

When I run the test_config.py script like this:

python test_config.py dev

I get the following result:

If you look at the content of the mongoUrl key you will see what I meant earlier when I talked about easier troubleshooting: it shows the MISSING_SECRET marker instead of a real value.

If I create the missing mongo_url secrets file the result looks like this:

Much better.

The last feature I want to talk about is what I would characterize as a temporary override, for testing or when you need to make a quick fix and you don't want to start changing secrets files or environment variables. There is a usable solution for this as well.

We can use a temporary env file that you place where site.USER_BASE points to (see the class documentation in the configurator.py file if you don't want to run the command yourself). The name of the file relates to the active environment. Since my current environment is ‘dev’ I can place a dev.env file in that directory. Let's say that I want to test working against the stage database; then the content of that file will look like this:

DBSERVER='t-l-docker01:3307'

Then when I run the test_config.py script again I get the following result:

When you are done testing just remove the file, or comment out the inserted line. Then when you start your program it will access your development DB as usual again.

How to use it

The test program dumps out the config object as a dict, but that’s not how you use it in your coding. The quick answer is that you import it and then you dot-reference it. But I think a small example is nicer.

# BUILTIN modules
from pprint import pprint

# Third party modules
import requests

# Local modules
from configurator import config

response = requests.get(url='http://localhost:8100/health',
                        headers=config.hdrData, timeout=config.apiTimeout)

if response.status_code == 200:
    pprint(response.json(), width=2)

Output from example:

{'name': 'FastAPI-MongoDB-example',
 'resources': [{'name': 'MongoDb',
                'status': True}],
 'status': True,
 'version': '0.9.0'}

The output is from an API that I'm writing another article series about: an extensive example of API development using FastAPI.

I hope you enjoyed this article and got inspired to test these techniques yourself. Remember, if you like this article don’t hesitate to clap (I mean digitally 😊).

Happy coding in Gotham.

/Anders
