Python Configuration Management using Hydra by Meta

Pragyan Subedi
Coinmonks
3 min read · Feb 5, 2024


This article is a concise explanation of the Hydra open-source configuration management package by Meta/Facebook Research.

Main Idea

“Facebook AI’s open source Hydra framework lets users compose and override configurations in a type-safe way (validated against user-provided schemas). Hydra also offers abstractions for launching to different clusters and running sweeps and hyperparameter optimization without changes to the application’s code. This greatly reduces the need for boilerplate code and allows researchers and engineers to focus on what really matters.” — Meta AI Blog

Main advantages of Hydra

Hydra has some distinct advantages over traditional configuration management in Python:

  1. With Hydra, developers do not need to write boilerplate code for command-line flags, configuration file loading, directory paths, logging, and so on.
  2. Configurations can be set dynamically and can be overridden from the command line as needed.
  3. It has a pluggable architecture that allows developers to integrate Hydra with other infrastructures.
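
To make the first point concrete, here is a rough sketch of the hand-written boilerplate a project without Hydra typically carries: one argparse flag per setting, each with its own default. The flag names and the parse_args helper are illustrative assumptions, not taken from any particular codebase.

```python
import argparse


def parse_args(argv=None):
    """Hypothetical hand-rolled CLI: every setting needs its own flag."""
    parser = argparse.ArgumentParser(description="Training script without Hydra")
    parser.add_argument("--n-epochs", type=int, default=20)
    parser.add_argument("--batch-size", type=int, default=128)
    parser.add_argument("--n-layers", type=int, default=3)
    return parser.parse_args(argv)


# Passing an explicit list stands in for real command-line arguments.
args = parse_args(["--n-epochs", "50"])
print(args.n_epochs, args.batch_size, args.n_layers)  # 50 128 3
```

Every new setting means another add_argument call here, whereas with Hydra the settings live in a YAML file and need no extra parsing code.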

How to set up Hydra to handle configuration?

Here’s a quick breakdown of how to set up Hydra to handle configurations:

  1. First, install Hydra using the following command:
pip install hydra-core --upgrade

  2. Next, create a YAML configuration file that holds all of your configuration values. As a best practice, keep your configuration files inside a conf folder.

Here’s an example conf/config.yaml file:

hyperparameters:
  N_EPOCHS: 20
  BATCH_SIZE: 128
  N_LAYERS: 3

  3. Import hydra and wrap the main function with the hydra.main() decorator. The decorator expects config_path to be the folder holding your configuration files and config_name to be the name of the YAML configuration file (without the .yaml extension).

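# main.py
import hydra


# version_base=None (Hydra 1.2+) uses the current version's defaults
# and avoids the compatibility warning shown when it is omitted.
@hydra.main(version_base=None, config_path="conf", config_name="config")
def main(cfg) -> None:
    # Access the cfg variable here
    print(cfg)


if __name__ == "__main__":
    main()

Output: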

{'hyperparameters': {'N_EPOCHS': 20, 'BATCH_SIZE': 128, 'N_LAYERS': 3}}

If you run into errors, check that your folder structure matches the layout used above:

conf/
    config.yaml
main.py
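
Once the script runs, any value in the file can be overridden at launch with a dotted key=value argument, for example python main.py hyperparameters.N_EPOCHS=50. The sketch below mimics the effect of such an override on a plain dictionary. It is not Hydra's implementation: apply_override is a hypothetical helper that only handles integer values, while Hydra's own override parser supports far more.

```python
import copy


def apply_override(config, override):
    """Hypothetical helper: apply one 'a.b.c=value' override to a nested dict."""
    dotted_key, raw_value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    updated = copy.deepcopy(config)  # leave the original config untouched
    node = updated
    for name in parents:
        node = node[name]  # raises KeyError if the path does not exist
    if leaf not in node:
        raise KeyError(f"unknown config key: {dotted_key}")
    node[leaf] = int(raw_value)  # this sketch only supports integer values
    return updated


config = {"hyperparameters": {"N_EPOCHS": 20, "BATCH_SIZE": 128, "N_LAYERS": 3}}
print(apply_override(config, "hyperparameters.N_EPOCHS=50"))
```

Like this sketch, Hydra rejects an override for a key that is not already in the config; adding a new key requires prefixing the override with +.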

Handling configuration variables using dataclasses

Since Hydra integrates directly with Python, we can use Python’s dataclasses to define the expected structure and types of the configuration:

  1. Create a config.py file:
# config.py
from dataclasses import dataclass


@dataclass
class Hyperparameters:
    N_EPOCHS: int
    BATCH_SIZE: int
    N_LAYERS: int


@dataclass
class AllConfig:
    hyperparameters: Hyperparameters

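  2. Import the file and register the schema with Hydra’s config store in your main.py:

# main.py
import hydra
from hydra.core.config_store import ConfigStore

from config import AllConfig

# Register the dataclass schema under the name "all_config".
# Note: in recent Hydra versions, conf/config.yaml is only validated
# against this schema if it lists all_config in its defaults list.
cs = ConfigStore.instance()
cs.store(name="all_config", node=AllConfig)


# version_base=None (Hydra 1.2+) uses the current version's defaults.
@hydra.main(version_base=None, config_path="conf", config_name="config")
def main(cfg: AllConfig) -> None:
    # Access the cfg variable here
    print(cfg)


if __name__ == "__main__":
    main()

Output: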

{'hyperparameters': {'N_EPOCHS': 20, 'BATCH_SIZE': 128, 'N_LAYERS': 3}}

If you run into errors, check that your folder structure matches the layout used above:

conf/
    config.yaml
config.py
main.py
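
To see what a typed schema buys you, the sketch below checks a plain dictionary against the same dataclasses by hand, using only the standard library. Hydra (through OmegaConf) performs this kind of type check itself once the schema is applied to the config. The validate helper here is a hypothetical stand-in written for illustration and is much simpler than the real thing.

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class Hyperparameters:
    N_EPOCHS: int
    BATCH_SIZE: int
    N_LAYERS: int


@dataclass
class AllConfig:
    hyperparameters: Hyperparameters


def validate(schema, data):
    """Hypothetical helper: build `schema` from a dict, checking field types."""
    kwargs = {}
    for field in fields(schema):
        if field.name not in data:
            raise KeyError(f"missing key: {field.name}")
        value = data[field.name]
        if is_dataclass(field.type):
            # Nested schema: validate the nested dictionary recursively.
            value = validate(field.type, value)
        elif not isinstance(value, field.type):
            raise TypeError(
                f"{field.name} should be {field.type.__name__}, "
                f"got {type(value).__name__}"
            )
        kwargs[field.name] = value
    return schema(**kwargs)


raw = {"hyperparameters": {"N_EPOCHS": 20, "BATCH_SIZE": 128, "N_LAYERS": 3}}
cfg = validate(AllConfig, raw)
print(cfg.hyperparameters.N_EPOCHS)  # 20

bad = {"hyperparameters": {"N_EPOCHS": "twenty", "BATCH_SIZE": 128, "N_LAYERS": 3}}
try:
    validate(AllConfig, bad)
except TypeError as err:
    print(err)  # N_EPOCHS should be int, got str
```

The benefit of the schema is that a mistyped value fails early with a clear message instead of surfacing later inside the training code.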

References:

  • Reengineering Facebook AI’s deep learning platforms for interoperability [Link]
  • Hydra Documentation [Link]
  • Hydra GitHub [Link]
