Implementing Design of Experiments in Machine Learning

Dr Lim Thou Tin
DataFrens.sg
4 min read · May 10, 2024
Photo by Med Badr Chemmaoui on Unsplash

AI projects call for a systematic, experimental approach, and running experiments well is an important part of a machine learning practitioner’s job. By building experimentation into every initiative, the practitioner can drive innovation and lay the foundation for success.

Design of Experiments (DOE) is an approach, used by data analysts and others, in which you begin with a hypothesis and then systematically change the variables you can control to see their impact on the variables you cannot directly control. In ML, DOE pursues three main objectives, and each can be read against the input-process-output framework used in systems analysis. Here is how the objectives and the framework align:

Objectives of DOE in ML

1. Determining Optimal Combinations of Independent Variables

The goal here is to identify which combinations of input variables (independent variables) yield the best performance when the model is evaluated. This helps in tailoring the input data to enhance model outcomes.
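
As a minimal sketch of this objective, each candidate feature can be treated as a two-level factor (included or excluded) and every non-empty combination evaluated as one run. The dataset, the feature names, and the nearest-centroid scorer below are hypothetical stand-ins chosen to keep the example self-contained; in a real project the score would come from validating the actual model.

```python
from itertools import combinations

FEATURES = ["x1", "x2", "noise"]  # hypothetical feature names

# Hypothetical toy data: columns follow FEATURES; y is a binary label.
X = [
    (1.0, 1.2, 5.0), (1.1, 0.9, 1.0), (0.9, 1.0, 3.0), (1.2, 1.1, 4.0),
    (3.0, 3.1, 4.5), (3.2, 2.9, 1.5), (2.9, 3.0, 2.5), (3.1, 3.2, 5.0),
]
y = [0, 0, 0, 0, 1, 1, 1, 1]


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def loo_accuracy(cols):
    """Leave-one-out accuracy of a nearest-centroid classifier on the given columns."""
    correct = 0
    for i in range(len(X)):
        # Class centroids are computed without the held-out row i.
        centroids = {}
        for label in sorted(set(y)):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroids[label] = [sum(r[c] for r in rows) / len(rows) for c in cols]
        point = [X[i][c] for c in cols]
        predicted = min(centroids, key=lambda lab: squared_distance(point, centroids[lab]))
        correct += int(predicted == y[i])
    return correct / len(X)


# Every non-empty combination of features is one experimental run.
results = {}
for size in range(1, len(FEATURES) + 1):
    for cols in combinations(range(len(FEATURES)), size):
        names = tuple(FEATURES[c] for c in cols)
        results[names] = loo_accuracy(cols)

for names, accuracy in sorted(results.items(), key=lambda item: -item[1]):
    print(names, round(accuracy, 2))
```

Exhaustive enumeration needs 2^n - 1 runs for n features, so it is only practical for a handful of candidates; with more features, a screening or fractional design is the usual compromise.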

2. Selecting the Best ML Algorithm

This involves experimenting with different machine learning algorithms to find the one that best suits the needs of a particular problem. The focus is on comparing how different algorithms (as part of the process) affect the outputs.
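
A fair comparison holds the inputs and the evaluation procedure fixed and varies only the algorithm. The sketch below assumes a synthetic one-dimensional dataset and two deliberately simple, hand-written learners (a majority-class baseline and a 1-nearest-neighbour classifier); both are illustrative stand-ins, not recommendations. The point is that every candidate is scored on exactly the same folds.

```python
import random
from statistics import mean


def make_data(n=60, seed=0):
    """Hypothetical data: label is 1 when x > 0.5, with about 10% of labels flipped."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < 0.1:
            label = 1 - label
        data.append((x, label))
    return data


def majority_baseline(train):
    labels = [label for _, label in train]
    guess = max(sorted(set(labels)), key=labels.count)
    return lambda x: guess


def one_nearest_neighbour(train):
    return lambda x: min(train, key=lambda row: abs(row[0] - x))[1]


def k_fold_indices(n, k, seed=0):
    """Shuffle the row indices once and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(fit, data, folds):
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [row for i, row in enumerate(data) if i not in held_out]
        predict = fit(train)
        scores.append(mean([int(predict(data[i][0]) == data[i][1]) for i in fold]))
    return scores


data = make_data()
folds = k_fold_indices(len(data), k=5)  # the same folds are reused for every algorithm
candidates = {"majority": majority_baseline, "1-NN": one_nearest_neighbour}
report = {name: mean(cross_validate(fit, data, folds)) for name, fit in candidates.items()}
print(report)
```

Reusing the folds removes one source of variation, so differences in the report can be attributed to the algorithm and not to a lucky split.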

3. Optimizing Algorithm Settings

This means adjusting the configuration parameters of the learning algorithm to maximize its performance. It includes fine-tuning hyperparameters so that the process produces better output metrics.
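
Hyperparameter tuning can be framed as a full factorial design: each hyperparameter is a factor, its candidate values are the levels, and every combination is one run. The sketch below assumes a hand-written k-nearest-neighbour classifier on synthetic two-dimensional data; the grid values and data generator are hypothetical.

```python
import random
from itertools import product


def make_points(n, seed):
    """Hypothetical data: label is 1 when x + y > 1, with about 10% of labels flipped."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x, y = rng.random(), rng.random()
        label = int(x + y > 1)
        if rng.random() < 0.1:
            label = 1 - label
        rows.append(((x, y), label))
    return rows


DISTANCES = {
    "manhattan": lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1]),
    "euclidean": lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5,
}


def knn_accuracy(train, valid, k, metric):
    """Validation accuracy of a k-NN majority vote (k is odd, so votes cannot tie)."""
    dist = DISTANCES[metric]
    hits = 0
    for point, label in valid:
        neighbours = sorted(train, key=lambda row: dist(row[0], point))[:k]
        votes = sum(neighbour_label for _, neighbour_label in neighbours)
        hits += int(int(votes * 2 > k) == label)
    return hits / len(valid)


train, valid = make_points(80, seed=1), make_points(40, seed=2)

# Full factorial grid: every level of k crossed with every distance metric.
grid = {"k": [1, 3, 5, 9], "metric": ["manhattan", "euclidean"]}
runs = []
for k, metric in product(grid["k"], grid["metric"]):
    accuracy = knn_accuracy(train, valid, k, metric)
    runs.append({"k": k, "metric": metric, "accuracy": accuracy})

best = max(runs, key=lambda run: run["accuracy"])
print(best)
```

Because the validation set is used to pick the winner, its score is optimistic; a separate held-out test set is needed for an unbiased final estimate. Library routines such as scikit-learn's GridSearchCV automate the same pattern with cross-validation.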

An Input-Process-Output Framework for DOE

Input

In ML, the inputs are the independent variables, or predictors, that are manipulated in an experiment. DOE systematically varies these inputs to see their effect on the outputs. Identifying the optimal combination of these variables amounts to ensuring that the inputs can produce the best possible outputs.

Process

The process in ML consists of the algorithms and computational methods used to train models. In DOE, experimenting with different algorithms and tuning their settings forms a crucial part of optimizing this process. The goal is to determine the most effective process settings that lead to desirable outputs, thus aligning with the input-process-output model where the process is adjusted based on its impact on output.

Output

Outputs in ML are the results obtained from the model, such as prediction accuracy, error rates, or other performance metrics. DOE focuses on how different inputs and processes influence these outputs. Understanding this influence helps in optimizing outputs through better inputs and process configurations.
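
One standard DOE way to quantify this influence is to estimate main effects and interactions from a two-level factorial experiment. The sketch below assumes one input factor (feature scaling off or on) and one process factor (a choice between two algorithms); the factor names and the four accuracy values are invented purely for illustration.

```python
# Hypothetical results of a 2x2 factorial experiment.
# Factor A (input):   feature scaling off (-1) or on (+1)
# Factor B (process): algorithm "linear" (-1) or "tree" (+1)
runs = [
    {"A": -1, "B": -1, "accuracy": 0.70},
    {"A": +1, "B": -1, "accuracy": 0.78},
    {"A": -1, "B": +1, "accuracy": 0.74},
    {"A": +1, "B": +1, "accuracy": 0.76},
]


def effect(runs, contrast):
    """Mean output where the contrast is +1 minus mean output where it is -1."""
    high = [run["accuracy"] for run in runs if contrast(run) > 0]
    low = [run["accuracy"] for run in runs if contrast(run) < 0]
    return sum(high) / len(high) - sum(low) / len(low)


main_A = effect(runs, lambda run: run["A"])
main_B = effect(runs, lambda run: run["B"])
interaction_AB = effect(runs, lambda run: run["A"] * run["B"])

print("main effect of A:", round(main_A, 3))
print("main effect of B:", round(main_B, 3))
print("A x B interaction:", round(interaction_AB, 3))
```

With these made-up numbers, scaling raises accuracy by 0.05 on average, the algorithm switch by 0.01, and the negative interaction (-0.03) indicates that scaling helps less once the second algorithm is used. A one-factor-at-a-time comparison would not reveal that interaction.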

Through DOE, ML practitioners aim to maximize the efficacy and efficiency of ML models by meticulously controlling and varying inputs and processes to achieve the most favorable outputs. This structured experimentation is fundamental in navigating the complexities of ML problems, where straightforward analysis may fall short. By leveraging DOE, practitioners can make informed decisions about how to configure their ML systems to produce reliable and robust models, effectively linking each stage of the input-process-output model.

A Practical Approach to DOE

The following considerations and steps can guide experimentation in AI projects:

1. Prerequisites for Experimenting with AI Projects

Embrace a Culture of Exploration: Promote curiosity and encourage exploration of unconventional ideas within the team.

Acceptance of Risk: Recognize that experimentation comes with risks and that failures are important learning opportunities.

2. Designing the Experiment

Define the Experiment: Clearly establish the scope, objectives, and expected outcomes, articulating the problem or opportunity.

Resource and Technology Preparation: Identify and secure necessary resources, technology, and data needed for the experiment.

Timeline Planning: Develop a timeline with key milestones and deadlines, planning the experiment in manageable phases for flexibility.
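
One way to make the design step concrete is to write the plan down as data before any run takes place. The sketch below is an assumption about what such a record might hold: the class name, its fields, and the example values are all hypothetical. It expands the factors into a full factorial run sheet, repeats each combination, and shuffles the order with a fixed seed, since replication and randomized run order are classic DOE safeguards.

```python
import random
from dataclasses import dataclass
from itertools import product


@dataclass
class ExperimentPlan:
    """Hypothetical record of an experiment's scope, fixed before execution."""
    objective: str
    metric: str
    target: float       # success threshold agreed before running
    factors: dict       # factor name -> list of levels
    replicates: int = 3
    seed: int = 42

    def run_sheet(self):
        """Full factorial design, replicated, in a reproducible random order."""
        names = list(self.factors)
        runs = [
            {"replicate": replicate, **dict(zip(names, levels))}
            for levels in product(*(self.factors[name] for name in names))
            for replicate in range(self.replicates)
        ]
        random.Random(self.seed).shuffle(runs)
        return runs


plan = ExperimentPlan(
    objective="Improve classifier accuracy",  # illustrative values only
    metric="accuracy",
    target=0.80,
    factors={"algorithm": ["logistic", "tree"], "scaling": [False, True]},
)
sheet = plan.run_sheet()
for run in sheet:
    print(run)
```

Here two factors with two levels each and three replicates give 12 runs. Knowing the run count up front also helps with resource and timeline planning.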

3. Conducting the Experiment

Execution: Implement the experiment according to the planned roadmap using the AI model and resources.

Data Collection and Monitoring: Gather relevant data for the AI model and continuously monitor the experiment, collecting user and stakeholder feedback.
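
Execution and data collection can be sketched as a loop that works through the planned runs and logs one row per run. In the sketch below, `run_trial` is a hypothetical stand-in that returns a simulated score instead of training a real model, and the column names are assumptions. Recording the seed and duration alongside the metric keeps each run reproducible and auditable.

```python
import csv
import io
import random
import time


def run_trial(config, seed):
    """Hypothetical stand-in for training and evaluating a model (simulated score)."""
    rng = random.Random(seed)
    base = 0.70 + (0.05 if config["scaling"] else 0.0)
    return base + rng.uniform(-0.02, 0.02)


def execute(run_sheet, out):
    """Run every planned configuration and write one log row per run."""
    columns = ["run_id", "scaling", "seed", "accuracy", "seconds"]
    writer = csv.DictWriter(out, fieldnames=columns)
    writer.writeheader()
    for run_id, config in enumerate(run_sheet):
        seed = 1000 + run_id  # a distinct, recorded seed per run
        started = time.perf_counter()
        accuracy = run_trial(config, seed)
        writer.writerow({
            "run_id": run_id,
            "scaling": config["scaling"],
            "seed": seed,
            "accuracy": round(accuracy, 4),
            "seconds": round(time.perf_counter() - started, 4),
        })


log = io.StringIO()  # an in-memory file; a real run would open a CSV file on disk
planned_runs = [{"scaling": False}, {"scaling": True}, {"scaling": False}, {"scaling": True}]
execute(planned_runs, log)
print(log.getvalue())
```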

4. Analyzing the Outcomes

Evaluate Results: Assess AI model performance against predefined metrics to determine if the objectives were met.

Document Findings: Create detailed documentation of the methodologies, challenges, and insights gained, serving as a resource for future projects.
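
Evaluating against predefined metrics can be sketched as a summary over replicated runs. The threshold, the configuration names, and the scores below are hypothetical, and the decision rule (mean minus one standard deviation must reach the target) is a deliberately conservative assumption, not an established standard.

```python
from statistics import mean, stdev

TARGET_ACCURACY = 0.75  # hypothetical threshold fixed during the design step

# Hypothetical replicated results for two configurations.
results = {
    "baseline": [0.71, 0.73, 0.72, 0.70],
    "tuned": [0.77, 0.79, 0.76, 0.78],
}


def summarise(scores, target):
    """Mean, spread, and an assumed pass rule: mean minus one stdev reaches the target."""
    average, spread = mean(scores), stdev(scores)
    return {
        "mean": round(average, 3),
        "stdev": round(spread, 3),
        "meets_target": average - spread >= target,
    }


summary = {name: summarise(scores, TARGET_ACCURACY) for name, scores in results.items()}
for name, row in summary.items():
    print(name, row)
```

Reporting the spread alongside the mean shows whether a difference between configurations is larger than the run-to-run noise, which is worth recording in the documented findings.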

5. Communicating Results and Follow-up

Communication with Stakeholders: Share outcomes, successes, and lessons learned with stakeholders.

Plan Further Actions: Based on the results, decide on scaling up the AI model, making adjustments, or pursuing new directions.

Continuous Improvement: Treat the experimental process as an ongoing cycle that fosters learning, strategic reflection, and adaptation, pushing projects towards success.

These steps create a framework for methodical experimentation in AI projects. Its focus on continuous improvement and strategic adaptation contributes to a project's success.

Dr Lim Thou Tin
DataFrens.sg

An IT & business strategist with a doctorate in Knowledge Management & Intelligent Systems. Experienced in corporate IT & educator at global institutions.