Small sawmill data science. Ch.1

Sep 15, 2019


The problem with small production enterprises is that they believe their knowledge of the market and their intuition are strong enough to steer around any obstacle on their business journey. With that attitude, the boat will sooner or later run aground. Sometimes the owner turns to consultants for help, and this is where we step in.

Consider a small softwood sawmill with equipment as old as the hills. Annual production is around 8,400 m³. The owner wants to grow the company and buys new equipment (actually not so new: it was a 1982 ARI gang saw), setting a goal of 12,000 m³ per year, only to sit and watch his factory keep drifting like a ship with no wind to fill its sails.

When we got a chance to look at the premises, we were skeptical to the bone. What would we find in this remote place that looked like a cemetery of used machines? How would we collect data? What kind of trends could we expect from the old woodworkers?

First things first: we started from scratch, which meant preparing a process map and forming early-stage hypotheses about what could anchor future performance growth. We got a rough approximation of the sawmill process soon enough:

Looking at the picture, we thought it would be good to start at the very end of the process: the green boards themselves. There was not much data to begin with, so we asked the mill supervisor to record the thickness, width, and length of the boards every working hour. He promised to take measurements 10 times a day, in series of 5 measurements per parameter. This left us with 50 measurements of each dimension per day. We waited patiently and got this at the end of the month:
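To make the sampling scheme concrete, here is a minimal Python sketch of how such readings could be grouped into subgroups and summarized. It is not the code we ran at the mill: the thickness values are made up for illustration, and the names `thickness_rounds` and `summarize_subgroups` are hypothetical. It assumes each measuring round yields 5 readings of one dimension.

```python
from statistics import mean

# Hypothetical thickness readings in mm (illustrative only, NOT the mill's data).
# Each inner list is one measuring round: 5 boards measured in a row.
thickness_rounds = [
    [25.1, 25.3, 24.9, 25.0, 25.2],
    [25.4, 25.0, 25.1, 24.8, 25.2],
    [24.9, 25.1, 25.0, 25.3, 25.2],
]

ROUNDS_PER_DAY = 10    # measuring rounds promised by the supervisor
BOARDS_PER_ROUND = 5   # readings per dimension in each round

# Readings collected per dimension (thickness, width or length) in one day.
readings_per_parameter_per_day = ROUNDS_PER_DAY * BOARDS_PER_ROUND


def summarize_subgroups(rounds):
    """Return a (mean, range) pair for every subgroup of readings."""
    return [(mean(r), max(r) - min(r)) for r in rounds]


summaries = summarize_subgroups(thickness_rounds)
for number, (avg, spread) in enumerate(summaries, start=1):
    print(f"round {number}: mean={avg:.2f} mm, range={spread:.2f} mm")
```

Keeping each round as its own subgroup matters later: control charts compare the variation within a round to the variation between rounds.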

We used Google Sheets to collect the data and R to tidy, model, and visualize it. The ‘qcc’ package was what we used in this particular case:
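For readers who want to see what a control chart computes, below is a rough Python sketch of Shewhart X-bar and R chart limits for subgroups of 5 readings. It is not our R code and does not call qcc. The function names `xbar_r_limits` and `out_of_control` and the sample widths are hypothetical. The constants A2, D3 and D4 are the standard tabulated values for a subgroup size of 5.

```python
from statistics import mean

# Standard Shewhart chart constants for subgroups of size n = 5.
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_r_limits(subgroups):
    """Return (lower limit, center line, upper limit) for the X-bar and R charts."""
    xbars = [mean(g) for g in subgroups]            # subgroup means
    ranges = [max(g) - min(g) for g in subgroups]   # subgroup ranges
    grand_mean = mean(xbars)
    r_bar = mean(ranges)
    return {
        "xbar": (grand_mean - A2 * r_bar, grand_mean, grand_mean + A2 * r_bar),
        "range": (D3 * r_bar, r_bar, D4 * r_bar),
    }


def out_of_control(subgroups, limits):
    """Return indices of subgroups whose mean falls outside the X-bar limits."""
    lcl, _, ucl = limits["xbar"]
    return [i for i, g in enumerate(subgroups) if not lcl <= mean(g) <= ucl]


# Hypothetical board widths in mm (illustrative only, NOT the mill's data).
width_rounds = [
    [100.2, 100.5, 99.8, 100.1, 100.4],
    [100.0, 100.3, 100.1, 99.9, 100.2],
    [100.4, 100.1, 100.0, 100.3, 99.7],
]

limits = xbar_r_limits(width_rounds)
print("X-bar limits:", limits["xbar"])
print("R limits:", limits["range"])
print("Rounds out of control:", out_of_control(width_rounds, limits))
```

A subgroup mean outside the X-bar limits suggests the process has shifted, for example through a worn or misaligned saw, rather than simply showing ordinary board-to-board noise.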




Our mission is to make data science and statistics more user-friendly and affordable for small and medium-sized businesses ~Fr-fr~