Modeling Bee Production of Honey with AI

Luke Yeager
4 min read · Jun 3, 2020

Imagine that the next time you go to the grocery store the entire produce section is empty. No lettuce for your salads, no bananas for your morning smoothie, and no delicious berries to snack on between meals. This reality may come true if bee populations continue to decline.

Bees are responsible for pollinating nearly 35% of all food-producing crops in the world. Without them, global food production would drop sharply and many of the foods we eat every day would become scarce. Unfortunately, bee populations across the world are decreasing.

Not only do bees pollinate the crops and flowers we all enjoy, they are also responsible for the OG sweetener, honey. Fun fact: honey is often said to be the only food that never spoils. However, honey production has been falling since 1998. When bees make less honey, it suggests that they are pollinating fewer and fewer plants. If this downward trend continues, a lot of the world's food resources could be in jeopardy. So it is important to create models to better understand this problem.

Raw Data

One of the best ways to model a simple trend like this is linear regression, which describes the relationship between two variables by fitting a straight line to the data. Building the model takes a few steps: gather the raw data, fit the line, and then use the line to make predictions.

To begin, we need the raw information that we want to model. In this case, I will be using two variables: year and total production. For every year from 1998 to 2012, the total amount of honey produced in the United States was recorded. Plotted as a scatter plot, the data looks like this.

Scatter plot showing bee production of honey.

Modeling the Best Fit Line

It is easy to see roughly where the best fit line should go, but nearly impossible to sketch it accurately by hand. A computer can find it precisely by choosing the slope and intercept that make the total squared distance between the line and the data points as small as possible.
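As a rough sketch of what "best fit" means, the slope and intercept of the least-squares line can be computed directly from the data. The helper name `least_squares_line` and the toy points are made up for illustration; this is not the code from the article, which relies on sklearn.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy points that lie exactly on y = 2x + 1 (not honey data).
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
slope, intercept = least_squares_line(xs, ys)
```

Because the toy points sit exactly on a line, the function recovers that line's slope and intercept; with noisy real data it returns the line that comes closest overall.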

I used the built-in linear_model module from Python's sklearn library to fit this line. Below you can see the code written to create the model.

Code to create the best fit line

The important part to notice is that I take the raw data points and assign each list to a variable: year is the x-value and total production is the y-value. Then I fit those variables with sklearn's LinearRegression and plot the resulting line. After running the code, the result is the straight line that best captures the downward trend.
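A minimal sketch of the steps just described, assuming scikit-learn and NumPy are installed. The production values are synthetic placeholders in arbitrary units, not the real honey figures, and `plot_fit` is a hypothetical helper that additionally assumes matplotlib is available.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Placeholder data in arbitrary units: NOT the real honey figures.
years = np.arange(1998, 2013)  # 1998 through 2012 inclusive
production = 100.0 - 2.0 * (years - 1998) + 3.0 * np.sin(years)

# sklearn expects a 2-D array of features: one row per year, one column.
X = years.reshape(-1, 1)
y = production

model = LinearRegression()
model.fit(X, y)
fitted = model.predict(X)  # height of the best fit line at each year


def plot_fit(years, production, fitted):
    """Draw the scatter plot with the best fit line on top (hypothetical helper)."""
    import matplotlib.pyplot as plt  # imported here so the fit runs without it

    plt.scatter(years, production, label="data")
    plt.plot(years, fitted, label="best fit line")
    plt.xlabel("Year")
    plt.ylabel("Total production")
    plt.legend()
    plt.show()
```

Calling `plot_fit(years, production, fitted)` draws the points and the line together. The fitted slope is available as `model.coef_[0]` and the intercept as `model.intercept_`.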

The best fit line for the information

Future Predictions

Describing the existing data is useful, but one of the best parts of linear regression is being able to predict the future. It is obvious that there is a downward trend, but specific numbers are more helpful when you actually want to use the data.

So the next part of the code is dedicated to extending the best fit line to whatever year you want.

Code written to increase the X-value of the graph

The added code creates a new line that follows the same slope as the best fit line. In this example, the line continues until 2030. To make this happen, I create a new array that starts the year after the data ends, 2013, and ends in my desired year, 2030. The model then predicts a value for each of those years.
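The extension step might look something like the sketch below, again assuming scikit-learn and NumPy. The data is the same kind of synthetic placeholder in arbitrary units (not the real honey figures), and the variable names are illustrative rather than taken from the article.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Placeholder data in arbitrary units: NOT the real honey figures.
years = np.arange(1998, 2013)
production = 100.0 - 2.0 * (years - 1998) + 3.0 * np.sin(years)

model = LinearRegression().fit(years.reshape(-1, 1), production)

# New x-values: the year after the data ends through the target year.
last_year = 2030
future_years = np.arange(2013, last_year + 1).reshape(-1, 1)
future_production = model.predict(future_years)

# Print the predicted value for each future year.
for year, value in zip(future_years.ravel(), future_production):
    print(f"{year}: {value:.1f}")
```

Changing `last_year` lengthens or shortens the prediction. Plotting `future_years` against `future_production` on the same axes as the original fit draws the prediction line as a continuation of the best fit line.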

Graph of the new prediction line

The new orange line shows what the model predicts honey production will be each year. Of course, the prediction isn't going to be perfect, since a straight line ignores every variable other than time, but it gives a simple, easy-to-read estimate of where the trend is heading.

What's really cool about this model is that I can shrink or extend the prediction line to whatever length I want. If I need to look even further into the future, I can change the end year of the array to a later date.

This project may not seem very useful, but it shows the basics of linear regression and how it turns data into readable models. Models like this can be useful whenever you need to visualize the relationship between two variables or predict how a trend will continue in the future.

Hopefully, this model does not end up coming true, or else the way our agricultural system is set up right now could seriously crumble.

Next time you see a bee, show a little respect and remember that it helps produce a large share of the food you eat.

If you want to check out some of my other work, click on the links below.

LinkedIn

https://www.linkedin.com/in/luke-yeager-371480194/

YouTube

https://www.youtube.com/channel/UChiAG2G2LFhn5DSNmGcvcSA?
