How to Use Machine Learning for an Optical/Photonics Application in 40 Lines of Code
And you don’t need to be an expert in Machine Learning…
Over the last couple of years, artificial intelligence has found its way into all sorts of applications: medicine, health and fitness, education, video calling, sports… you name it.
Are you wondering whether you can use artificial intelligence techniques in your own research area, but don’t have much idea how to go about it?
Then I say: yes, there is a good chance that you can, and in this article I am going to explain how to apply already developed artificial intelligence techniques to an application of your choice within 40 lines of code.
One more thing before diving into the code: do you need to be an expert in coding to understand this?
No, you don’t. Even a basic knowledge of coding, perhaps a little done at school or college level, is sufficient to get started.
I will show you how to apply artificial intelligence, or more specifically machine learning, to an optical/photonics problem. I have chosen an optical application because my background is in optical engineering. Yours can be different: it can be in chemistry, physics, material science, biology, or any other field. The steps I am going to explain are transferable to all of these research areas. Also, I will be coding in Python, as it is a commonly used language for machine learning applications.
There are various categories of machine learning problems: classification, regression, and clustering, among others. For more details, refer to this link. In this article, I am going to walk you through an example code for a regression problem.
The first step for a machine learning application/problem is to have, or generate, a good, clean dataset. There is a good chance that the application you have in mind does not have a dataset freely available online. So, first, I briefly describe the photonics problem I am considering and generate the dataset for it.
Problem considered
The circular structure on the left shows what the cross-section of a typical hexagonal Photonic Crystal Fiber (PCF) looks like in an optical/photonics problem. Next, I need to decide on the input parameters for this problem. The figure shows five input parameters (in green), but for this article I am only using three of them (wavelength, diameter, pitch) to keep the problem small and simple. For various combinations of the input parameters, I obtain the desired output quantities (in orange) and store these in a pcf_data.xlsx file, as shown below.
Again, I have only considered the effective index as the output, to keep the problem simple. If you want, you can have more output nodes, as shown in the output layer of the figure. Also, I have only taken 20 combinations of input parameters. That is far too few for a typical machine learning problem, but it is sufficient to demonstrate the method in this article.
For your own case, you first need to define the problem and its input and output parameters. Then generate the dataset in CSV/XLSX format, similar to the one shown above; some sort of simulator/software, or even an experimental/fabrication kit, is fine for generating the data. A minimal sketch of how such a file could be put together is shown below.
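The sketch below is only illustrative and makes several assumptions: the column names (wavelength, diameter, pitch, effective_index) are placeholders for whatever your problem uses, and the formula filling the output column is a stand-in for values that, in practice, would come from your mode solver, simulator, or experiment. Note that writing .xlsx files with pandas also requires the openpyxl package.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_samples = 20

# Sample the three input parameters over plausible (assumed) ranges
wavelength = rng.uniform(0.5, 1.8, n_samples)  # operating wavelength
diameter = rng.uniform(0.4, 1.2, n_samples)    # air-hole diameter
pitch = rng.uniform(1.0, 3.0, n_samples)       # hole-to-hole spacing

# Placeholder output: in a real workflow this column comes from a
# simulator or experiment, not from a made-up formula like this one
effective_index = 1.45 - 0.1 * wavelength / pitch - 0.05 * diameter

df = pd.DataFrame({'wavelength': wavelength, 'diameter': diameter,
                   'pitch': pitch, 'effective_index': effective_index})
df.to_excel('pcf_data.xlsx', sheet_name='Sheet1', index=False)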
The second step is to normalize the generated/collected data. The goal of normalization is to bring the values of the numeric columns in the dataset onto a common scale. Here, we use the MinMaxScaler() class from Scikit-learn to rescale each feature individually so that it lies within a given range on the training set, e.g. between zero and one. Note: you may need to install Scikit-learn first, using pip install scikit-learn.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Read the data stored in the Excel file using the pandas library
df = pd.read_excel('pcf_data.xlsx', sheet_name='Sheet1')

# Scale all columns into the range (0, 1)
scaler = MinMaxScaler(feature_range=(0, 1))
scaler.fit(df)
df_scaler = scaler.transform(df)
All the input and output values are now scaled between zero and one, with the minimum and maximum value in every column mapped to 0 and 1, respectively. These scaled values become the inputs to the machine learning model. At the end, we will perform the inverse transform to recover the original values.
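As a quick sanity check (my addition, not part of the original 40 lines), you can verify that inverse_transform() recovers the original data exactly:
import numpy as np

# The inverse transform should reproduce the original table
recovered = scaler.inverse_transform(df_scaler)
print(np.allclose(recovered, df.values))  # expected output: True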
The third step is to tell the code which columns are inputs and which are outputs, and to split the whole dataset into training and test sets. In our case, the first three columns are inputs and the last column is the output. The train_test_split() function is used to split the dataset; the extracted test set will be used to check the accuracy of the model. Here, 10% of the data is held out as the test set.
from sklearn.model_selection import train_test_split

num_inputs = 3
num_outputs = 1

X = df_scaler[:, range(0, num_inputs)]
y = df_scaler[:, range(num_inputs, num_inputs + num_outputs)]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1)
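A quick shape check (again, my addition) confirms the split: with 20 samples and test_size=0.1, you should end up with 18 training rows and 2 test rows.
print(X_train.shape, y_train.shape)  # (18, 3) (18, 1)
print(X_test.shape, y_test.shape)    # (2, 3) (2, 1)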
The fourth step is to define the machine learning model. Here, I use the MLPRegressor() class from Scikit-learn to quickly define the layers and parameters of the model. Shuffling of the data is enabled so that the model is not biased towards any particular inputs. For more details about the various parameters of MLPRegressor(), check the official documentation. The .fit() function trains the model for up to the specified number of epochs/iterations.
from sklearn.neural_network import MLPRegressor

epochs = 1000

mlp = MLPRegressor(shuffle=True, random_state=1, max_iter=epochs)
mlp.fit(X_train, y_train)

print("Training set score: ", mlp.score(X_train, y_train))
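If you want to see whether training actually converged within the iteration budget, the fitted MLPRegressor exposes the per-iteration training loss; an optional check might look like this:
# n_iter_ is the number of iterations actually run,
# loss_curve_ holds the training loss at each iteration
print("Iterations run: ", mlp.n_iter_)
print("Final training loss: ", mlp.loss_curve_[-1])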
The fifth step is to check the predictions on the test set using the model trained in the previous step. Here, inverse_transform() is required at the end to obtain the unscaled values. Because the test set is drawn randomly during train_test_split(), you need to carefully compare the results of the function below with the actual values stored in the test set; a side-by-side comparison is sketched after the code.
import numpy as np

def prediction(data):
    # data should already be scaled
    pred_output = mlp.predict(data)
    final = np.concatenate((data, pred_output.reshape(-1, 1)), axis=1)
    return scaler.inverse_transform(final)

print(prediction(X_test))
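One way to do that comparison (my addition) is to unscale the true test targets with the same concatenate-and-inverse-transform trick, then print them next to the predictions:
# Recover the actual (unscaled) test values for comparison
actual = scaler.inverse_transform(np.concatenate((X_test, y_test), axis=1))
predicted = prediction(X_test)

print("Actual effective index:    ", actual[:, -1])
print("Predicted effective index: ", predicted[:, -1])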
Finally, I show how to obtain the output for inputs supplied by the user, rather than from the test set. Let us say the user wants to predict the output for these inputs: [diaBYpitch, pitch, wavelength] → [0.7, 0.8, 1.8]. Our scaler was fitted on four columns in total (three inputs plus one output), so here I append a zero to the user inputs and then do the scaling using scaler.transform(), as we did above. You could append any number instead of zero and it would not affect the output, since we are not retraining the model; the fourth column is only needed so that the shape matches the scaler defined above.
def predict_on_user_input(user_input):
    # Append a placeholder for the output column so the shape matches
    # the four columns the scaler was fitted on, then scale
    user_input = np.append(user_input, 0).reshape(1, -1)
    user_input = scaler.transform(user_input)
    # Keep only the scaled input columns for the prediction
    return prediction(user_input[:, 0:num_inputs])

output = predict_on_user_input([0.7, 0.8, 1.8])
print('output: ', output)
All the code together
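For convenience, here are the snippets from the steps above stitched into a single script, roughly the 40 lines promised in the title:
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Steps 1-2: read the dataset and scale all columns into (0, 1)
df = pd.read_excel('pcf_data.xlsx', sheet_name='Sheet1')
scaler = MinMaxScaler(feature_range=(0, 1))
scaler.fit(df)
df_scaler = scaler.transform(df)

# Step 3: define input/output columns and split into train/test sets
num_inputs = 3
num_outputs = 1
X = df_scaler[:, range(0, num_inputs)]
y = df_scaler[:, range(num_inputs, num_inputs + num_outputs)]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1)

# Step 4: define and train the model
epochs = 1000
mlp = MLPRegressor(shuffle=True, random_state=1, max_iter=epochs)
mlp.fit(X_train, y_train)
print("Training set score: ", mlp.score(X_train, y_train))

# Step 5: predict on the test set and unscale the results
def prediction(data):
    # data should already be scaled
    pred_output = mlp.predict(data)
    final = np.concatenate((data, pred_output.reshape(-1, 1)), axis=1)
    return scaler.inverse_transform(final)

print(prediction(X_test))

# Finally: predict on user-supplied (unscaled) inputs
def predict_on_user_input(user_input):
    user_input = np.append(user_input, 0).reshape(1, -1)
    user_input = scaler.transform(user_input)
    return prediction(user_input[:, 0:num_inputs])

output = predict_on_user_input([0.7, 0.8, 1.8])
print('output: ', output)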
The steps described in this article transfer to any other research area, but of course you need to figure out the problem you are interested in, and perhaps collect the dataset yourself if it is not available online. I hope this article helps you get started with machine learning in Python, even if you have very little coding experience. Cheers!!