Purpose: Searching for pulsars is a labor-intensive process that requires experienced astronomers and trained volunteers to classify candidates. In this article, we implement machine learning techniques to facilitate the process.
Materials and methods: The proposed work will be developed and tested on a sample of pulsar candidates collected during the High Time Resolution Universe survey (HTRU-1). Each pulsar candidate in the database is described by 30 features and a class label. The class label identifies a pulsar (1) or a non-pulsar candidate (0). We will build and train 18 different classifiers on a training set balanced by under-sampling the majority class. Each classifier will be tuned by searching for the point in hyper-parameter space that maximizes its recall score. Finally, the classifiers will be compared and evaluated based on their recall scores and false positive rates.
Hardware: We train and evaluate our models on a workstation equipped with an Intel(R) Core(TM) i7-8700 (12 logical CPUs @ 3.70 GHz) and an NVIDIA GeForce RTX 2080.
Note: In case you're starting from scratch, I advise you to follow this article and install all the necessary libraries. Additionally, the model design and selection scripts are explained in detail here. Finally, it will be assumed that the reader is familiar with Python, Pandas, and Scikit-learn. The whole contents of this article can be found on my GitHub. You're welcome to fork it.
Notation: Bold text will represent either a list, dictionary, tuple, a Pandas DataFrame object, a figure, or a script.
This notation will be used to represent parameters in an object or command lines to run in the terminal.
Lately, I’ve been having an itch for analyzing data. I have no personal relationship with most of the data I collect and build models with, so at the end of the day, once I’m satisfied with the results, I move on and take on new challenges. This life of analyzing random data has led me down a stray path. As a result, I started to think about old projects I was involved with earlier in my career, so I contacted an old friend of mine and asked him for data. He’s working on it.
I decided to Google whether anyone had already automated the pulsar classification task I used to work on back when I was an undergraduate. This task consisted of inspecting pulsar candidate profiles similar to the one shown in Figure 1 and rating them on a scale from 1 to 10. It turns out I came into the game a little too late, boys. Multiple people, smarter than me, have been working on this problem for years, but whatever, let’s give it a shot.
On a weekly basis, one of my duties was to classify 300 of these pulsar candidate profiles. That might not sound like a lot, but if on average you take two minutes to inspect each profile, that adds up to 10 hours a week. At times I would spend as much as 5 minutes rating a single candidate simply because it was tricky.
Those were the good old days, when all I had to worry about was solving textbook problems and learning math tricks. On some days it took me 8–10 hours to solve a few physics questions; that puts a man’s ego down quickly and keeps it in check. To say the least, I never discovered a pulsar. On the other hand, my friend Joey discovered a dozen of them. Yeah, way to go, Joey 😒 Regardless, here I find myself almost seven years later, looking at some pulsar data.
A pulsar is a rapidly rotating neutron star that emits electromagnetic radiation from its poles, see Figure 1. The timing of these rotations can be measured precisely and has been used to test Einstein’s theory of gravity.
In his PhD thesis, Dr. Ford describes a method, based on an artificial neural network, that obtains a recall score greater than 99% and a false positive rate of less than 2%. With these metrics at hand, we will build our own models and attempt to replicate or improve on these results. The method I will use is described in detail in an article I wrote previously for Medium. Since I developed that method on a synthetic data set, we will see whether it’s of any use in the real world.
The data set we will be using is described in Dr. Thornton’s PhD thesis and in Dr. Lyon’s publication, where he describes simple filters as well as more advanced techniques for real-time classification [2 & 3].
The data set we will use was put together by Dr. Lyon and can be found here. It consists of 91,192 pulsar candidates collected during the High Time Resolution Universe survey. Each pulsar candidate is described by the following 30 features and a class label:
- Mean of the integrated profile.
- Standard deviation of the integrated profile.
- Excess kurtosis of the integrated profile.
- Skewness of the integrated profile.
- Mean of the DM-SNR curve.
- Standard deviation of the DM-SNR curve.
- Excess kurtosis of the DM-SNR curve.
- Skewness of the DM-SNR curve.
- Best Period
- Best DM
- Best SNR
- Pulse Width
- χ² value of pulse profile and sin curve
- χ² value of pulse profile and sin² curve
- χ² value of pulse profile and Gaussian fit
- Full-width half-maximum of Gaussian fit
- χ² value of pulse profile and two Gaussians fit
- Mean full-width half-maximum of two Gaussians fit
- Offset of profile histogram from zero
- max.(profile histogram)/max.(fitted Gaussian)
- Offset of feature 19 and profile gradient histogram
- SNR_data / √((P − W_obs)/W)
- χ² value of DM-SNR curve fit
- RMS of the position of peaks in all sub-bands
- Average of correlation coefficients for all pairs of sub-bands
- Sum of correlation coefficients for all pairs of sub-bands
- Number of peaks in pulse profile
- Area under pulse profile after mean subtraction
- Class (0 = non-pulsar & 1 = pulsar)
In total there are only 1,196 pulsars, about 1.3% of the data set. Since it is known that machine learning models don’t perform well on unbalanced data sets, we will need to take care of this before we start training. So let’s get going.
We will start by importing the libraries.
Then our data.
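The original gists are not reproduced here, so below is a minimal sketch of the import-and-load step. The file name `HTRU_1.csv` is a placeholder for wherever you saved Dr. Lyon’s data, and the snippet falls back to a tiny stand-in frame with the same structure so it runs end to end without the real file:

```python
import numpy as np
import pandas as pd

# Placeholder path: point this at the downloaded HTRU candidate file.
DATA_PATH = "HTRU_1.csv"

try:
    data = pd.read_csv(DATA_PATH)
except FileNotFoundError:
    # Small stand-in with the same shape (30 features + class label)
    # so the snippet runs even when the real file is absent.
    rng = np.random.default_rng(0)
    data = pd.DataFrame(rng.normal(size=(100, 30)),
                        columns=[f"feature_{i}" for i in range(1, 31)])
    data["class"] = rng.integers(0, 2, size=100)

print(data.shape)  # (number of candidates, 31)
```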
In the next step, we create our training and test sets by randomly sampling the whole data set without replacement. The size of the test set is 25% of the original data set. The random_state parameter of the train_test_split() function has been fixed so that we always work with the same training and test sets.
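A sketch of that split, using a small synthetic frame in place of the real candidates (the column names and random_state value are illustrative):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Stand-in data with the article's structure: 30 features + class label.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(400, 30)),
                    columns=[f"feature_{i}" for i in range(1, 31)])
data["class"] = rng.integers(0, 2, size=400)

features = data.drop(columns="class")
labels = data["class"]

# test_size=0.25 reserves 25% of the candidates for evaluation;
# a fixed random_state makes the split reproducible across runs.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=42, stratify=labels)

print(X_train.shape, X_test.shape)
```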
Since we know that most of the samples belong to the not-a-pulsar class, we will need to employ a balancing strategy on the training set, or else our model will perform poorly on the test set. As a first approach, we separate the pulsar and not-a-pulsar classes in the training data. We then under-sample the not-a-pulsar class (the majority class) and concatenate the result with the pulsar class. Finally, we randomly shuffle the rows of the balanced_data Pandas DataFrame object and create the feature matrix X_train and the target vector y_train. For more details about sampling techniques, you can read this article. In short, if your data is unbalanced you really need to try all the balancing techniques; there is no way of telling in advance which one will work best.
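The under-sampling step can be sketched with plain pandas. The frame below is a stand-in with roughly the data set’s 1% pulsar fraction; with the real data you would apply the same operations to your training frame:

```python
import numpy as np
import pandas as pd

# Stand-in training frame: 10 pulsars out of 1,000 candidates.
rng = np.random.default_rng(0)
train = pd.DataFrame(rng.normal(size=(1000, 3)), columns=["f1", "f2", "f3"])
train["class"] = 0
train.loc[:9, "class"] = 1  # first 10 rows are pulsars

pulsars = train[train["class"] == 1]
non_pulsars = train[train["class"] == 0]

# Under-sample the majority class down to the minority-class size,
# then concatenate and shuffle the rows.
non_pulsars_down = non_pulsars.sample(n=len(pulsars), random_state=42)
balanced_data = (pd.concat([pulsars, non_pulsars_down])
                   .sample(frac=1, random_state=42)  # shuffle rows
                   .reset_index(drop=True))

X_train = balanced_data.drop(columns="class").values
y_train = balanced_data["class"].values
```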
Visualizing the Data
Before we attempt to fit a classifier, it’s always a good idea to visually examine your features. This is done to determine the “quality” of the features as well as to detect any anomalies that might negatively impact the performance of your classifier. To do so, we will use Seaborn to inspect pairwise relationships in our data set.
Let’s start by inspecting the histogram distributions shown on the diagonal of the pairwise plot in Figure 2. You will notice that most features do a fair job of distinguishing between the two classes; that is, the red and green distributions don’t completely overlap. Additionally, if you inspect the other subplots you will see that most features are well behaved and no outliers are present. Finally, some of these features are collinear and in most cases should be removed based on some criterion. However, upon further inspection, only 3 out of the 30 features had a Pearson correlation greater than 0.95. Therefore, we will keep all the features and won’t implement any feature selection methods.
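The collinearity check can be sketched with the correlation matrix. The frame below deliberately contains one nearly duplicated column so the filter has something to find; the 0.95 threshold matches the one used above:

```python
import numpy as np
import pandas as pd

# Stand-in feature frame with one deliberately collinear column.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(200, 3)), columns=["a", "b", "c"])
df["a_copy"] = df["a"] * 2.0 + rng.normal(scale=0.01, size=200)

# Keep only the upper triangle of the absolute Pearson correlation
# matrix so each pair is counted once; flag pairs above 0.95.
corr = df.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
high_pairs = [(r, c) for c in upper.columns for r in upper.index
              if upper.loc[r, c] > 0.95]
```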
Classifiers and Hyper-parameters
Now we can create a dictionary containing 18 different classifiers.
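The full dictionary lives in the repository; the sketch below shows its shape with a representative subset of the 18 classifiers (the key names and default settings here are illustrative):

```python
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC

# Name -> unfitted estimator; the real dictionary holds 18 entries.
classifiers = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "LSVC": LinearSVC(max_iter=5000),
    "RandomForest": RandomForestClassifier(random_state=42),
    "AdaBoost": AdaBoostClassifier(random_state=42),
}
```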
Now we define a parameter grid for each classifier.
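A sketch of the companion grid, keyed by the same names as the classifier dictionary (the grid values here are illustrative, not the ones tuned in the repository):

```python
# One hyper-parameter grid per classifier; GridSearchCV will try
# every combination of the values listed for each estimator.
param_grids = {
    "LogisticRegression": {"C": [0.01, 0.1, 1, 10]},
    "KNN": {"n_neighbors": [3, 5, 7, 9]},
    "LSVC": {"C": [0.01, 0.1, 1, 10]},
    "RandomForest": {"n_estimators": [100, 300], "max_depth": [None, 5, 10]},
    "AdaBoost": {"n_estimators": [50, 100, 200], "learning_rate": [0.1, 1.0]},
}
```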
Performance Metric, Classifier Tuning, and Evaluation
In this task, we want to uncover pulsars (the positive class) in an unbalanced data set; therefore, we will use the recall score as the metric that determines a classifier’s performance. We will also keep track of other metrics such as the false positive rate, precision, and F1 score. To learn more about these metrics, I advise you to read this excellent article.
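As a quick reminder of the two headline metrics, on a toy set of labels: recall is TP / (TP + FN), and the false positive rate is FP / (FP + TN), which scikit-learn exposes through the confusion matrix:

```python
from sklearn.metrics import confusion_matrix, recall_score

# Toy labels: 4 negatives, 2 positives; one negative is misclassified.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1]

recall = recall_score(y_true, y_pred)  # TP / (TP + FN) = 2/2 = 1.0
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
fpr = fp / (fp + tn)                   # FP / (FP + TN) = 1/4 = 0.25
```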
We will iterate through each classifier and find the point in hyper-parameter space that maximizes its sensitivity. For each iteration, we will keep track of other metrics such as the training and test recall scores, the false positive rate on the test set, and the area under the receiver operating characteristic curve (AUC) obtained from the test set. These metrics, as well as the trained classifier and its optimal hyper-parameters, will be stored in a dictionary named results. If you decide to run this, it will take about 30 minutes on a workstation similar to mine; on a 2-core CPU it may take 3–4 hours.
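The loop can be sketched as follows. To keep the snippet fast it uses a small synthetic problem and a single classifier, but the real script iterates the same way over all 18 entries; the dictionary field names are my own choices:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, recall_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Small synthetic binary problem standing in for the balanced training set.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=300) > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y)

# One entry here; the real script loops over all 18 classifiers.
classifiers = {"LogisticRegression": LogisticRegression(max_iter=1000)}
param_grids = {"LogisticRegression": {"C": [0.1, 1, 10]}}

results = {}
for name, clf in classifiers.items():
    # scoring="recall" makes the search pick the grid point that
    # maximizes sensitivity on the cross-validation folds.
    search = GridSearchCV(clf, param_grids[name], scoring="recall", cv=5)
    search.fit(X_train, y_train)
    y_pred = search.predict(X_test)
    tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
    results[name] = {
        "best_params": search.best_params_,
        "cv_recall": search.best_score_,
        "test_recall": recall_score(y_test, y_pred),
        "test_fpr": fp / (fp + tn),
        "test_auc": roc_auc_score(y_test, search.predict_proba(X_test)[:, 1]),
        "model": search.best_estimator_,
    }
```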
Visualizing the Results
Let’s start by visualizing the recall score each classifier obtained using a bar plot, see Figure 3.
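A sketch of the bar plot with Matplotlib. The scores hard-coded below are placeholder values for illustration only, not the article’s measured results; with the real run you would pull them out of the results dictionary:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

# Placeholder recall scores, one per classifier (illustrative values).
recall_scores = {"AdaBoost": 0.994, "KNN": 0.992, "LSVC": 0.991,
                 "RandomForest": 0.988, "LogisticRegression": 0.985}

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(list(recall_scores), [v * 100 for v in recall_scores.values()])
ax.set_ylabel("Recall score (%)")
ax.set_ylim(90, 100)
ax.set_title("Test-set recall per classifier")
plt.setp(ax.get_xticklabels(), rotation=45, ha="right")
fig.tight_layout()
fig.savefig("recall_scores.png")
```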
Ideally, we want our classifiers to have a recall score as close to 100% as possible, which would imply that the majority of the pulsars were identified. Our results, shown in Figure 4, demonstrate that most classifiers meet this condition. It’s hard to determine the best classifier simply by looking at the recall score; we have to visualize the false positive rate to gain more insight, see Figure 5.
Well, this tells us a lot. Remember, we want a classifier that has a recall score greater than 99% and a false positive rate lower than 2%. Figure 4 shows that 16 out of the 18 classifiers meet the recall score criterion, and Figure 5 shows that 14 out of 18 meet the false positive rate criterion. Therefore, most classifiers would do the job. However, AdaBoost, KNN, and LSVC are the best classifiers, having recall scores greater than 99% and false positive rates of 0.19%, 0.29%, and 0.32%, respectively.
We showed that, using supervised machine learning, we are able to uncover thousands of pulsars with a recall score greater than 99% and a false positive rate as low as 0.19%. Most classifiers will do the job, but the highest-performing classifier was AdaBoost. Finally, we’d like to point out that the method we implemented was developed and tested on a synthetic data set; it’s nice to see that it works well on real data. I guess I win this one, Joey. 😆 Until next time!
You can find me at LinkedIn.
Pulsar Search Using Supervised Machine Learning, PhD thesis by John M. Ford
 R. J. Lyon et al., "Fifty Years of Pulsar Candidate Selection: From simple filters to a new principled real-time classification approach", Monthly Notices of the Royal Astronomical Society, vol. 000, no. 0, pp. 0000–0000, 2015.