[Weeks 1–2] — The Coding Period has begun

Juan Redondo
Jun 9 · 4 min read

These first two weeks have been as busy as they have been productive. On the one hand, I have my college exams going on, as well as the submission and defense of my bachelor’s thesis, which will take place in a few days and will (I hope) mark the end of my undergraduate studies as a physicist. On the other hand, the coding phase started two weeks ago, so I have been working on my code since then. Once I have fulfilled all my university-related commitments, I expect to start working more intensively on the GSoC project.

Progress achieved during this period:

During the Application and Community Bonding phases I had already implemented a proof-of-concept framework for enabling Tree Parzen Estimator (TPE) based Bayesian optimization using hyperopt; you can find it here. The included code is outlined below:

  1. optimizer.py: defines the optimizer class.

  2. bayesian_tpe.py — included functions:

    • modify_optimizable_params: modifies the CTLearn configuration file.

  3. common.py — included functions:

    • get_pred_metrics: computes the metrics of the prediction set.

  4. opt_config.yml: optimization configuration file.

As I discussed with my mentor Daniel at our last meeting, the work carried out during these first two weeks of the Coding Period has been the following:

  • Enabling the optimization of a custom user-defined metric, instead of being limited to just the accuracy or the area under the curve (AUC). The user can now optimize any combination of the accuracy and the AUC on the validation set, for example their average. Moreover, if the set to be optimized is the training set, the user has access to the y_true and y_pred labels generated by the classifier, and can therefore take advantage of the sklearn.metrics module to optimize any of the available metric scores, for example precision_score or f1_score. To do so, the user only has to write the expression of the custom metric to be optimized in the corresponding setting of the opt_config.yml file.
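The mechanism can be sketched as follows. This is a hedged illustration, not the exact CTLearn implementation: the expression string, the metric names exposed to it, and the toy labels are all assumptions made for the example.

```python
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# Expression as a user might write it in opt_config.yml, e.g.
#   custom_metric: "0.5 * acc + 0.5 * auc"
expression = "0.5 * acc + 0.5 * auc"

# Toy labels and scores standing in for the classifier's output.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 1, 1, 1]
y_score = [0.2, 0.9, 0.8, 0.6, 0.7]

# Precompute the named metrics the expression is allowed to combine.
metrics = {
    "acc": accuracy_score(y_true, y_pred),
    "auc": roc_auc_score(y_true, y_score),
    "f1": f1_score(y_true, y_pred),
}

# Evaluate the user expression with only the metric names in scope.
value = eval(expression, {"__builtins__": {}}, metrics)
print(value)  # 0.5 * 0.8 + 0.5 * 1.0 = 0.9
```

Restricting the evaluation namespace to the metric names keeps the expression from touching anything else, which is one simple way to make such user-supplied formulas safer.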

At the same time, I have started to perform optimization runs. So far I have tried to optimize the single-tel model of CTLearn; I hope to deal with the cnn-rnn model in the future as well, but the latter is much more time-consuming. I have tested three types of telescopes from the Cherenkov Telescope Array (LST, SSTC and MSTN), with the AUC as the optimized metric. The improvements achieved have been heterogeneous; the results are shown below:

Goals for the next weeks:

  • First of all, I have to put the finishing touches on my code and create a PR so that my mentors can start reviewing it. I plan to do this in the next few days.

So far I am ahead of schedule, and I hope to keep it that way.