These first two weeks have been as busy as they have been productive. On the one hand, I have my college exams going on, as well as the submission and defense of my bachelor’s thesis, which will take place in a few days and will (I hope) mark the end of my undergraduate studies as a physicist. On the other hand, the coding phase started two weeks ago, so I have been working on my code since then. Once I have fulfilled all my university-related commitments, I expect to start working more intensively on the GSoC project.
Progress achieved during this period:
I had already implemented a proof-of-concept framework for Tree Parzen Estimator (TPE) based Bayesian optimization using hyperopt during the Application and Community Bonding phases; you can find it here. The included code is listed below:
1. optimizer.py: defines the optimizer class.
2. bayesian_tpe.py — included functions:
- modify_optimizable_params: CTLearn configuration file modifier.
- create_space_params: hyperparameter space creator following hyperopt syntax.
- objective: objective function implementing the hyperopt input/output workflow.
3. common.py — included functions:
- get_pred_metrics: get prediction set metrics.
- get_val_metrics: get validation set metrics.
- set_initial_config: sets the basic configuration and the non-optimizable hyperparameters.
- train: run CTLearn network training.
- predict: run CTLearn network prediction.
4. opt_config.yml: optimization configuration file.
As discussed with my mentor Daniel at our last meeting, the work carried out during these first two weeks of the Coding Period has been the following:
- Enabling the optimization of a custom user-defined metric, instead of limiting ourselves to just the accuracy or the area under the curve (AUC). Now the user can optimize any combination of the accuracy and the AUC on the validation set, for example their average. Also, if the set to be optimized is the training set, the user has access to the y_true and y_pred labels generated by the classifier, so they can take advantage of the sklearn.metrics module to optimize any of the available metric scores, for example precision_score or f1_score. To do so, all the user has to do is write the expression of the custom metric to be optimized in the pertinent setting of the opt_config.yml file.
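A minimal sketch of how such a user-supplied metric expression can be evaluated (the eval-based mechanism and the variable names are assumptions for illustration; the actual framework code may differ):

```python
from sklearn.metrics import f1_score

# Metrics computed on the validation set (values here are made up)
metrics = {"accuracy": 0.91, "auc": 0.95}

# Expression the user would write in opt_config.yml (hypothetical setting)
custom_metric_expr = "(accuracy + auc) / 2"

# Evaluate the expression with the metrics as the only available names
score = eval(custom_metric_expr, {"__builtins__": {}}, metrics)
print(score)  # 0.93

# With access to y_true / y_pred, any sklearn.metrics score can be used
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
score_f1 = eval("f1_score(y_true, y_pred)",
                {"__builtins__": {}, "f1_score": f1_score},
                {"y_true": y_true, "y_pred": y_pred})
```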
- Until now, the hyperparameters available for optimization were hardcoded in the framework, so the user could only choose from the following list: [layer1_filters, layer2_filters, layer3_filters, layer4_filters, layer1_kernel, layer2_kernel, layer3_kernel, layer4_kernel, pool_size, pool_strides, optimizer_type, base_learning_rate, adam_epsilon, cnn_rnn_dropout]. To improve the usability and configurability of the optimization framework, I have written two functions. The first takes a list of keys and integers as input and creates a nested dictionary, which can also contain nested lists; the second takes the same list plus a value to be set in the dictionary. In this way the user can provide a list of keys that point to a hyperparameter value, and the framework can change that value during the optimization process.
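A possible implementation of these two helpers (function names and the example key path are my own, hypothetical, for illustration): string keys are interpreted as dictionary keys, and integers as list indices.

```python
def _new_container(key):
    # A string key needs a dict level; an integer key needs a list
    # long enough to hold that index (padded with None).
    return {} if isinstance(key, str) else [None] * (key + 1)

def create_nested_item(keys):
    """Build an empty nested dict/list skeleton from a key path."""
    root = _new_container(keys[0])
    node = root
    for key, nxt in zip(keys, keys[1:]):
        node[key] = _new_container(nxt)
        node = node[key]
    return root

def set_nested_item(item, keys, value):
    """Follow the key path inside a nested structure and set the value."""
    node = item
    for key in keys[:-1]:
        node = node[key]
    node[keys[-1]] = value
    return item

# Hypothetical path into a CTLearn-style configuration
keys = ["Model", "layers", 1, "filters"]
skeleton = create_nested_item(keys)
config = set_nested_item(skeleton, keys, 64)
print(config)  # {'Model': {'layers': [None, {'filters': 64}]}}
```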
- I have rewritten the hyperparameter space creator following hyperopt syntax and enabled the optimization of conditional hyperparameter spaces by using the hp.choice function from hyperopt. This is particularly useful for optimizing the number of layers in our neural network.
- Besides this, I have been commenting and documenting my code properly, and making minor changes to improve its usability.
At the same time, I have started to perform optimization runs. So far I have tried to optimize the single-tel model of CTLearn; I hope to deal with the cnn-rnn model in the future as well, but the latter is much more time consuming. I have tested three types of telescopes from the Cherenkov Telescope Array: LST, SSTC and MSTN; the optimized metric has been the AUC. The improvements achieved have been heterogeneous; the results are shown below:
Next weeks' goals:
- First of all, I have to give the final touches to my code and create a PR so my mentors can start reviewing it. I plan to do this in the next few days.
- Begin to implement a Gaussian Process based Bayesian optimization option, for which I am taking a look at the skopt library.
- Continue performing optimization runs; right now I am dealing with the MSTF telescope.
- I have found a very interesting package called ray, which allows parallel optimization. I will have to study it in depth.
So far I am ahead of schedule, and I hope to continue this way.