WEEK #6 — Modeling Earthquake Damage

Beyza Cevik
Published in bbm406f19
3 min read · Jan 8, 2020
Nepal Earthquake

We are Sercan Amac, Mert Cokelek, Beyza Cevik. This is our sixth blog post about Modeling Earthquake Damage.

This Week in Modeling Earthquake Damage

In the previous blog post, we trained a simple neural net. However, getting 56.7% accuracy from a neural network while kNN got 71.33% was not satisfying, so we set out to improve the results by analyzing why the neural net performed so much worse than a simple algorithm like kNN. After solving the problem, we reached an accuracy of 68.46%. We also tried a different loss function, Focal Loss, with different parameters, yet it did not increase the accuracy.

So we switched our strategy to XGBoost, a very popular boosting method in competitions, and it reached 74% accuracy.

Always Use Normalized Features!

After trying many hyper-parameters, different layer dimensions, etc., the result was always around 57%. Suspicious, right? It looked like the model was stuck in a very bad local optimum!

So we checked the answers and guess what?

The model predicted "medium damage" for every single building in the dataset, and medium damage is the majority class. The lesson: if you don't normalize your features, your model can collapse into this kind of degenerate fit, always predicting the majority class.

How Did We Train Our Neural Network?

We started experimenting with the Adam optimizer, cross-entropy (CE) loss, and a StepLR learning-rate scheduler. We split the dataset into 80% for training and 20% for validation (test). We did not include a separate third test split, since we needed to compare against the other methods on the same data. We used leaky_relu as the activation function between layers.
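The setup above can be sketched in PyTorch roughly as follows. The layer sizes, learning rate, scheduler step, and synthetic data are all illustrative assumptions, not our actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# A small MLP with leaky_relu between layers; layer widths are hypothetical.
class DamageNet(nn.Module):
    def __init__(self, in_dim, n_classes=3):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, 64)
        self.fc2 = nn.Linear(64, 32)
        self.out = nn.Linear(32, n_classes)

    def forward(self, x):
        x = F.leaky_relu(self.fc1(x))
        x = F.leaky_relu(self.fc2(x))
        return self.out(x)  # raw logits; CrossEntropyLoss applies softmax

torch.manual_seed(0)
X = torch.randn(200, 10)                                 # synthetic features
y = (X[:, 0] > 0).long() + (X[:, 1] > 0).long()          # grades 0, 1, 2

split = int(0.8 * len(X))                                # 80/20 split
model = DamageNet(10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)
criterion = nn.CrossEntropyLoss()

losses = []
for epoch in range(50):
    optimizer.zero_grad()
    loss = criterion(model(X[:split]), y[:split])        # full-batch CE loss
    loss.backward()
    optimizer.step()
    scheduler.step()                                     # decay lr every 20 epochs
    losses.append(loss.item())
```

StepLR halves the learning rate every `step_size` epochs here, which helps the later epochs settle instead of oscillating.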

Accuracy over the number of epochs (orange: validation, blue: training).
Loss over the number of epochs (orange: validation, blue: training).

Results of Training

The best setting was the fifth row. We calculated the F1 score of the model with the lowest loss: the best model scored 67.63. We then trained another model with the same setting using Focal Loss with the gamma parameter set to 2; its F1 score was 67.13. We tried many different settings, sampling strategies, etc., but the result did not improve. We believe the reason is that the data is quite noisy, since it was collected from people after the earthquake. Finally, we tried XGBoost and got an F1 score of 73.97.
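For reference, Focal Loss down-weights well-classified examples so that training focuses on the hard ones: FL(p_t) = -(1 - p_t)^gamma * log(p_t). A minimal multi-class sketch in PyTorch (a generic implementation, not our exact training code):

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Multi-class focal loss: FL(p_t) = -(1 - p_t)^gamma * log(p_t).
    With gamma=0 it reduces to standard cross-entropy."""
    log_p = F.log_softmax(logits, dim=1)
    log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)  # log-prob of true class
    pt = log_pt.exp()
    return (-(1.0 - pt) ** gamma * log_pt).mean()
```

Because the modulating factor (1 - p_t)^gamma is at most 1, the focal loss never exceeds plain cross-entropy; raising gamma shrinks the contribution of easy, confidently classified buildings.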

See you in our last blog post!
