Complexification of neural networks NOT helping to predict earthquakes
or regarding the recent AI overhype in Applied Science
In the last few years, deep learning has solved seemingly intractable problems, raising hopes of finding approximate solutions to problems currently considered unsolvable. Earthquake prediction, the Grail of Seismology, is, in this context of continuous exciting discoveries, an obvious candidate for deep learning exploration. The artificial neural network (ANN), shallow or deep, is rapidly rising as one of the most powerful go-to techniques not only in data science [LeCun et al., 2015; Jordan and Mitchell, 2015] but also for solving hard and intractable problems of Physics (e.g., the many-body problem [Carleo and Troyer, 2017], chaotic systems [Pathak et al., 2018], high-dimensional partial differential equations [Han et al., 2018]). This is justified by the superior performance of ANNs in discovering complex patterns in very large datasets, with the advantage of not requiring feature extraction or engineering: data can be used directly to train the network, with potentially great results. It comes as no surprise that machine learning at large — including ANNs — has become popular in Statistical Seismology [Kong et al., 2019] and gives fresh hope for earthquake prediction [Rouet-Leduc et al., 2017; DeVries et al., 2018].
The earthquake prediction AI buzz
There has been some media euphoria in recent months about AI supposedly predicting earthquakes. Below are some examples illustrating this buzz:
- New York Times, ‘A.I. Is Helping Scientists Predict When and Where the Next Big Earthquake Will Be’
- BBC News, ‘Chasing quakes with machine learning’
- Fortune, ‘Why We’ll Want AI to Help Us When the Big One Arrives’
- USA Today, ‘Harvard working with Google on AI to predict earthquake aftershocks’
- Nature News, ‘Artificial intelligence nails predictions of earthquake aftershocks’
- The Harvard Gazette, ‘Examining aftershocks with AI’
- Google AI Blog, ‘Forecasting earthquake aftershock locations with AI-assisted science’
- Science Alert, ‘This New AI Tool Could Solve a Deadly Earthquake Problem We Currently Can’t Fix’
- Science News, ‘Artificial intelligence could improve predictions for where quake aftershocks will hit’
- Phys.org, ‘OK computer: How AI could help forecast quake aftershocks’
- Futurism, ‘GOOGLE’S AI CAN HELP PREDICT WHERE EARTHQUAKE AFTERSHOCKS ARE MOST LIKELY’
- The Verge, ‘Google and Harvard team up to use deep learning to predict earthquake aftershocks’
All the aforementioned articles relate to the 2018 Nature paper by DeVries et al. titled ‘Deep learning of aftershock patterns following large earthquakes’. Note that the NYT article also mentioned the 2017 GRL paper by Rouet-Leduc et al. (Paul Johnson’s group), titled ‘Machine learning predicts laboratory earthquakes’, in which Random Forest (RF) was used. Interestingly, that work led to the first ever Kaggle competition on (lab)quake prediction and is now buzzing again in the mediasphere (as of October 2019), with the release of new articles arguing that their RF method seems to also apply to real earthquakes. Since the present Medium article is about ANNs, I will not further discuss the work of Johnson’s group here (but stay tuned for a follow-up article on the RF models, which will include a discussion of the Kaggle competition results, by following ‘The History, Risk & Forecast of Perils’).
So the question is: is all this buzz warranted? The answer is NO, as I will show below.
Am I qualified to argue on the matter? I will let readers judge for themselves. I have had an interest in earthquake predictability since my PhD (2003–2006) and have written a dozen articles on the subject, including an invited review article [Mignan, 2011]. I am optimistic about improving earthquake predictability: there is so little we know, and thus so much still to learn, that a breakthrough remains possible (e.g., ‘Don’t say the P word? New horizons in earthquake Prediction’ back in 2013). However, we must also learn from the past and recognise that every phase of optimism in earthquake prediction has invariably been followed by an “earthquake prediction winter”. So let us review the current status of research in ANN-based earthquake prediction, looking not only at the latest publications but at the entire literature on the topic, dating back to 1994.
What follows is based on the following developments: (1) my Coursera/IBM capstone project (Jan. 2019), which led me to revisit DeVries et al. [2018] (see the YouTube video); (2) the presentation of the results at the 15th International Work-Conference on Artificial Neural Networks (IWANN 2019), leading to [Mignan and Broccardo, 2019a]; (3) the realisation, with my colleague Marco Broccardo, that we could further simplify the analysis, ultimately leading to a Nature Matters Arising paper [Mignan and Broccardo, 2019b]; and finally (4) an analysis of the entire ANN-based earthquake prediction literature [Mignan and Broccardo, arXiv], which is summarised here.
The complexification of ANN studies for earthquake prediction over the period 1994–2019
I compiled a comprehensive corpus of 77 articles, spanning 1994 to 2019, on the topic of ANN-based earthquake prediction. Although a few references may have been missed, the survey can be considered complete enough to investigate emerging trends via a meta-analysis. The full database, DB_EQpred_NeuralNets_v1.json, is available on the GitHub repo hist_eq_pred (note that other databases will be added there over time, based on various earthquake precursor meta-analyses [Mignan, 2011; 2014; 2019]; everyone is welcome to contribute!).
The left figure shows the annual rate of publications over time, which indicates a progressive increase in the number of studies on this topic. Only in the past ten years have high-impact papers emerged, in terms of citation counts and journal impact factor [Panakkat and Adeli, 2007; Adeli and Panakkat, 2009; Reyes et al., 2013; Asim et al., 2017; DeVries et al., 2018].
A short history of ANN-based earthquake prediction
ANNs were introduced in Seismology as early as 1990 [Dowla et al., 1990], only four years after the seminal back-propagation article of Rumelhart, Hinton & Williams [Rumelhart et al., 1986]. The earliest attempts to apply ANNs to earthquake prediction date back, to the best of my knowledge, to 1994 [Aminzadeh et al., 1994; Lakkos et al., 1994]. Few studies followed over the next few years. The first deep neural network (DNN), with two hidden layers, was proposed in 2002, and the first recurrent neural network (RNN) in 2007. Panakkat and Adeli [2007] provided the first comprehensive work on ANN applications to earthquake prediction, comparing three types of neural networks: a radial basis function (RBF) neural network, a DNN and an RNN. Highly parameterised deep learning articles emerged in 2018, with a 6-hidden-layer DNN [DeVries et al., 2018] and a convolutional neural network (CNN) made of 3 convolutional layers [Huang et al., 2018]. These two studies are sketched in the next figure.
Without going into further detail (see Mignan and Broccardo [arXiv] for more), what we observe from the corpus (check for yourself by exploring DB_EQpred_NeuralNets_v1.json) is a progressive complexification of ANN models over time, towards deep learning. We find an increase in the number of hidden layers in fully connected feed-forward networks, up to the extreme case of a 6-hidden-layer DNN. Across all types of ANNs, we also find a trend towards more complex architectures, with Long Short-Term Memory (LSTM) networks and CNNs used since 2017.
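This trend can be checked programmatically. A minimal sketch (the field names `year` and `n_hidden_layers` are assumptions on my part, not necessarily the actual schema of the database):

```python
from collections import defaultdict

def deepest_per_year(corpus):
    """Maximum number of hidden layers reported per publication year."""
    by_year = defaultdict(int)
    for study in corpus:
        year = study["year"]
        by_year[year] = max(by_year[year], study.get("n_hidden_layers", 0))
    return dict(sorted(by_year.items()))

# With the real database (requires `import json`):
#   corpus = json.load(open("DB_EQpred_NeuralNets_v1.json"))
# Toy records standing in for the corpus, illustrating the reported trend:
toy = [
    {"year": 1994, "n_hidden_layers": 1},
    {"year": 2002, "n_hidden_layers": 2},
    {"year": 2018, "n_hidden_layers": 6},
]
print(deepest_per_year(toy))  # {1994: 1, 2002: 2, 2018: 6}
```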
Occam’s razor, null-hypothesis testing, baseline models, first principles, prior dataset, and the like
Is this ANN complexification useful? Not so much…
Virtually all published studies in the corpus claim some positive results, with ANN models able to provide “good” earthquake predictability. The metrics considered in the literature derive from the true positive TP, true negative TN, false positive FP and false negative FN counts of the confusion matrix. They are mainly the true positive rate TPR = TP/(TP+FN) (also known as sensitivity or recall), the true negative rate TNR = TN/(TN+FP) (also known as specificity) and the R-score, defined as TP/(TP+FN) - FP/(TN+FP) = TPR + TNR - 1 (also known as the True Skill Score, TSS). Results vary significantly between studies, but R-scores greater than zero suggest some predictive power. The gain of using ANNs instead of simpler methods remains unclear, since performance is compared to a baseline in only 47% of cases. Of those, only 22% use a baseline such as a Poisson null hypothesis or randomised data (although earthquakes are known to naturally cluster). The remaining 78% mostly compare ANN results to results obtained by other machine learning methods (e.g., SVM, trees, Naive Bayes…).
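For reference, these metrics are simple functions of the confusion matrix counts; a minimal sketch:

```python
# Metrics used in the ANN earthquake prediction literature,
# computed from confusion matrix counts.

def tpr(tp, fn):
    """True positive rate (sensitivity, recall)."""
    return tp / (tp + fn)

def tnr(tn, fp):
    """True negative rate (specificity)."""
    return tn / (tn + fp)

def r_score(tp, tn, fp, fn):
    """R-score (True Skill Score): TPR + TNR - 1.
    Expected value 0 for a no-skill classifier, 1 for a perfect one."""
    return tpr(tp, fn) + tnr(tn, fp) - 1

print(r_score(tp=8, tn=80, fp=10, fn=2))  # ≈ 0.689
```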
There are some empirical laws of seismicity with minimal ability to “predict” earthquakes: the Gutenberg-Richter (exponential) law, roughly saying that if an earthquake of magnitude m is observed, there is a 10% probability of an m+1 earthquake in the same space-time window; and the Omori-Utsu (power) law, roughly saying (combined with Båth’s law) that if an earthquake of magnitude m occurs, there is a 6% probability of a similar earthquake occurring soon after the first (see the discussion in the YouTube video comment section). Do the existing ANN models do better than those well-known laws?
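The quoted figures follow from back-of-the-envelope arithmetic on the Gutenberg-Richter law; a minimal sketch, assuming (my assumption, for illustration) a b-value of 1, a common global average:

```python
# Back-of-the-envelope probabilities from the Gutenberg-Richter (GR) law.

def p_larger_event(dm, b=1.0):
    """GR law: relative likelihood of an event dm magnitude units
    larger than an observed one, for b-value b."""
    return 10 ** (-b * dm)

print(p_larger_event(1.0))  # 0.1, the ~10% figure for an m+1 event

# One plausible reading of the ~6% figure: Bath's law puts the largest
# aftershock on average ~1.2 magnitude units below the mainshock, and
print(p_larger_event(1.2))  # ≈ 0.063, i.e. ~6%
```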
A simple empirical law to “predict” earthquake magnitudes
There are some hints peppered through the corpus that published ANN models only encode what we already know: the importance of Gutenberg-Richter (GR) and Omori-Utsu (OU) features over other parameters was demonstrated by Martínez-Álvarez et al. [2013], and Reyes et al. [2013] stated that their ANN was “capable of indirectly learning OU and GR laws.”
I created a simple model based on the GR law (a.), hence simply a dataset prior, and applied it to simulations of the normal (clustered) behaviour of seismicity (b.; see Mignan & Broccardo [arXiv] for details). I then estimated the probability of an earthquake of magnitude greater than mth occurring in the next time window Δ, based on seismicity observed in the past, during nΔ. Δr = 1/n represents the ratio between the prediction window Δ and the training window nΔ. Results are given in the left figure for the TPR (c.) and the R-score (d.). Both the values and the trends are similar to those published in the literature for ANN models. The works obtaining greater R-scores were all based on very small test sets (down to sometimes 2 samples, with R-scores up to 1). Simulating such undersampling, the maximum possible R-score ranges from -0.3 to 1.0 and the minimum possible R-score from -0.4 to 1.0. This suggests that the stochasticity of the process and the rarity of large events, combined with undersampling, can lead to any possible metric result. Recall the 2018 CNN I mentioned earlier? To all appearances, also a case of undersampling.
This does not, per se, reject the conclusions of the published ANN-based earthquake prediction studies, only that new tests should be undertaken to validate or dismiss claims of machine learning models beating simple earthquake statistics. As Carl Sagan once said, “extraordinary claims require extraordinary evidence”.
1 neuron baseline (2 parameters) versus a DNN (13,000+ parameters)
A very deep ANN can be interpreted as a model of high abstraction. In computer vision, for instance, a first layer may represent simple shapes; a second layer, parts of a face (such as an eye, nose or ear); and a third layer, different faces. When aftershock patterns are predicted by a 6-hidden-layer DNN, as in DeVries et al. [2018], it captivates the collective imagination as to the degree of abstraction that seismicity patterns carry. This may explain the media euphoria described above. It is, unfortunately, misleading. Using the same 12 stress component features as DeVries et al. [2018], I demonstrated (during my IBM capstone project) that a simpler DNN (12–8–8–1) or a shallow network (12–30–1) led to similar performance, with similar prediction maps and AUC (which finally led to Mignan and Broccardo [2019a], in which Marco Broccardo developed the physical interpretability of ANNs in the context of a stress tensor; I will not go into that topic here, to keep it short). We then figured out that a 1–1–1 “ANN”, just one neuron or logistic regression (with 2 free parameters and a single stress metric as feature), gave similar results to the 13,000+-parameter DNN model that had been so widely publicised in the media (figure below).
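To make the "2 free parameters" concrete, here is a minimal sketch (not the published code; the data are synthetic and the single feature is a stand-in for the stress metric) of a one-neuron logistic regression trained by gradient descent:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_one_neuron(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent on the
    cross-entropy loss: a '1-1-1 network' with only 2 free parameters."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the cross-entropy
            gw += err * x / n
            gb += err / n
        w -= lr * gw
        b -= lr * gb
    return w, b

# Synthetic stand-in data: the label is 1 when a single (hypothetical)
# stress-like feature exceeds a noisy threshold.
random.seed(1)
xs = [random.uniform(-3, 3) for _ in range(500)]
ys = [1 if x + random.gauss(0, 0.5) > 0 else 0 for x in xs]
w, b = fit_one_neuron(xs, ys)
acc = sum((sigmoid(w * x + b) > 0.5) == bool(y)
          for x, y in zip(xs, ys)) / len(xs)
print(f"w={w:.2f}, b={b:.2f}, training accuracy={acc:.2f}")
```

With two parameters there is no abstraction hierarchy to speak of; the model simply thresholds its single input, which is the point of the baseline comparison.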
Our preprint ‘One neuron is more informative than a deep neural network for aftershock pattern forecasting’, announced in April 2019, led to some interesting discussions on Reddit Machine Learning two months later:
- Misuse of Deep Learning in Nature Journal’s Earthquake Aftershock Paper
- One neuron is more informative than a deep neural network for aftershock pattern forecasting (TL;DR AUC of 2 parameter model = AUC of 13,451 parameter model)
This online discussion came about thanks to data scientist Rajiv Shah, who had criticised another aspect of the DeVries et al. [2018] study: data leakage. His exchange with Nature is available on his GitHub. His Medium article, ‘Stand Up for Best Practices’, was the trigger for the Reddit threads. I highly recommend scanning through those threads! Some comments are more pertinent than those one often hears at conferences.
After several months of waiting for Nature’s decision, we finally obtained acceptance of our paper in early July [Mignan and Broccardo, 2019b]. Connecting those results to the GR-law prior I proposed in Mignan and Broccardo [arXiv], Marco Broccardo noticed that the logistic regression can be rewritten as a power law, in agreement with the known empirical law of aftershock decay in space [e.g., Mignan, 2018]!
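A sketch of that observation (with one assumption on my part, for illustration: the single input feature is the logarithm of a stress metric s):

```latex
% One logistic neuron on x = ln(s):
p(s) = \sigma(\beta_0 + \beta_1 \ln s)
     = \frac{e^{\beta_0}\, s^{\beta_1}}{1 + e^{\beta_0}\, s^{\beta_1}}
     \approx e^{\beta_0}\, s^{\beta_1} \qquad \text{for } p(s) \ll 1
```

That is, for rare events the 2-parameter model reduces to a power-law decay in s, of the same form as the empirical spatial decay of aftershocks.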
Be it a dataset prior (using known empirical laws of seismicity) or a logistic regression baseline (which mimics a known empirical law of seismicity), the results obtained are as good as, if not better than, published scores obtained by artificial neural networks to “predict” earthquakes. While we cannot reject a number of published results, since we do not have access to the original data, we can still conclude that ANNs so far do not seem to provide new insights into earthquake predictability, as they offer no convincing evidence that their models surpass simple empirical laws [Mignan and Broccardo, 2019b; Mignan and Broccardo, arXiv].
It is reassuring to see the positive feedback from the machine learning community, which continues in the latest Reddit-ML thread ‘One neuron versus deep learning in aftershock prediction’. The AI overhype is already addressed elsewhere, such as in the excellent articles by Sculley et al. [2018], Lipton and Steinhardt [2019] and Riley [2019]. Our work is just one example, here regarding ANN-based earthquake prediction. But we should not be surprised by the lack of success of ANNs for earthquake prediction. After all, virtually all publications on the topic use structured, tabular data (mainly earthquake catalogues). The corpus (DB_EQpred_NeuralNets_v1.json) tells us that the size of the input layer in those studies varies from 2 to 94 neurons, with a median of 7 and a mean of 10 (excluding the only CNN case). It is evident that, in such a setting, simpler models can do the job…
References

Mignan A, Broccardo M (2019a), A Deeper Look into ‘Deep Learning of Aftershock Patterns Following Large Earthquakes’: Illustrating First Principles in Neural Network Physical Interpretability. In: Rojas I et al. (eds), IWANN 2019, Lecture Notes in Computer Science, 11506, 3–14, doi: 10.1007/978-3-030-20521-8_1
Mignan A, Broccardo M (2019b), One neuron versus deep learning in aftershock prediction. Nature, 574, E1–E3, doi: 10.1038/s41586-019-1582-8 (author’s link, no paywall!)
Mignan A, Broccardo M, Neural Network Applications in Earthquake Prediction (1994–2019): Meta-Analytic Insight on their Limitations. arXiv: 1910.01178
Adeli H, Panakkat A (2009), A probabilistic neural network for earthquake magnitude prediction. Neural Networks, 22, 1018–1024, doi: 10.1016/j.neunet.2009.05.003
Aminzadeh F, Katz S, Aki K (1994), Adaptive Neural Nets for Generation of Artificial Earthquake Precursors. IEEE Transactions on Geoscience and Remote Sensing, 32, 1139–1143, doi: 10.1109/36.338361
Carleo G, Troyer M (2017), Solving the quantum many-body problem with artificial neural networks. Science, 355, 602–606, doi: 10.1126/science.aag2302
DeVries PMR, Viégas F, Wattenberg M, Meade BJ (2018), Deep learning of aftershock patterns following large earthquakes. Nature, 560, 632–634, doi: 10.1038/s41586-018-0438-y
Dowla FU, Taylor SR, Anderson RW (1990) Seismic discrimination with artificial neural networks: Preliminary results with regional spectral data. Bull. Seismol. Soc. Am., 80, 1346–1373
Han J, Jentzen A, E W (2018), Solving high-dimensional partial differential equations using deep learning. PNAS, 115, 8505–8510, doi: 10.1073/pnas.1718942115
Huang JP, Wang XA, Zhao Y, Xin C, Xiang H (2018), Large earthquake magnitude prediction in Taiwan based on Deep Learning neural network. Neural Network World, 28, 149–160, doi: 10.14311/NNW.2018.28.009
Jordan MI, Mitchell TM (2015), Machine learning: Trends, perspectives, and prospects, Science, 349, 255–260, doi: 10.1126/science.aaa8415
Lakkos S, Hadjiprocopis A, Comley R, Smith P (1994), A neural network scheme for earthquake prediction based on the Seismic Electric Signals. In: Proceedings of the IEEE conference on neural networks and signal processing, Ermioni, 681–689, doi: 10.1109/NNSP.1994.365997
LeCun Y, Bengio Y, Hinton G (2015), Deep learning. Nature, 521, 436–444, doi: 10.1038/nature14539
Lipton ZC, Steinhardt J (2019), Troubling Trends in Machine Learning Scholarship. acmqueue, 17, 1–33
Martínez-Álvarez F, Reyes J, Morales-Esteban A, Rubio-Escudero C (2013), Determining the best set of seismicity indicators to predict earthquakes. Two case studies: Chile and the Iberian Peninsula. Knowledge-Based Systems, 50, 198–210, doi: 10.1016/j.knosys.2013.06.011
Mignan A (2011), Retrospective on the Accelerating Seismic Release (ASR) hypothesis: Controversy and new horizons. Tectonophysics, 505, 1–16, doi: 10.1016/j.tecto.2011.03.010
Mignan A (2014), The debate on the prognostic value of earthquake foreshocks: A meta-analysis. Sci. Rep., 4, 4099, doi: 10.1038/srep04099
Mignan A (2018), Utsu aftershock productivity law explained from geometric operations on the permanent static stress field of mainshocks. Nonlin. Processes Geophys., 25, 241–250, doi: 10.5194/npg-25-241-2018
Mignan A (2019), A preliminary text classification of the precursory accelerating seismicity corpus: inference on some theoretical trends in earthquake predictability research from 1988 to 2018. J. Seismol., 23, 771–785, doi: 10.1007/s10950-019-09833-2
Panakkat A, Adeli H (2007), Neural network models for earthquake magnitude prediction using multiple seismicity indicators. Int. J. Neural Systems, 17, 13–33, doi: 10.1142/S0129065707000890
Pathak J, Hunt B, Girvan M, Lu Z, Ott E (2018), Model-Free Prediction of Large Spatiotemporally Chaotic Systems from Data: A Reservoir Computing Approach. Phys. Rev. Lett., 120, 024102, doi: 10.1103/PhysRevLett.120.024102
Reyes J, Morales-Esteban A, Martínez-Álvarez F (2013), Neural networks to predict earthquakes in Chile. Applied Soft Computing, 13, 1314–1328, doi: 10.1016/j.asoc.2012.10.014
Riley P (2019), Three pitfalls to avoid in machine learning. Nature, 572, 27–29
Rouet-Leduc B, Hulbert C, Lubbers N, Barros K, Humphreys CJ, Johnson PA (2017), Machine learning predicts laboratory earthquakes. Geophys. Res. Lett., 44, 9276–9282, doi: 10.1002/2017GL074677
Rumelhart DE, Hinton GE, Williams RJ (1986), Learning representations by back-propagating errors. Nature, 323, 533–536
Sculley D, Snoek J, Rahimi A, Wiltschko A (2018), Winner’s Curse? On Pace, Progress, and Empirical Rigor. In: Proceedings of the 6th International Conference on Learning Representations, Workshop Track