Studying open questions in particle physics with machine learning

Tomáš Chobola
Student Success Stories
7 min read · Sep 14, 2020

Choosing the right thesis topic can be a struggle. Approximately eighteen months ago I had to pick my bachelor thesis assignment, and my intention was to work on something new and fresh that could help at least a small group of people, not just repeat already existing processes. While browsing the offered topics, one stood out to me as a physics fan: the analysis of data from CERN with machine learning algorithms that could lead to the discovery of a new particle. Needless to say, I applied immediately. Even though physics is not the main focus of my studies, the work on the thesis was very satisfying in numerous respects and definitely enriched my understanding of physics and of scientific processes overall.

Photo by Patrick McManaman on Unsplash

Background

Looking at the technological achievements of recent history, one definitely stands out: the construction of the Large Hadron Collider (LHC) at CERN, in which scientists accelerate particles to nearly the speed of light and cause them to collide. When two particles collide, they release immense amounts of energy and scatter into the quarks and gluons of which they are made. These processes can lead to the creation of a new particle that exists only for a very brief period of time before it decays into other particles. The problem physicists face is therefore how to observe and study such particles when their lifetimes are so short.

One of the approaches is to use the detectors at the LHC to observe photons scattered during the interactions in the accelerator. The invariant mass of the observed photon pairs varies from event to event, and by analysing its distribution physicists found that it follows a smooth, falling curve, which is called the background. If the conditions for creating a new particle are met during the collisions, the photons into which the newly created particle decays inherit its mass, so the invariant mass distribution shows a peak above the background at that mass. If the peak is significant enough, it could signify the production of a new particle. (For example, this approach was applied in the search for the Higgs boson, and the process can be seen in the following figure.)
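
To make the notion of invariant mass more concrete, here is a minimal sketch (my own illustration, not code from the thesis) that reconstructs the invariant mass of a photon pair from detector-style kinematics:

```python
import numpy as np

def diphoton_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass of a pair of (massless) photons.

    In transverse momentum (pt), pseudorapidity (eta) and azimuthal
    angle (phi) coordinates, the invariant mass squared reduces to
    m^2 = 2 * pt1 * pt2 * (cosh(eta1 - eta2) - cos(phi1 - phi2)).
    """
    m2 = 2.0 * pt1 * pt2 * (np.cosh(eta1 - eta2) - np.cos(phi1 - phi2))
    return np.sqrt(np.maximum(m2, 0.0))

# Two back-to-back 62.5 GeV photons at central rapidity reconstruct
# to roughly 125 GeV, the mass of the Higgs boson.
print(diphoton_mass(62.5, 0.0, 0.0, 62.5, 0.0, np.pi))  # ~125.0
```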

Animation showing ATLAS evidence for the Higgs boson during Run 1 of the LHC. The blue line shows the background and the red peak around 125 GeV signifies the Higgs boson signal. The bars in the lower plot show the significance of measured deviations from the background (i.e. the estimated distribution without the signal). Image courtesy of ATLAS/CERN.

One of the particles currently being searched for is the axion-like particle [1], which could be produced by the photons emitted in ultra-peripheral proton collisions [2]. In these collisions the beam particles do not collide head-on, but merely come within interaction range (very close proximity) of each other, so that the clouds of photons surrounding the protons, which travel at nearly the speed of light, interact and create a new particle that later decays into another pair of photons. This interaction is called light-by-light scattering. The beam protons lose only a fraction of their energy and remain almost intact, so they can be observed by detectors placed on both sides of the interaction point.

The purpose of the thesis was to simulate what the peak above the background would look like if axion-like particle production were observed, and to analyse the measured data from CERN and model the background with machine learning methods as precisely as possible, as the current models do not achieve the desired precision.

Analysis

The work started with simulation runs using the SuperChic 3 Monte Carlo event generator [3]. Apart from the production of the axion-like particle, the production of photons, muons and electrons was simulated as well, and their observational probabilities, energies and invariant mass distributions were compared. This analysis showed how much axion-like particle production differs from already observed events. The most important observation was that with lower coupling (the strength of the ultra-peripheral interaction between the emitted photons), the peak above the background becomes sharper, and therefore more easily detectable. In other words, to observe a significant peak above the background, fewer axion-like particle production events are needed when the coupling is lower.
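
The sharpness of the peak is tied to the particle's natural decay width, which shrinks rapidly as the coupling decreases. As a rough illustration (my own sketch using a convention common in the axion-like particle literature, not a formula taken from the thesis):

```python
import numpy as np

def alp_width(g, m):
    """Decay width of an axion-like particle into two photons.

    Assumes the common convention Gamma = g^2 * m^3 / (64 * pi),
    with the coupling g in GeV^-1 and the mass m in GeV.
    """
    return g**2 * m**3 / (64 * np.pi)

# Halving the coupling quarters the width, so the resonance peak
# narrows quickly and stands out from the smooth background.
for g in [1e-3, 5e-4, 2.5e-4]:  # hypothetical couplings in GeV^-1
    print(f"g = {g:.1e} GeV^-1 -> width at 1000 GeV: {alp_width(g, 1000.0):.2f} GeV")
```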

Visualisation of a collision of two beam particles, with detectors on both sides relative to the interaction point. Image courtesy of ATLAS/CERN.

Once the events were simulated, the real experimental data from CERN was imported and validated. By applying multiple restrictions to the observed events, the dataset was reduced and three subsets corresponding to different observational strategies were extracted. The strategies depend on the beam proton energy loss that is observable by the detectors. The first strategy, called no AFP matching, places no limitations on the event selection. The strategy called A or C keeps events in which at least one of the detectors observed a proton with a relative energy loss between 2% and 10%, and A and C matching requires both protons from the event to be in that range. Understandably, these restrictions significantly reduce the number of analysed events and increase the modelling difficulty. (A small sketch of such a selection follows below.)
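
Here is a minimal selection sketch illustrating how the three subsets relate; the column names and the uniform toy data are my own assumptions, not the actual ATLAS/AFP variables:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 10_000

# Toy event table: xi_A and xi_C are the relative energy losses of the
# protons seen by the forward detectors on the two sides (A and C).
events = pd.DataFrame({
    "m_yy": rng.uniform(200.0, 2000.0, n),   # diphoton invariant mass [GeV]
    "xi_A": rng.uniform(0.0, 0.15, n),
    "xi_C": rng.uniform(0.0, 0.15, n),
})

in_A = events["xi_A"].between(0.02, 0.10)  # 2% to 10% relative energy loss
in_C = events["xi_C"].between(0.02, 0.10)

subsets = {
    "no AFP matching": events,               # no requirement on the protons
    "A or C":          events[in_A | in_C],  # at least one tagged proton
    "A and C":         events[in_A & in_C],  # both protons tagged
}

# Each added restriction shrinks the sample, making modelling harder.
for name, subset in subsets.items():
    print(f"{name:16s} {len(subset):6d} events")
```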

Based on these subsets, three backgrounds were modelled using the widely used curve fitting method, with a formula physicists had already used in previous analyses of photon backgrounds, optimised by the Levenberg–Marquardt algorithm [4,5]. Measuring the precision of the fit made it clear that the curve does not achieve the required precision, and therefore an additional regression analysis was performed with a Gaussian Process [6]. The Gaussian Process does not use a fixed parametric function to describe the data; instead it lets the data pick the functional form itself, which makes it a powerful machine learning tool. The process is formally defined as a collection of random variables, any finite subset of which has a joint Gaussian distribution. This approach is gaining popularity in particle physics and has already been applied in numerous data explorations. The results of the Gaussian Process were then compared to the standard fit and showed a significant improvement.
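
The contrast between the two approaches can be sketched on toy data. The exponential form and the RBF kernel below are stand-ins I chose for illustration; the thesis used a dedicated functional form from earlier diphoton analyses and its own kernel choices:

```python
import numpy as np
from scipy.optimize import curve_fit
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
x = np.linspace(200.0, 2000.0, 60)                      # mass bins [GeV]
y = 5000.0 * np.exp(-x / 300.0) + rng.normal(0.0, 2.0, x.size)

# Parametric fit; scipy's default method without bounds is
# Levenberg-Marquardt.
def falling(x, a, b):
    return a * np.exp(-x / b)

popt, _ = curve_fit(falling, x, y, p0=[1e3, 500.0])

# Gaussian Process regression: the kernel encodes only smoothness, so
# the data itself picks the functional form of the background.
kernel = ConstantKernel(1.0) * RBF(length_scale=200.0)
gp = GaussianProcessRegressor(kernel=kernel, alpha=4.0, normalize_y=True)
gp.fit(x[:, None], y)
y_gp, y_std = gp.predict(x[:, None], return_std=True)  # mean + uncertainty

print("squared residuals, curve fit:", np.sum((y - falling(x, *popt))**2))
print("squared residuals, GP:       ", np.sum((y - y_gp)**2))
```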

Overview of the goodness-of-fit metrics for the Gaussian Process fit and the function fit for each data subset. It is clear that the Gaussian Process strongly outperforms the standard curve fit.

The backgrounds were then used together with the observational probabilities calculated from the simulator runs to determine how many events would need to be observed on top of the background, anywhere along it, to produce a significant peak. For those event counts, the required coupling was calculated (separately for each matching strategy). As mentioned above, with lower coupling the peak becomes sharper (i.e. it is both higher and narrower for the same number of events), and therefore fewer events need to be observed, because the deviation from the background is reached more easily. This means that when searching for the hypothetical axion-like particle, with lower coupling it does not need to be produced as often in order to be potentially discovered.
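
A toy version of that logic, using a deliberately simplified significance measure rather than the thesis' actual statistical treatment:

```python
import numpy as np

def required_signal(background, z_target=5.0):
    """Smallest signal count s with s / sqrt(b) >= z_target."""
    return int(np.ceil(z_target * np.sqrt(background)))

# Hypothetical background yields under peaks of decreasing width:
# a narrower peak sits on fewer background events, so fewer signal
# events are needed to reach the same significance.
for b in [400.0, 100.0, 25.0]:
    print(f"background under peak: {b:5.0f} -> signal needed: {required_signal(b)}")
```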

The analysis showed that the best approach to detecting the particle is to combine the detection strategies based on the axion-like particle mass. For lower masses, up to 800 GeV, A and C matching should be applied. In the range between 800 GeV and 1600 GeV, A or C seems the most promising, and for even higher masses, no AFP matching appears to be the best option.

The ATLAS inner detector, which observes the decay products of the collisions. Image courtesy of ATLAS/CERN.

Conclusion

The past year and a half was very challenging, since diving into a completely different area of research required a lot of learning and understanding. However, thanks to the kind and patient approach of the members of the international ATLAS Collaboration, and mainly my supervisor, doc. Dr. André Sopczak, working on the thesis was very fulfilling. I am glad that I had the opportunity to contribute to ongoing research in particle physics, and also that my thesis was classified by CERN as relevant to their research.

The full thesis, Study of light-by-light scattering with the ATLAS Forward Proton (AFP) Detector at CERN, can be found in the official CTU digital library or at the CERN Document Server.

References

  1. Peccei R. D.; Quinn H. R. CP Conservation in the Presence of Pseudoparticles. Physical Review Letters, American Physical Society, vol. 38, n. 25, p. 1440–1443, June 1977, doi: 10.1103/PhysRevLett.38.1440.
  2. Baltz A.; Baur G.; d'Enterria D.; et al. The physics of ultraperipheral collisions at the LHC. Physics Reports, vol. 458, n. 1–3, March 2008, ISSN 0370-1573, doi: 10.1016/j.physrep.2007.12.001.
  3. Harland-Lang L. A.; Khoze V. A.; Ryskin M. G. Exclusive LHC physics with heavy ions: SuperChic 3. The European Physical Journal C, Springer Science and Business Media LLC, vol. 79, n. 1, January 2019, ISSN 1434-6052, doi: 10.1140/epjc/s10052-018-6530-5.
  4. Levenberg K. A method for the solution of certain non-linear problems in least squares. Quarterly of Applied Mathematics, American Mathematical Society, vol. 2, p. 164–168, July 1944, ISSN 1552-4485, doi: 10.1090/qam/10666.
  5. Marquardt D. W. An Algorithm for Least-Squares Estimation of Nonlinear Parameters. Journal of the Society for Industrial and Applied Mathematics, Society for Industrial and Applied Mathematics, vol. 11, n. 2, p. 431–441, June 1963, ISSN 0368-4245, doi: 10.1137/0111030.
  6. Rasmussen C. E.; Williams C. K. Gaussian Processes for Machine Learning. MIT Press, 2006, ISBN 978-0-262-18253-9.
