Marianne Hoogeveen, from Physics via Mathematics to Machine Learning
Marianne implemented the NIPS 2017 paper “Concentration of Multilinear Functions of the Ising Model with Applications to Network Data”, and is a winner of the Global NIPS Paper Implementation Challenge. See her code implementation here.
Tell us a little about yourself?
My interests have meandered from physics and astronomy, which I studied at university, via applied mathematics, in which I did a PhD, to machine learning and data science.
In fact, all these things have something in common: bending the imagination by carefully studying reality, whether through complex physical models, or through finding complex relations in data.
Currently, I work as a data scientist at Arena, a data-analytics company for healthcare organisations, where we use data to predict which candidate will be more suitable for a particular job. Our clients are large healthcare organisations, for which staff turnover has always been a pressing problem. We screen candidates, and have found that what makes a good employee in one role, at one location, may not be the same elsewhere.
How did you get started in AI?
During my PhD in Applied Mathematics, I got interested in machine learning, and I did an internship at a data-driven insurance analytics startup. Coming from a physics and mathematics background did help, since the process of learning through proofs and formulas is something I’m comfortable with. After the internship, I was accepted for a Fellowship at Insight Data Science, which is a bootcamp for PhDs or postdocs to transition into Data Science. This transition was quite intense; there is a whole new set of tools to learn and new intuition to build. Learning with others, especially those with a different background but similar level, as I did with the Insight bootcamp, was key to transitioning very quickly.
For others thinking of making this transition, I would advise that understanding the original problem that an algorithm was designed to solve can really help with intuition. Furthermore, it is important not to forget that machine learning is statistical learning, and understanding statistics well is what separates hackers from data scientists. Finally, a healthy dose of skepticism never hurt anyone.
What are you most passionate about in AI?
For better or worse, AI will have a profound influence on our lives, as well as on how we interact with the world around us. Whenever something is both complex and influential, it is very important for people who understand these algorithms deeply to warn against their misuse, as well as to advocate for their benefits.
Currently, I am very interested in understanding why deep learning works so well. There are ideas from statistical physics that are being used to tackle this very deep question, such as renormalisation and entanglement. The latter has given rise to tensor networks, which can provide a very direct insight into how learning happens at different scales.
Can you give us an overview of your implementation in the Challenge?
My implementation of “Concentration of Multilinear Functions of the Ising Model with Applications to Network Data” was a statistical check of whether a grid representing social network data can be described by the Ising model, a well-known model in physics describing (simplified) magnetic interactions.
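To make the idea of such a statistical check concrete, here is a minimal sketch in Python. It is not taken from the paper or from Marianne's code; the function names, the choice of statistic (the sum of nearest-neighbour spin products), the Glauber-dynamics sampler, the periodic boundaries, and all parameter values are assumptions made purely for illustration. The sketch compares the statistic on an observed grid of +1/-1 values against its distribution under a simulated Ising model and reports a Monte Carlo p-value.

```python
import numpy as np

def local_statistic(grid):
    """Sum of products of nearest-neighbour spins (periodic boundaries assumed)."""
    right = np.roll(grid, -1, axis=1)
    down = np.roll(grid, -1, axis=0)
    return int(np.sum(grid * right) + np.sum(grid * down))

def glauber_sample(n, beta, sweeps, rng):
    """Approximate draw of an n x n grid of +/-1 spins via Glauber dynamics."""
    grid = rng.choice([-1, 1], size=(n, n))
    for _ in range(sweeps * n * n):
        i, j = rng.integers(0, n, size=2)
        # Sum of the four neighbouring spins, wrapping around the edges.
        field = (grid[(i - 1) % n, j] + grid[(i + 1) % n, j]
                 + grid[i, (j - 1) % n] + grid[i, (j + 1) % n])
        # Conditional probability that the chosen spin is +1.
        p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * field))
        grid[i, j] = 1 if rng.random() < p_up else -1
    return grid

def monte_carlo_p_value(observed, beta, n_samples=200, sweeps=20, seed=0):
    """Two-sided Monte Carlo p-value for the observed grid under the Ising null."""
    rng = np.random.default_rng(seed)
    n = observed.shape[0]
    obs_stat = local_statistic(observed)
    null_stats = np.array([
        local_statistic(glauber_sample(n, beta, sweeps, rng))
        for _ in range(n_samples)
    ])
    centre = null_stats.mean()
    extreme = np.abs(null_stats - centre) >= abs(obs_stat - centre)
    return (1 + int(extreme.sum())) / (1 + n_samples)
```

A small p-value would suggest that the observed grid is unlikely under the assumed Ising model at the chosen inverse temperature `beta`; a fixed number of sweeps only approximates the model's stationary distribution, so this is a rough check rather than an exact test.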
Were there any challenges while implementing your selected paper?
The most challenging part was reading the paper carefully for hints on subtle decisions, such as what kind of boundary conditions were chosen.
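As an illustration of why a detail like boundary conditions matters, the short Python sketch below counts nearest-neighbour spin products on a grid under two common conventions: periodic (the grid wraps around at its edges) and free (edge sites simply have fewer neighbours). The function name and the two options are hypothetical and chosen for this example only; the source does not say which convention the paper or the implementation uses.

```python
import numpy as np

def neighbour_pair_sum(grid, boundary="periodic"):
    """Sum of nearest-neighbour spin products under a given boundary convention."""
    if boundary == "periodic":
        # Every site has a right and a lower neighbour, wrapping at the edges.
        horizontal = grid * np.roll(grid, -1, axis=1)
        vertical = grid * np.roll(grid, -1, axis=0)
    elif boundary == "free":
        # The last column and last row have no neighbour beyond the edge.
        horizontal = grid[:, :-1] * grid[:, 1:]
        vertical = grid[:-1, :] * grid[1:, :]
    else:
        raise ValueError(f"unknown boundary condition: {boundary!r}")
    return int(horizontal.sum() + vertical.sum())

grid = np.ones((4, 4), dtype=int)
print(neighbour_pair_sum(grid, "periodic"))  # 32 neighbour pairs
print(neighbour_pair_sum(grid, "free"))      # 24 neighbour pairs
```

On the same 4 x 4 grid the two conventions give different values, so a statistic computed under one convention cannot be compared directly with simulations run under the other.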
What’s next for you in your work?
I am looking forward to the next paper to implement, particularly papers that seek to borrow intuition from physics to understand why deep learning works so well.
Marianne received her PhD in Applied Mathematics from King’s College London, and an MSc in Theoretical Physics from the University of Amsterdam. Currently based in New York, Marianne is a Data Scientist at Arena. To keep in touch with Marianne, check out her GitHub.