I recently did something I don’t normally do: I collaborated on an academic paper with my father, Donald Gillies. He is a philosopher of science, and the paper we wrote originated in the mid-90s, when he published a book on AI and scientific method that summarised the state of the art of AI at the time and how it might affect how we think about science.
This year he was asked to present a paper about his work on AI, but since he has been working on other things (mostly the philosophy of medicine) in the intervening decades, and clearly, AI has moved on a bit since the 90s, he asked me to help update it. We were supposed to present the paper at the Conference on the Philosophy of Science today: Seven perspectives (XXV Conference on Contemporary Philosophy and Methodology of Science), Ferrol, Spain, but that didn’t quite go to plan as Spain went into COVID-19 lockdown during the conference.
One interesting thing is how well the book has aged, given how much has seemingly advanced in AI. One reason is that it identifies machine learning as one of the most interesting aspects of AI. This didn’t seem at all obvious in the 90s, but in recent years AI seems to have become virtually synonymous with machine learning.
Even in 1996, it seemed that machine learning raised some important issues for science, and these are even more important now that AI is increasingly used in practical science. I thought I would share some of the issues discussed in the paper in a few posts, starting with this one.
The problem of Induction
The philosophy of science is a subject that asks deep and important questions about how science works. One of the most important of these questions concerns induction: where do scientific theories come from, and how do we judge their truth?
Induction is an idea about the logic of science. Traditional logic goes back to the ancient Greeks and in particular the philosopher Aristotle.
His logic was based on Deduction: drawing conclusions from facts that we know or assume to be true (called axioms). A classic example is something like:
Axiom: All humans are mortal.
Axiom: Socrates is a human.
Conclusion: Therefore Socrates is mortal.
This is a very rigorous form of reasoning, and it works very well, particularly in mathematics, but it isn’t how science works.
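As a rough illustration of the syllogism above (my own sketch, not something from the paper), deduction can be written as a tiny rule-applying program. The names `facts`, `rules` and `deduce` are made up for this example. The point is that the conclusion is guaranteed, but only because we assumed the axioms were true to begin with.

```python
# Axiom: Socrates is a human (a specific fact we assume to be true).
facts = {("human", "Socrates")}

# Axiom: all humans are mortal (a general rule we assume to be true).
# Each rule reads: if X is <premise>, then X is <conclusion>.
rules = [("human", "mortal")]


def deduce(facts, rules):
    """Keep applying the rules to the known facts until nothing new appears."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for predicate, subject in list(derived):
                new_fact = (conclusion, subject)
                if predicate == premise and new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


conclusions = deduce(facts, rules)
# Conclusion: therefore Socrates is mortal.
print(("mortal", "Socrates") in conclusions)
```

Nothing here looks at any data about the world. The program only unpacks what was already contained in the axioms.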
Science doesn’t have any axioms that we know are true. Instead, it has to draw conclusions from data and evidence that are partial or uncertain. Taking the above example, it is actually quite easy to know that Socrates and Aristotle are mortal (after all, they died a couple of millennia ago), but how do we know that every human is mortal?
We can look at the data. There have been lots of examples of people throughout history: Aristotle, Socrates, Confucius, Cleopatra, and many other famous people, and many more ordinary people right through to the present day. These have all, unfortunately, been mortal. So we can conclude that all people are mortal.
Rather than going from general axiom to specific examples, we start with a lot of specific examples and generalise to a scientific law. The 17th century philosopher Francis Bacon proposed that this method of generalisation from data, called induction, could be the basis for how we can discover laws of nature. It could almost be a mechanical procedure for going from data to law.
Bacon was one of the most influential thinkers in the development of science. In particular, Isaac Newton was very influenced by him and reported that he discovered his mathematical laws of physics by induction from the data of the movement of the planets. Given how important Newton’s work was to the development of science, it looked like induction was the key to how science works.
Induction is a Myth
Unfortunately, the idea of induction didn’t fare well in the centuries following Newton. The Scottish Enlightenment philosopher David Hume was the first to show problems.
He said that it is impossible to prove the truth of a law by induction. However many examples you see, there is no way to be absolutely sure that the next example you see won’t be an exception to your law. The classic example of this problem is that for centuries people in Europe only saw white swans and could reasonably conclude that all swans are white. Only when Europeans arrived in Australia did they see black swans and realise that their “law” of white swans was not true (this is the origin of the phrase “black swan” that has been recently popularised by Nassim Taleb).
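The swan example can be sketched in a few lines of code. This is only an illustration of Hume’s point, with made-up sighting counts and a hypothetical helper called `induce_colour_law`: a “law” induced from a thousand agreeing observations is overturned by a single exception.

```python
def induce_colour_law(observations):
    """Return the one colour shared by every observed swan, or None if they differ."""
    colours = set(observations)
    if len(colours) == 1:
        return next(iter(colours))
    return None


# Centuries of European sightings (the count is invented for illustration).
european_sightings = ["white"] * 1000
law = induce_colour_law(european_sightings)
print("Law induced in Europe:", law)

# One sighting in Australia is enough to break the law.
all_sightings = european_sightings + ["black"]
revised_law = induce_colour_law(all_sightings)
print("Law after Australia:", revised_law)
```

However large the first list is, nothing in it can rule out the observation that arrives later.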
In the 20th century, Karl Popper looked more closely at how science worked. He concluded that induction simply wasn’t how scientists discovered their theories. They do not mechanically induce scientific laws from data. They rely on creative insights to come up with theories and then use experiments to test these theories. The process is creative, not mechanical. He ended up saying that “induction is a myth”:
“Induction, i.e. inferences based on many observations, is a myth. It is neither a psychological fact, nor a fact of ordinary life, nor one of scientific procedure.” Karl Popper. Conjectures and Refutations: The Growth of Scientific Knowledge (1963)
So by the second half of the 20th century the idea of induction was in tatters. The idea that scientists could “mechanically” induce scientific laws from data had little to do with how scientists work and was probably impossible.
But this is where machine learning comes in.
Enter Machine Learning
To recap, Francis Bacon proposed that induction could be a “mechanical” process that went from specific data to general laws. It is almost as if he wanted to create an automatic machine that could take in a lot of example data and create a generalised mathematical model.
That sounds very much like a description of machine learning.
Machine learning algorithms do indeed take in data and produce general mathematical (statistical) models. They can also be used in science. For example, machine learning is used a lot in biology for problems like understanding how proteins fold.
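To make “data in, general model out” concrete, here is a minimal sketch of my own (it is not taken from the paper or the book). It uses planetary data, in the spirit of the Newton story above, and fits a straight line to the logarithms of orbit size and orbital period using ordinary least squares, about the simplest learning method there is. The figures are approximate textbook values, and `fit_line` is a helper written just for this example. The fitted exponent should come out close to 1.5, which is the relationship known as Kepler’s third law.

```python
import math

# Approximate textbook values: (semi-major axis in AU, orbital period in years).
planets = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# A power law T = c * a^k becomes a straight line after taking logs:
# log T = k * log a + log c.
log_a = [math.log(a) for a, _ in planets.values()]
log_t = [math.log(t) for _, t in planets.values()]

exponent, log_constant = fit_line(log_a, log_t)
print(f"Induced law: T = {math.exp(log_constant):.3f} * a^{exponent:.3f}")
```

The program is given six specific examples and returns a general formula that also makes predictions for planets it has never seen. Whether those predictions are right is, of course, a separate question.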
This is what my father realised in his 1996 book: machine learning shows not only that induction is possible, but that it could be used in science. To quote the summary from our paper:
The successes of machine learning programs show that Popper was wrong on this point. However, it should be added that Popper was not so wrong when he made the remark in 1963. The first striking successes of machine learning did not occur until the late 1970s.
AI and the philosophy of science
So it seems that machine learning has an important impact on one of the most important ideas in the philosophy of science.
Popper wasn’t exactly wrong. Everything he said was true of human science; it’s just that computers work differently from people.
Also, nothing in machine learning invalidates Hume’s basic idea. Machine learning cannot prove that its models are true. There is no way of knowing that your data set is completely representative and that you won’t discover future data that invalidates your model. In fact, this is the cause of many problems in machine learning, ranging from failures in deployment right up to racial and gender biases.
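Here is a small sketch of that failure mode, again my own toy example rather than anything from the paper. A straight line is fitted to data collected only in a narrow range, while the hypothetical “true” process (`true_process`, invented for this illustration) is actually a curve. The model looks fine on the data it was built from and goes badly wrong on data from outside that range.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def true_process(x):
    """Hypothetical law of nature, unknown to the model: y = x squared."""
    return x * x


# The data set only covers x between 0 and 1.
train_x = [i / 10 for i in range(11)]
train_y = [true_process(x) for x in train_x]
slope, intercept = fit_line(train_x, train_y)


def predict(x):
    return slope * x + intercept


# Worst error on the data we have seen versus the error on a new, distant case.
in_range_error = max(abs(predict(x) - true_process(x)) for x in train_x)
out_of_range_error = abs(predict(10.0) - true_process(10.0))

print(f"Worst error on seen data: {in_range_error:.2f}")
print(f"Error at x = 10:          {out_of_range_error:.2f}")
```

Nothing in the training data warns the model that its law stops working further out, which is exactly the black swan problem in statistical form.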
What we can be sure of is that machine learning will be increasingly important in data-rich sciences and that these philosophical issues will be a very fruitful research area for those studying the process of science.
You can read the original paper here;