AI-2016 review

How do you tell that you are in Cambridge? Maybe it’s the impressive silhouette of King’s College that you can see from far away. It could also be the above-average proportion of Chinese tourists who come to visit the ‘Farewell to Cambridge’ stone. In my case, it was obvious that I was in Cambridge after discovering a poster advertising an “artificially intelligent chatbot” next to ads for yoga and private language classes.

As you can guess from the introduction, I travelled to Cambridge again, this time to attend the AI-2016 conference. For three days, an international community of academics and practitioners (or a hybrid of both, in my case) gathered in the historic Peterhouse College to exchange ideas in the field of artificial intelligence. In this post, I will summarize the conference and elaborate a bit on the topics I found most interesting.

Four workshops were held on the first day in two parallel streams. I attended the two workshops Data stream mining and Deep learning meets the semantic web. Selected papers were then presented over the next two days in a technical and an application stream. Michael Gleaves from the Hartree Centre gave a very interesting keynote lecture on the factors that affect the adoption of machine learning technologies in industry. While last year’s panel session focused on the risks of AI, this year’s panel session aimed to shed some light on the possible benefits of AI.

Data stream mining

In fast-paced environments, it is crucial to be able to keep up with the velocity of incoming data. This data can originate from social media, machine sensors, or network traffic. A very interesting application was presented by Hugo Hromic from the National University of Ireland. In collaboration with RTÉ, the national broadcaster of Ireland, he investigated whether social media can be leveraged to improve the viewer experience through reduced information overload, adaptive content, and relevant program recommendations. The setting for this case was, however, restricted by the lack of personal viewer data. This is where Twitter comes into play. In a nutshell, user interactions on Twitter are extracted and merged with TV catalog data about programs. The resulting dataset is the foundation for program recommendations and adaptive content, and all of this has to happen in near real time.
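To make the data flow a bit more concrete, here is a heavily simplified sketch of the kind of near-real-time enrichment described above. The data structures, hashtags and matching rule are my own illustrative assumptions, not details from the paper.

```python
# Simplified sketch: match incoming tweets against a TV catalogue by hashtag
# and aggregate per-programme interaction counts as the stream arrives.
from collections import Counter

# Hypothetical TV catalogue keyed by the hashtag used on air (placeholder data).
catalogue = {
    "#latelateshow": {"title": "The Late Late Show", "channel": "RTÉ One"},
    "#faircity": {"title": "Fair City", "channel": "RTÉ One"},
}

interactions = Counter()

def process_tweet(tweet: dict) -> None:
    """Merge a single tweet with catalogue metadata as it arrives."""
    for tag in tweet.get("hashtags", []):
        programme = catalogue.get(tag.lower())
        if programme:
            interactions[programme["title"]] += 1

# Simulated stream of tweets standing in for the real Twitter firehose.
stream = [
    {"text": "Great guest tonight!", "hashtags": ["#LateLateShow"]},
    {"text": "Can't believe that ending", "hashtags": ["#FairCity"]},
]
for tweet in stream:
    process_tweet(tweet)

print(interactions.most_common())  # basis for recommendations / adaptive content
```

In a production setting this matching would of course run on a proper stream-processing engine rather than a Python loop, but the enrichment step itself stays conceptually the same.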

While I definitely welcome approaches that make TV programs more relevant, I think it is also important to consider possible risks. The proposed approach introduces a bias towards a mobile-savvy audience. The workshop paper can be accessed here.

Deep learning meets the semantic web

Dr. Mercedes Arguello Casteleiro from the University of Manchester presented her ongoing research at the intersection of deep learning and ontology learning. In one of her experiments, the primary goal was to extract biomedical terms from a large corpus in an unsupervised manner. Word embeddings were calculated with neural language models based on 14 million PubMed publications. A gold standard was created by extracting and annotating gene and protein names from 25 selected papers, yielding a total of 107 terms. Two methods were then compared to retrieve term variants for the genes and proteins. The first approach returned the 12 most similar terms based on the cosine distance between word embeddings. The second approach leveraged the existing Cardiovascular Disease Ontology (CVDO): the manually annotated terms from the gold standard were expanded with terms from the CVDO, and these additional terms were then used in conjunction with the original terms to retrieve the 12 most similar words from the word embeddings. Domain experts eventually assessed both approaches by classifying each of the 12 retrieved words as a term variant, a partial term variant or a non-term variant. The first approach extracted up to 151 term variants; the second approach returned even more, 194 as evaluated by one of the domain experts. The full details of the experiments can be accessed here.
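To make the two retrieval strategies a bit more tangible, here is a hedged sketch of what they might look like with gensim word vectors. The file name, the example term and the ontology expansion are placeholders of my own, and the exact way the paper combines query terms may well differ.

```python
# Sketch of the two retrieval strategies, assuming word2vec-style embeddings
# trained on PubMed are available in word2vec text format (placeholder file).
from gensim.models import KeyedVectors

embeddings = KeyedVectors.load_word2vec_format("pubmed_embeddings.txt")

# Approach 1: the 12 nearest neighbours of a gold-standard term.
term = "troponin"                       # hypothetical gene/protein term
candidates_1 = embeddings.most_similar(positive=[term], topn=12)

# Approach 2: expand the query with related terms taken from the CVDO ontology.
# How the paper combines the terms exactly is not specified here; averaging the
# query vectors via `positive` is one plausible reading.
cvdo_terms = ["cardiac_troponin_i"]     # placeholder ontology expansion
candidates_2 = embeddings.most_similar(positive=[term] + cvdo_terms, topn=12)

for word, similarity in candidates_1:
    print(f"{word}\t{similarity:.3f}")
```

The retrieved candidates would then go to the domain experts for classification as term variants, partial term variants or non-term variants.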

I really enjoyed this workshop. A problem in a narrow domain is tackled in a creative way involving machine learning and existing human-annotated resources. I think this approach is worth evaluating in other domains as well.

AI and law

Suppose a self-driving car causes an accident: who is liable for the damages? The driver, because he did not intervene but instead chose to watch a movie during the drive? Or is it the car manufacturer? It could also be the company that developed the control software - most probably a deep learning start-up from the valley ;-) But why not the individual developer who deliberately shipped buggy code into the main branch of the software? And assuming that we attribute some kind of consciousness to AI systems, why not treat AI agents as entities in their own right? If we did, how should we hold an AI agent liable in the aforementioned car accident? Should it fall under criminal or civil law? How could we exonerate the AI agent from any malicious intent?

As with every new invention, our legislation needs to adapt in order to keep pace with new situations. In a paper dubbed Artificial Intelligence and Legal Liability, Dr. John Kingston from the University of Brighton shed some light on the legal implications of AI. He did not provide definite answers in his paper, but instead asked the right kind of questions. It would be great if legislators picked these questions up for subsequent discussions. One occasion could be the International Conference on Artificial Intelligence and Law in London in June later this year.

Human learning

It is estimated that up to 10,000 hours of dedicated training are required to achieve expert status in any specific field. This estimate, however, does not always hold, and the question arises why some people learn complex concepts and skills faster or slower than others. How can individual differences in learning speed be explained? Some factors listed in the literature are deliberate practice, intelligence, genetics, motivation and pedagogy.

Dr. Philippe Chassy from Liverpool Hope University ran experiments with artificial neural networks to critically re-evaluate the aforementioned factors. The author assumes that drawing parallels between biological and artificial neural networks is a valid approach. In his experiment, an artificial neural network had to identify different chess strategies. More specifically, his research question was how much the initial state of the network influences the learning speed.

The network has 64 inputs representing the chess board, and each chess piece is encoded as a different integer value: an empty square has a value of 0, a knight 2, the queen 5, and so on. 500 different networks were initialized randomly and trained until a stop performance criterion was reached. The initial state after the random initialization represents the novice level, the state after reaching the stop criterion the expert level. As expected, the performance of the 500 networks at novice level was poor and close to chance. On average, the artificial neural networks reached the expert level after 9.7 epochs. Very interestingly though, this figure ranges between 5 and 23 epochs and is not normally distributed. The author argues that these experimental results replicate the human variance in learning speed on the way to expert level. Drawing inferences about biological neural networks from these results, the differences in learning speed cannot be attributed purely to practice, intelligence, genetics, motivation and pedagogy, but may instead be caused by individual differences in neural wiring. The paper can be accessed here.
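For readers who want a feeling for the experimental protocol, here is a minimal sketch of the general idea: train many randomly initialized networks on the same task and record how many epochs each needs to reach a fixed performance criterion. The stand-in data, the architecture, the stop criterion and the epoch cap below are placeholder assumptions of my own, not the paper’s actual setup.

```python
# Sketch: measure how the random initial state affects the number of epochs
# needed to reach a fixed stop criterion, across many networks.
import warnings
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Placeholder boards: 64 integer-coded squares (0 = empty, other values = pieces).
X = rng.integers(0, 7, size=(2000, 64)).astype(float)
# Placeholder learnable label standing in for the "chess strategy" classes.
y = (X[:, :8].sum(axis=1) > X[:, 56:].sum(axis=1)).astype(int)

STOP_ACCURACY = 0.90      # assumed stop criterion
MAX_EPOCHS = 200          # networks not converging by then are simply skipped
epochs_needed = []

for seed in range(50):    # the paper trained 500 networks; 50 keeps this quick
    net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1, warm_start=True,
                        learning_rate_init=0.01, random_state=seed)
    for epoch in range(1, MAX_EPOCHS + 1):
        with warnings.catch_warnings():          # silence per-epoch warnings
            warnings.simplefilter("ignore")
            net.fit(X, y)                        # one more epoch per call
        if net.score(X, y) >= STOP_ACCURACY:
            epochs_needed.append(epoch)
            break

if epochs_needed:
    print(f"mean {np.mean(epochs_needed):.1f} epochs, "
          f"range {min(epochs_needed)}-{max(epochs_needed)}")
```

Only the random seed differs between the runs, so any spread in the printed range mirrors the kind of initialization-driven variance the paper reports.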

In hindsight, a lot makes sense now. Some of my bad grades at a younger age were not the result of laziness or a lack of motivation, but simply of my neurons being wired a bit disadvantageously ;-)

Panel session

What benefits can, will or should AI deliver to the world in the next 10 years?

In contrast to last year’s panel session, this year’s panel discussion aimed to look at the possible upside of AI. It was, however, funny to observe how quickly we ended up talking about the dangers posed by AI again. With most Western countries facing an ever-ageing population coupled with severe cuts in public spending, a benign use of AI could help overcome many challenges. Things like robots that care for elderly people, neural networks that are able to spot cancer tumors, or apps that detect mental health issues make me feel very positive about a future shaped by AI.

But what if the very same technology is used with malign intent? Autonomous warfare robots in the hands of gangs, terror organisations or states could cause severe havoc. Does this sound too far-fetched? Last October, the U.S. Department of Defense successfully launched 103 autonomous drones that were able to demonstrate behaviours such as collective decision-making, adaptive formation flying, and self-healing. I leave it up to the reader’s imagination what these autonomous drones could do if equipped with weapons.

Alongside the threat of autonomous warfare robots, there is another danger worth highlighting. None other than Bank of England governor Mark Carney recently warned that up to 15 million jobs in the UK alone might be at risk of being replaced by robots. And it is not only blue-collar jobs that are affected. Most recently, the Japanese insurance firm Fukoku Mutual Life replaced 34 of its employees with an AI system.

Daily Mail cover page from Tuesday, 6 December 2016

Some panel members, however, countered that this is a political problem rather than a technological one. Policy makers should ensure that the wealth generated by automation is distributed fairly and that society takes care of those left without jobs. A speaker in the audience - Prof. Max Bramer, if I remember correctly - rightly objected that not once in the past 500 years has wealth been redistributed after the emergence of a disruptive new technology. The panel concluded that the government should not launch a commission on the risks of AI, but rather a commission on wealth equality. I have nothing to add to that.

My presentation

Despite all these doomsday predictions and negative press, I still enjoy doing research in this field. So I presented the results of experiments we conducted on multitask learning. The traditional approach in machine learning is to split a complex problem into simpler ones that are then solved separately. Multitask learning, in contrast, is about solving multiple related problems in parallel. This concept can be illustrated with a simplified example: an insurance company could be interested in cross-selling life insurance to its customers and also in observing whether good customers are at risk of churning. With the traditional approach, two separate models would be created, one for the cross-selling of life insurance and one for the churn prediction. With multitask learning, however, both tasks are solved in parallel with a single model. Perhaps surprisingly, multitask learning works really well and improves the generalization performance of a classifier. It does so because the training signals of related tasks induce a bias that helps to avoid overfitting. The paper is available on Springer.
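For illustration, here is a minimal sketch of how such a shared multitask model could look in Keras for the hypothetical insurance example above. The architecture, feature dimension and dummy data are my own assumptions, not the setup from our paper.

```python
# Sketch of a multitask network: shared layers feed two task-specific heads,
# one for cross-selling and one for churn prediction.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_features = 20                                         # placeholder customer features

inputs = keras.Input(shape=(n_features,))
shared = layers.Dense(32, activation="relu")(inputs)    # representation shared by both tasks
shared = layers.Dense(16, activation="relu")(shared)

cross_sell = layers.Dense(1, activation="sigmoid", name="cross_sell")(shared)
churn = layers.Dense(1, activation="sigmoid", name="churn")(shared)

model = keras.Model(inputs=inputs, outputs=[cross_sell, churn])
model.compile(optimizer="adam",
              loss={"cross_sell": "binary_crossentropy",
                    "churn": "binary_crossentropy"})

# Dummy data: both labels are learned jointly from the same customer features.
X = np.random.rand(1000, n_features)
y_cross_sell = np.random.randint(0, 2, size=(1000, 1))
y_churn = np.random.randint(0, 2, size=(1000, 1))
model.fit(X, {"cross_sell": y_cross_sell, "churn": y_churn}, epochs=5, verbose=0)
```

The shared layers are where the inductive bias comes from: both tasks pull the common representation towards features that are useful for customer behaviour in general, which is what helps against overfitting to either task alone.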