Think 2019 — Part 1
Written by Sasha Lazarevic on February 26, 2019
Please note: I work for IBM, but everything I write in this post represents only my personal opinion and is not an official statement of IBM or any of its entities. If you are considering buying any IBM products, please consult with our sales representatives.
Think 2019
February 12–15 was the week of Think 2019, according to some analysts the most important tech event of the year. IBM brought together all of its major individual exhibitions into this conference, and this was the second edition of Think, now the main annual IBM event. This time it took place in San Francisco. I attended as one of the speakers, presenting Watson Machine Learning and Watson Studio, and in my session I also hosted a guest from Anyline, an AI company from Austria.
The conference was huge and successful, not only in terms of the number of participants (around 30'000) and sessions (more than 3'000), but also in the quality of the discussions and topics. IBM events are not so much about announcements, but I will list and describe the most important ones in this post, along with highlights of the most recent and interesting technologies.
But most important were the speeches and discussions around IBM's strategy and intentions. CEO Ginni Rometty opened the conference by stating that Chapter Two of the digital transformation (which starts now) is all about scaling digital and AI, and moving the mission-critical workloads of large organizations to the cloud. She said that the future is multicloud, that many workloads will never move to the public cloud, and that the public/private cloud ratio will be around 60:40 in less regulated industries and 40:60 in more regulated ones. What I saw later made me confident that IBM is well prepared for this Chapter Two.
AI and Data
Let me start with the dominant subject of Think 2019: AI and Data. The breaking news from the conference was the Watson Anywhere announcement: the Watson AI platform is going to be available on AWS, Azure and Google clouds, as well as on-premise. Users who already have their applications and data with these cloud providers will be able to benefit from the full portfolio of Watson tools. I remember the "Bring AI to the Data" initiative from several months ago, and this is just a logical extension of that approach. I think this is excellent news for our customers and for IBM.
Next, four flagship AI products received very high visibility during the conference:
- Watson Assistant
- Watson Studio
- ICP4Data, and
- Watson OpenScale
Watson Assistant is already a very well-known tool, used by a large number of companies to improve their customer experience through a modern and sophisticated conversational interface, together with other NLP and speech-related services. Some new features had already been announced several months earlier; what was showcased during Think 2019 were numerous examples of how organizations from various industries implement and benefit from this tool. We also heard that VMware deployed Watson Assistant, together with Natural Language Classifier (NLC) and Natural Language Understanding (NLU), to help their support professionals and improve the satisfaction of their hundreds of thousands of customers.
Watson Studio is an integrated development platform for data scientists and ML developers that has received a lot of attention and visibility over the last 12 months. I remember a recent Gartner report featuring Watson Studio as one of the most modern data science platforms. The new version includes very nice visualization tools based on Cognos, and features like NeuNetS for neural network synthesis and AutoAI for automatic feature generation, algorithm selection, and hyperparameter optimization with a view of the progress tree. There is also integration with Watson OpenScale.
Watson Studio is now available in Cloud, Local (on-premise) and Desktop (on PC) editions.
The ICP4Data (IBM Cloud Private for Data) platform is the equivalent of Watson Studio installed on-premise (you need a 16-core server). It has even more capabilities for data transformation, including, for example, data virtualization: for some use cases you don't need to move the data into ICP4Data at all. The data can stay where it is, on remote systems. This gives you better performance when, for example, your analytics algorithms select only a small portion of the data, compared to moving it all over. There was a lot of interest in ICP4Data during the conference. This pair of tools, Watson Studio on Cloud and ICP4Data on-premise, is actually the cornerstone of IBM's AI and Data strategy: they allow data to be processed where it lives, AI models to be trained in the proximity of the data, and everything to be managed and deployed in a consistent manner.
Watson OpenScale is a new tool, just several months old. It was created to address the need for trust and explainability in AI, especially for models based on deep neural networks, where the machine learns by itself and it is very difficult to identify the logic behind otherwise very accurate predictions. Bias in AI can come from various sources: from the data itself, from flawed models, or from new data arriving during the production phase. OpenScale monitors models for bias while they are in production. When you start using it, you define the protected attributes (gender, race, age, etc.). OpenScale then performs what is called perturbation testing: it varies or withholds values of these attributes and checks whether the model produces abnormal predictions (outcomes, for example, should be distributed roughly 50–50 across genders). If there are deviations in this result, the tool raises an alert for potential bias. OpenScale can identify bias even when it arrives indirectly, through proxy features (film preferences, for example) that are correlated with gender. A very powerful tool.
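To make the perturbation idea concrete, here is a minimal conceptual sketch in Python. This is not how OpenScale is implemented (its internals are not public); it just illustrates the technique: flip the protected attribute and measure how often the prediction changes. The threshold and toy data are my own choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def perturbation_bias_test(model, X, protected_col, threshold=0.05):
    """Flip a binary protected attribute and measure how often the
    model's prediction changes. Conceptual sketch only, not the
    actual Watson OpenScale implementation."""
    X_perturbed = X.copy()
    X_perturbed[:, protected_col] = 1 - X_perturbed[:, protected_col]  # flip 0 <-> 1
    flipped = model.predict(X) != model.predict(X_perturbed)
    flip_rate = flipped.mean()
    if flip_rate > threshold:
        print(f"Potential bias: {flip_rate:.1%} of predictions depend on the protected attribute")
    return flip_rate

# Toy usage: column 0 is the protected attribute, and the labels are
# deliberately biased so the test fires.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1000, 5)).astype(float)
y = (X[:, 0] + rng.random(1000) > 1.0).astype(int)
model = LogisticRegression().fit(X, y)
perturbation_bias_test(model, X, protected_col=0)
```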
The bottom line is that many organizations will create their own ML models, or download open-source models from the internet to reuse, and will eventually end up with many different, non-standard models. Managing these models then becomes very difficult, and that is where Watson OpenScale comes into play.
The IBM AI portfolio is of course much deeper (don't forget Discovery, Visual Recognition, Speech-to-Text and Text-to-Speech, and numerous vertical applications for health, compliance, etc.), but I selected these four tools based on their transformation potential in the context of IBM's new strategy for Chapter Two.
There were also some other very interesting announcements and demonstrations:
AI Fairness 360: This is another bias detection mechanism, one that can be embedded into your own ML pipelines. It includes 30 different bias metrics. You can check here for more information on how it works and how to implement it (a small illustrative sketch follows the links):
https://developer.ibm.com/open/projects/ai-fairness-360/
https://www.youtube.com/watch?v=X1NsrcaRQTE
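To give a flavor of the library, here is a minimal sketch using the open-source aif360 Python package. The tiny hand-made DataFrame and the choice of 'sex' as the protected attribute are my own illustrative choices; see the project pages above for the full API.

```python
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric

# Tiny illustrative dataset: 'sex' is the protected attribute (1 = privileged).
df = pd.DataFrame({
    'sex':   [1, 1, 1, 1, 0, 0, 0, 0],
    'score': [7, 5, 8, 6, 4, 5, 3, 6],
    'label': [1, 1, 1, 0, 0, 1, 0, 0],
})

dataset = BinaryLabelDataset(
    df=df,
    label_names=['label'],
    protected_attribute_names=['sex'],
    favorable_label=1,
    unfavorable_label=0,
)

metric = BinaryLabelDatasetMetric(
    dataset,
    privileged_groups=[{'sex': 1}],
    unprivileged_groups=[{'sex': 0}],
)

# Statistical parity difference: P(favorable | unprivileged) - P(favorable | privileged).
# Values near 0 suggest parity; large negative values suggest bias
# against the unprivileged group.
print('Statistical parity difference:', metric.statistical_parity_difference())
print('Disparate impact ratio:', metric.disparate_impact())
```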
Adversarial Robustness Toolbox: This is an excellent tool for security hardening of your ML models. It applies different attacks to your model and helps you improve its robustness. You can check how it works and download the libraries here (a small sketch follows the link):
https://github.com/IBM/adversarial-robustness-toolbox
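As a flavor of how the toolbox is used, here is a minimal sketch with the open-source art Python package: wrap a trained model, craft adversarial examples with the Fast Gradient Method, and compare accuracy. The eps value and the iris dataset are my own illustrative choices, and the module paths are those of recent ART releases (older versions organized the modules differently).

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import FastGradientMethod

# Train a plain scikit-learn classifier.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Wrap it for ART and craft adversarial examples with the Fast Gradient Method.
classifier = SklearnClassifier(model=model)
attack = FastGradientMethod(estimator=classifier, eps=0.5)
X_adv = attack.generate(x=X)

# The drop in accuracy shows how fragile the unhardened model is.
clean_acc = (model.predict(X) == y).mean()
adv_acc = (model.predict(X_adv) == y).mean()
print(f'Accuracy on clean data:       {clean_acc:.2%}')
print(f'Accuracy on adversarial data: {adv_acc:.2%}')
```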
Project Debater
This was a great show. IBM showcased an AI solution that can debate humans on various topics and make arguments to persuade in a discussion. The show was actually a competition between Project Debater and a human, Harish Natarajan, the world champion in political and social debates. The debate was organized around the topic "Should preschools be subsidized by government?". Project Debater argued for, and Harish against. The audience voted on the issue before the debate and voted again after it, and the winner was the one who managed to persuade more people to change their minds to his or her side.
Project Debater lost. But the capabilities this machine demonstrated were amazing. There were three rounds in which each opponent could present their points and refute the points of the other side. Project Debater went first. All of its sentences were completely correct and relevant to the topic, the structure of the speech was natural and effective, and the argument was laid out in a very rational, comprehensive and structured way. And not only that: after explaining why, Project Debater started bringing up dozens of relevant scientific studies, statistics, research reports, and hard data on the subject. A multitude of data to support its position. It was simply amazing, and simply impossible for a human to argue with a machine that knows everything that can be known on the subject.
The second and third rounds were less successful for Project Debater. Harish didn't know nearly as much statistical and scientific data about the topic, but he argued, well, the way we humans do: using contrasting passages, finding holes and exceptional cases. He began his speech with "Yes, I agree with one point, but I disagree about the others" and similar rhetorical figures. Project Debater was not able to follow these subtle nuances of human speech, where people expand the subject, then change the subject, and make it look as if your arguments have suddenly become irrelevant.
I don't know what algorithms are behind Project Debater, but it looks like it lacks a sophisticated attention mechanism that could identify these rhetorical constructions and prepare the right arguing strategies. To argue with humans, you need very sophisticated algorithms. We are getting there, but it takes time.
But even though Project Debater didn't win, it surprised everyone. It was an amazing achievement. The competitors had 4 minutes to prepare for the next round. In those 4 minutes, Project Debater had to transcribe Harish's speech with speech-to-text, analyze the text, find new arguments and prepare its speech, and then synthesize speech from that text. And all of that had to be delivered without any grammatical or semantic inconsistencies.
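Project Debater's internals are not public, so the following is only a conceptual sketch of that 4-minute pipeline. The speech stages use the public ibm_watson Python SDK (credentials and URLs are placeholders); find_counterarguments() and compose_rebuttal() are purely hypothetical stand-ins for the argument-mining and speech-composition steps that IBM has not published.

```python
from ibm_watson import SpeechToTextV1, TextToSpeechV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Placeholders: substitute your own credentials and service URLs.
stt = SpeechToTextV1(authenticator=IAMAuthenticator('YOUR_API_KEY'))
stt.set_service_url('https://api.us-south.speech-to-text.watson.cloud.ibm.com')
tts = TextToSpeechV1(authenticator=IAMAuthenticator('YOUR_API_KEY'))
tts.set_service_url('https://api.us-south.text-to-speech.watson.cloud.ibm.com')

def find_counterarguments(transcript):
    """Hypothetical stand-in: the real argument mining is not public."""
    return []

def compose_rebuttal(arguments):
    """Hypothetical stand-in: the real speech composition is not public."""
    return "I respectfully disagree, and here is why..."

def debate_round(opponent_audio_path):
    # 1. Speech-to-text: transcribe the opponent's speech.
    with open(opponent_audio_path, 'rb') as audio:
        result = stt.recognize(audio=audio, content_type='audio/wav').get_result()
    transcript = ' '.join(r['alternatives'][0]['transcript'] for r in result['results'])

    # 2-3. Analyze the transcript and compose a rebuttal (the hard, non-public part).
    rebuttal_text = compose_rebuttal(find_counterarguments(transcript))

    # 4. Text-to-speech: synthesize the rebuttal for delivery.
    response = tts.synthesize(rebuttal_text, accept='audio/wav',
                              voice='en-US_MichaelV3Voice').get_result()
    return response.content
```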
What are the potential use cases for this AI debating engine?
Some analysts say: marketing and sales. It can sell products or services better than humans can. The same goes for other professional and consulting domains, as it can help justify decisions with hard data and scientific results. If you have an hour and would like to watch the full debate, check here.
But maybe we should be careful about whom we give such a tool to, as it is really powerful.
To conclude this section: AI was the central topic of Think 2019, and it is becoming the central topic for most of IBM's customers. This is where we want to deliver the most business value. And to do that, IBM first needs to solve the biggest problem of Chapter Two: the migration of mission-critical workloads to the cloud.
Continue to Part 2 of this story to understand IBM’s Cloud strategy.