Ground-to-Cloud Data Science puts Machine Learning at your Fingertips

Steven Astorino
Inside Machine learning
8 min read · Oct 6, 2017

The potential to change the world for the better.

There’s no doubt in my mind that machine learning (ML) as part of a data science strategy can help revolutionize many aspects of everyday life. Below I highlight a few examples of how different industries are able to leverage machine learning for competitive differentiation and customer benefit.


Tens of thousands of journal articles and papers are published around the world every day, and it is impractical for every clinician to read and absorb them. ML can help identify patterns and correlations that humans alone would otherwise miss, omissions that can lead to suboptimal diagnoses and treatment plans. In these circumstances ML can help save lives by predicting more accurately whether a course of action or a change in treatment plan might enhance a patient’s quality of life, by helping to redesign a cancer treatment, or by identifying a gene or genetic disorder that renders a particular treatment unsuitable (see this Bloomberg article).


Humans alone cannot possibly analyze every transaction or trade in real time, given the billions of transactions that happen every second. Identifying fraud with a set of static rules that might be updated only once a month lets cyber criminals exploit weaknesses in banking and trading systems. Business managers and investors may spot patterns and correlations that reflect business opportunities much later than they could have; in fact, they may miss them altogether. ML can help predict more accurately and take prescriptive actions, whether to identify and prevent fraud or to identify profit and growth opportunities and exploit them ahead of the competition. Some of these opportunities may last only moments, depending on the market or the individual customer. Machine learning can also help remove personal bias from decisions such as mortgage or loan approvals, which can help with accountability from a compliance perspective.
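To make the contrast between a static rule and a data-driven one concrete, here is a minimal sketch. It is not how any bank or IBM product detects fraud: the fixed limit, the sample history and the function names are all assumptions made for illustration, and a simple per-customer z-score stands in for a trained model.

```python
from statistics import mean, pstdev

# Hypothetical fixed rule: flag anything above a limit that is rarely updated.
STATIC_LIMIT = 1000.0

# Hypothetical spending history for one customer (illustrative numbers only).
HISTORY = [20.0, 25.0, 22.0, 18.0, 30.0, 24.0, 21.0]


def static_rule(amount):
    """Flag a transaction only if it exceeds the fixed limit."""
    return amount > STATIC_LIMIT


def fit_profile(history):
    """Learn a customer's typical spend (mean and spread) from past amounts."""
    return mean(history), pstdev(history)


def learned_rule(amount, profile, z_threshold=3.0):
    """Flag a transaction that sits far outside the customer's own pattern."""
    mu, sigma = profile
    if sigma == 0:
        # No variation in the history: anything different is unusual.
        return amount != mu
    return abs(amount - mu) / sigma > z_threshold


PROFILE = fit_profile(HISTORY)
```

With this history, a 400.00 purchase passes the static rule but is flagged by the learned profile, while a 23.00 purchase passes both. Refitting the profile as new transactions arrive is what lets the check keep up with changing behavior.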


Mass marketing campaigns seem to be becoming a thing of the past due to low ROI. The broad-brush approach of mass mailing an entire customer segment is being replaced by more targeted, individual campaigns that use machine learning to learn about the likes, needs and sentiment of each customer. In this way, media companies are able to insert targeted advertisements between scheduled programs based on each customer’s historical patterns, potentially resulting in greater ROI.

Online Retail

We each leave a digital footprint when online. Retail companies are able to leverage machine learning to predict our propensity to buy other items based on historical purchase patterns, engagement and viewing of online merchandise, and then target individual customers with personalized offers they can’t refuse. The result should be a far better customer experience with repeat business.
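As a rough illustration of predicting propensity to buy from historical purchase patterns, the sketch below estimates how often a candidate item appears in baskets that already contain a given item. The baskets, item names and function names are invented for this example, and plain co-occurrence counting stands in for whatever model a retailer would actually train.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase history: one list of items per past order.
HISTORY = [
    ["tent", "sleeping bag", "stove"],
    ["tent", "sleeping bag"],
    ["tent", "lantern"],
    ["stove", "lantern"],
]


def co_purchase_counts(baskets):
    """Count baskets per item and baskets per (item, other item) pair."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)
        item_counts.update(items)
        for a, b in combinations(sorted(items), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    return item_counts, pair_counts


def propensity(item_counts, pair_counts, bought, candidate):
    """Share of baskets containing `bought` that also contain `candidate`."""
    if item_counts[bought] == 0:
        return 0.0
    return pair_counts[(bought, candidate)] / item_counts[bought]


def recommend(baskets, bought, top_n=2):
    """Rank the other items by estimated propensity, highest first."""
    item_counts, pair_counts = co_purchase_counts(baskets)
    scores = {
        item: propensity(item_counts, pair_counts, bought, item)
        for item in item_counts
        if item != bought
    }
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
```

Calling `recommend(HISTORY, "tent")` ranks "sleeping bag" first, because two of the three baskets containing a tent also contain one. A production system would add engagement and browsing signals, which this sketch leaves out.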


Today’s cars are becoming sophisticated interconnected machines that generate ever greater volumes of data about our driving habits. Embedding machine learning into vehicles that are linked to other devices (IoT) helps optimize the driving experience by advising the optimal route at any point in the journey based on traffic congestion, preferred routes, scenery, time, ecology, weather conditions, gas consumption or even routes via your favorite coffee shop or grocery store. By linking the contents of your refrigerator with your personal calendar and current GPS location, ML can advise the driver or passengers of the need to reroute the journey via the nearest supermarket to purchase groceries for tomorrow’s dinner party.


At IBM, ML is being used to help provide better customer support and service by monitoring progress on customer issues and satisfaction. The system is able to look at faults raised by customers and compare them with similar issues, taking into account how long each issue has been open, the people engaged in resolving the fault, customer comments and sentiment, and how these affect IBM’s key metrics and satisfaction ratings. It can also help identify any existing or potential quality issues. Simply by speaking with Ginger, a conversation bot, support staff can home in on issues and their causes.

Many more examples apply equally across utilities, manufacturing, government, defense and telecommunications; they can be found here by selecting “cognitive” as the category.

Hybrid Cloud and Data — The Need for Data Science Everywhere.

As part of an organization’s hybrid cloud strategy, customers need the flexibility to take existing data science assets and investments (data, applications, tools) and deploy them wherever it makes best business sense — whether on-prem, private cloud, public cloud or a mixture of all three. Data movement will naturally occur as the boundaries of an organization’s hybrid cloud morph and grow. How that data movement is managed is key and must be transparent. It follows also that governance needs to be transparent across the hybrid environments ensuring the same levels of control, security and lifecycle of all data and applications.

While this might seem like a lot to tackle, IBM Cloud Data Services provides a hybrid, open source-based approach that application developers, data scientists and IT architects can leverage to help address their data-intensive needs and deliver both immediate and longer-term benefits. In particular, IBM Information Server on Cloud can help address many data integration challenges across hybrid environments. By running on technologies like Bluemix, organizations can manage, integrate and govern their data assets across private cloud, public cloud and on-premises environments. Every organization needs to think about security. IBM Cloud Security solutions can help protect organizations from potential security breaches and threats with sophisticated access management and protection of data, applications and infrastructure, as well as advanced security monitoring, breach prevention, audit, intelligence and compliance.

Successful Data Science includes Consumability, Ease of Use, Flexibility and Availability.

A key success factor for ML adoption and pervasiveness is delivering the technology in a consumable, easy-to-embrace way across the hybrid environments described above.

The IBM Data Science Experience (DSX) is a collaborative visual tool that puts advanced machine learning capabilities in the hands of the masses without the need to be an expert data scientist. I know this from first-hand experience and encourage you to take a look at this short video.

So, if you are a developer, data scientist or data engineer who just wants to try machine learning, DSX can deliver ML across numerous environments: Cloud, Local and Desktop. These give the end user the option of a full public cloud deployment and management of data and models, an on-prem or private cloud environment, or a desktop environment for users on the go who want to create and test models and deploy them later, once they have connectivity to their corporate infrastructure.

In addition, DSX Developer Edition on IBM Cloud Private and DSX Local aim to provide the combined benefits of developing and running workloads in a public cloud with the control of a private cloud, while leveraging reduced-footprint options of OpenStack or VMware. The customer will have the choice of managing their ML infrastructure themselves or using a service managed by IBM. In recognition of its design for IBM Data Science Experience, IBM Design in San Francisco has been awarded this year’s Red Dot Award for Communication Design.

ML has become pervasive throughout the IBM portfolio. IBM Db2, Db2 for z/OS, dashDB, the recently announced IBM Integrated Analytics System, Db2 Analytics Accelerator for z/OS, Hortonworks, Watson Explorer, Power, Project Event Store, IBM Deep Decisioning (in beta), Linux on Z, Bluemix, Cognos Analytics, SPSS and PureApplication, as well as DSX mentioned above, all embed or exploit ML capabilities. This breadth of DSX- and ML-enabled offerings can help customers map out their data science strategy and entry points by starting with their most pressing cognitive needs and adopting others when necessary (see Figure #1).

Figure #1: Infusing DSX and ML for a more complete data science strategy.

A Closer Look at some of the entry points.

The IBM Integrated Analytics System (IIAS) is built on the latest IBM Power 8 technology. It is designed for massively parallel performance, leveraging in-memory BLU columnar processing with dynamic movement of data from storage. It skips the processing of irrelevant data, and patented compression techniques preserve order so that data can be processed without decompressing it. IIAS is preconfigured with DSX. DSX Local instances from an expanded IIAS can be joined to create a larger DSX Local cluster to support additional users. Spark is embedded into the core engine and is therefore co-located on the same data node, which removes unnecessary network and hardware latencies.
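The idea that order-preserving compression lets data be processed without decompressing it can be illustrated with a small sketch. This is not IBM's patented technique; it is a generic sorted-dictionary encoding, with made-up names and data, chosen only because it shows why preserving order matters: a range filter can compare the compact codes directly.

```python
from bisect import bisect_left, bisect_right

# Hypothetical column of values to compress (illustrative data only).
PRICES = [40, 15, 99, 15, 62, 40]


def build_dictionary(values):
    """Sorted distinct values; a value's code is its position in this list,
    so comparing two codes gives the same answer as comparing the values."""
    return sorted(set(values))


def encode(column, dictionary):
    """Replace each value with its small integer code."""
    return [bisect_left(dictionary, value) for value in column]


def rows_in_range(codes, dictionary, low, high):
    """Return row indices where low <= value <= high, comparing codes only."""
    low_code = bisect_left(dictionary, low)
    high_code = bisect_right(dictionary, high) - 1
    return [i for i, code in enumerate(codes) if low_code <= code <= high_code]


DICTIONARY = build_dictionary(PRICES)
CODES = encode(PRICES, DICTIONARY)
```

Here `rows_in_range(CODES, DICTIONARY, 30, 70)` translates the bounds into codes once and then scans only the encoded column, never converting rows back to their original values.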

For those using the highly popular Cognos Business Intelligence suite of products, DSX integration with Cognos BI makes it possible for business analysts, data scientists and data engineers to create dashboards to visualize results and visually analyze data to derive insight. Visualizing data at different stages of the analytical life cycle, whether to understand the type of data that is available, to perform data exploration and data cleansing, or to derive hypotheses or visualize models, can help drive deeper understanding and ultimately result in smarter business outcomes.

Leveraging visualizations and dashboards for storytelling is also an important way of representing results and driving engagement with line-of-business users and executives. Storytelling is a structured approach to communicating data insights that combines three key elements: data, visuals and narrative. Additionally, generating live dashboards helps in exploring different what-if scenarios so that decisions can be made much more quickly and in real time.

Predicting what might happen next is valuable, but knowing the optimal decision in order to take prescriptive action is equally important. IBM Deep Decisioning (see Figure #2) provides this capability as an add-on to DSX Local and as part of a DSX Notebook experience. It offers new dashboards for non-programmers as well as a cognitive assistant to add rules and conditions, pipelines and extended IDE/modeling.

Figure #2: DSX ML with IBM Deep Decisioning

Your Move.

Wherever an organization may be in its data science or cognitive strategy, the IBM Data Science Experience and IBM hybrid cloud capabilities can help data scientists, data engineers and developers embrace machine learning regardless of skill level. I tried it myself. So your next move should be to click here and experience the power of DSX today as part of your hybrid cloud data science strategy.

Steven Astorino, Vice President of Development, Private Cloud Platform and z Analytics

Follow me on Twitter @astorino_steven

  1. IBM may withdraw anything that is considered future at any time without notice.


