Machine Learning and Cognitive Systems, Part 2: Big Data Analytics

In the first part of this series, I described what machine learning is and its potential to become a mainstream technology in enterprise software, serving as the basis for further advances that incorporate related technologies in artificial intelligence and cognitive computing. I also briefly mentioned how machine learning is becoming increasingly important for many companies in the business intelligence and analytics industry.
 
 In this post I will discuss further the importance that machine learning already has and can have in the analytics ecosystem, especially from a Big Data perspective.
 
 Machine learning in the context of BI and Big Data analytics
 
 Just as in research and other areas, one of the reasons machine learning has become extremely important and useful in enterprise software is its ability to deal not just with huge amounts of data and extract knowledge from it — which can to some degree be addressed with disciplines such as data mining or predictive analysis — but also with complex problems in which the algorithms used need to adapt to frequently changing conditions. This is the case for successful applications of machine learning techniques in software such as spam detection, Amazon's work to automate employee access control, or Cornell's work on protecting animals.
 
 But the incorporation of machine learning techniques within enterprise software is rapidly expanding to many other areas of business, especially those related to business intelligence and analytics, or more generally the decision support framework of an organization. As I mentioned in Part 1, as information collection increases in volume, velocity, and variety (the three Vs of Big Data) and as business pressures to expedite analysis and decrease its latency grow, new and existing business software solutions are incorporating improved ways to analyze these large and complex data sets and, most importantly, furthering the reach of what analytics and BI solutions can do.
 
 As data sources become increasingly complex, so do the means of analyzing them, and the maturity model of the BI and analytics platform is forced to accommodate this complexity and expand to the next level of evolution — and sometimes even revolution — of the decision-making process. The role of a BI and analytics framework is thus changing from being solely a decision support companion to a framework that can trigger decision automation. To illustrate this, I have taken the standard BI maturity model from TEC's BI Maturity and Software Selection Perspectives report (Figure 1), which shows in simple form some of the pressures that this complexity puts on the maturity process. As a consequence, the process is expanded into a two-phase decision-making process, which implies giving the system an increased role in the decision.

Figure 1. Standard BI maturity model is being expanded by complexity of data and processes

The decision phase can happen in two ways: as a supported decision made by users, or by delegating the decision to the system itself, automating the decision-making process based on previous analysis and letting the system learn and adapt. By delegating the decision to the system, the process extends the reach of analytics to predictive analysis, early warning messaging, and data discovery.
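
To make the two decision modes concrete, here is a minimal sketch, assuming a scikit-learn classifier and an arbitrary confidence threshold, of how a system might decide automatically when it is confident and route the case to a human analyst otherwise. The model, threshold, and feature values are illustrative assumptions, not any vendor's actual implementation.

```python
# A minimal sketch of the two-phase decision flow described above:
# the system scores each case with a trained model and either decides
# automatically (high confidence) or routes the case to a human analyst.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training data: two numeric features per historical case, binary outcome.
X_train = np.array([[0.2, 1.0], [0.4, 0.8], [0.9, 0.1], [0.8, 0.3]])
y_train = np.array([0, 0, 1, 1])

model = LogisticRegression().fit(X_train, y_train)

AUTO_DECIDE_THRESHOLD = 0.90  # assumed confidence cutoff for automation

def decide(case_features):
    """Return an automated decision when the model is confident enough,
    otherwise defer to a human (assisted decision)."""
    proba = model.predict_proba([case_features])[0]
    confidence = float(proba.max())
    if confidence >= AUTO_DECIDE_THRESHOLD:
        return {"decision": int(proba.argmax()), "mode": "automated",
                "confidence": confidence}
    return {"decision": None, "mode": "route_to_analyst",
            "confidence": confidence}

print(decide([0.85, 0.2]))  # likely decided automatically
print(decide([0.50, 0.5]))  # likely routed to a human
```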
 
 At this stage we might find more permutations of analytics platforms and frameworks that combine both assisted and automated decisions, ideally increasing the effectiveness of the process and streamlining it (Figure 2).

Figure 2. Standard BI maturity model expands to be able to automate decisions

In this context, driven by new requirements from different directions, especially Big Data sources in which systems deal with larger and more complex sets of data, BI and analytics platforms often become hubs of dynamic information that changes in volume, structure, and value over time.
 
 In many cases decisions are still made by humans, but with software assistance to varying degrees. In some more advanced cases, decisions are made by the system with no human intervention, triggering the evolution of analytics systems, especially in areas such as decision management, and closing the gap between analytics and operations, which can mean tighter relations between the operations, management, and strategy of an organization.
 
 Opportunities and challenges
 
 The opportunities for implementing machine learning within the context of Big Data, and especially Big Data analytics, are enormous. From the point of view of decision support, it can enhance the complete decision management cycle by

  1. Enhancing existing business analytics capabilities, such as data mining and predictive analytics, enabling organizations to address more complex problems and improve the precision of the analysis process.
  2. Enhancing the level of decision support by giving the system greater ability to perform adaptive data discovery, such as detecting patterns, enabling more advanced search capabilities, and reinforcing knowledge discovery by identifying correlations, much along the same lines as what data mining and predictive analytics can do.
  3. Boosting the incorporation of early detection capabilities within traditional or new BI and analytics systems, a key component for modern organizations that want to anticipate or detect short-term trends that might have great impact on the organization (see the sketch after this list).
  4. Enabling a system to make autonomous decisions, at least at early stages, to optimize the decision process in cases where the application can decide by itself.
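
As a minimal illustration of the early detection idea in point 3, the sketch below flags unusual readings in a stream of metric values with an unsupervised anomaly detector; the library choice (scikit-learn), the metric, and the contamination setting are assumptions for illustration, not a prescription for any particular product.

```python
# A minimal early-detection sketch: an unsupervised model learns what
# "normal" historical metric values look like and flags new observations
# that deviate from that pattern, which could feed an early-warning alert.

import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Historical metric values (e.g., daily order volume), mostly stable.
history = rng.normal(loc=1000, scale=50, size=(500, 1))

detector = IsolationForest(contamination=0.01, random_state=42).fit(history)

# New observations arriving from the Big Data pipeline.
new_values = np.array([[1010.0], [985.0], [1450.0]])  # the last one is unusual
flags = detector.predict(new_values)  # 1 = normal, -1 = anomaly

for value, flag in zip(new_values.ravel(), flags):
    status = "ALERT: possible short-term trend shift" if flag == -1 else "normal"
    print(f"value={value:.0f} -> {status}")
```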

Many organizations that already use machine learning can be considered to be exploiting the first level of this list — improving and enabling the analysis of large volumes of complex data. A smaller number of organizations can be considered to be transitioning to the subsequent levels of Big Data analysis using machine learning.
 
 At this point in time, much of the case for the application of machine learning is based on reinforcing the first point of the list. But aside from its intrinsic relevance, it is, in my view, in the area of early detection and automation of decisions where machine learning has a great deal of potential to help boost BI and analytics to the next level. Of course this will occur most probably alongside other new information technologies in artificial intelligence and other fields.
 
 Many organizations that already have robust analytics infrastructures still need to take steps to incorporate machine learning within their existing BI and analytics platforms, for example by building it into their analytics strategies. But organizations that wish to leverage machine learning's potential may encounter some challenges:

  1. Applying machine learning is complex and requires a great deal of expertise. This in turn leads to the challenge of gaining the expertise to interpret the right patterns and attribute them to the right causes.
  2. There may be a shortage of people who can carry out a proper deployment; the challenge is to find the best people in this discipline.
  3. Because these are emerging technologies, some organizations still find it hard to measure the value of applying such advanced analytics disciplines, especially if they don't have sufficiently mature BI and Big Data analytics platforms.
  4. Vendors need to make these technologies increasingly suitable for the business world, easing both deployment and development processes.

Despite these challenges, there is little doubt that over time an increasing number of organizations will continue to implement machine learning techniques, all in order to enhance their analytics potential and consequently mature their analytics offerings.
 
 Some real-life use cases
 
 As we mentioned earlier, there are a number of cases where machine learning is being used to boost an organization's ability to satisfy analytics needs, especially for analytics applied to Big Data platforms. Following are a few examples of what some organizations are doing with machine learning applied to Big Data analytics, which, perhaps surprisingly, address not complex scientific projects but more business-oriented ones. These cases were taken from existing machine learning and Big Data analytics vendors, which we will describe in more detail in the next post of this series:
 
Improving and optimizing energy consumption

  • NV Energy, the electricity utility in northern Nevada, is now using software from Big Data analytics company BuildingIQ for an energy-efficiency pilot project using machine learning at its headquarters building in Las Vegas. The 270,000-square-foot building uses BuildingIQ to reduce energy consumption, feeding large data sets such as weather forecasts, energy costs and tariffs, and other data into proprietary algorithms that continuously optimize the building's energy use.

Optimizing revenue for online advertising

  • Adconion Media Group, a major media company with international reach, uses software from machine learning and Big Data analytics provider Skytree for ad arbitrage, improving predictions that find the best match between buyers and sellers of web advertising.

Finding the right partner

  • eHarmony, the well-known matchmaking site, uses advanced analytics provided by Skytree to find the best possible matches for prospective relationship seekers. Skytree's machine learning identifies the most promising matching scenarios for each customer, using profile data and website behavior along with specific algorithms (a simplified sketch follows this list).
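
As a purely illustrative sketch of profile-based matching, the snippet below scores candidate pairs by cosine similarity over numeric feature vectors built from profile attributes; the features, users, and scoring are invented for illustration and are not eHarmony's or Skytree's actual algorithms.

```python
# A toy matching sketch: represent each user as a numeric feature vector
# derived from profile data and behavior, then rank other users by similarity.

import numpy as np

def cosine_similarity(a, b):
    """Similarity between two feature vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical feature vectors: [outdoorsy, bookish, night_owl, sporty]
profiles = {
    "user_1": np.array([0.9, 0.2, 0.4, 0.8]),
    "user_2": np.array([0.8, 0.3, 0.5, 0.7]),
    "user_3": np.array([0.1, 0.9, 0.9, 0.2]),
}

def best_matches(user_id, top_n=2):
    """Rank all other users by similarity to the given user."""
    target = profiles[user_id]
    scores = [(other, cosine_similarity(target, vec))
              for other, vec in profiles.items() if other != user_id]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]

print(best_matches("user_1"))
```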

This is just a small sample of real use cases of machine learning in the context of Big Data analytics. There is new but fertile ground for machine learning to take root in and grow.
 
So what?
 
 Well, in the context of analytics, and specifically Big Data analytics, the application of machine learning has a lot of potential for boosting the use of analytics to higher levels and extending its use alongside other disciplines, such as artificial intelligence and cognitive computing. But these applications need to be approached with machine learning as an enabler and enhancer, and must be integrated within an organizational analytics strategy.
 
 As with other disciplines, the successful implementation of machine learning, and its evolution to higher stages, depends on how well an organization adapts it to its business needs, operations, and processes.
 
 One of the most interesting trends in analytics is its increasing pervasiveness and tighter relation with all levels of an organization. As the adoption of new features increases the power of analytics, it also closes the gap between two traditionally separate worlds within the IT space, the transactional and the non-transactional, enabling analytics to be consumed and used in ways that were unimaginable just a decade ago. The line between business operations and analysis is blurrier than ever, and disappearing. The new IT space will live within these colliding worlds, with analytics being performed at every level of an organization, from operations to strategy.
 
 In upcoming posts in this series, we will address the machine learning market landscape and look at some vendors that currently use machine learning to perform Big Data analytics. And we will go a step further, into the space of cognitive systems.
 
 In the meantime, please feel free to drop me a line with your comment. I’ll respond as soon as I can.


Originally published at dataofthings.blogspot.ca


Zyme: Emergence and Evolution of Channel Data Management Software

Courtesy of Zyme

Prior to the official launch of the new version of Zyme's solution, I had the opportunity to chat with and be briefed by Ashish Shete, VP of Products and Engineering at Zyme, regarding version 3.0 of what Zyme describes as its channel data management (CDM) solution platform.
 This conversation was noteworthy from both the software product and industry perspectives. In particular, the solution is relevant to an industry that needs software and technology solutions to help control, streamline, and improve the management of a fascinating and complex ecosystem called the distribution channel.
 Zyme aims to increase the efficiency of this ecosystem through its CDM platform.
 
 The distribution channel: a hidden monster
 According to the United Nations Conference on Trade and Development (UNCTAD):

Driven by favorable policies, technological innovation and business models bringing down the costs of cross-border transactions, international trade in goods and services added about 20 trillion US$ during the last 25 years, going from about 4 trillion US$ in 1990 to about 24 trillion US$ in 2014.

Global business is now “business as usual,” as it is the norm for a global economy. As manufacturers and service providers put goods on the market that are worth trillions of dollars, a huge infrastructure of distributors, resellers, retailers, and value-added resellers (VARs) — what we call the channel — is responsible for selling and moving them around the globe.
 As more goods and services reach new markets and new trade and commercialization models are created, the channel becomes an increasingly complex ecosystem that moves an immense flow of goods from many different places (see Figure 1 below).

Figure 1. A simple version of the channel (Image courtesy of Zyme)

As a result, manufacturers and service providers are finding it challenging to handle the increasing volume and diversity of data coming from the channel while still maintaining visibility into the channel and garnering insight into when and how their products and services are being sold and moved within it.

Simply managing this data is typically a complex and cumbersome task. This is because the data collected from the channel originates from different sources, and comes in different formats (text files, spreadsheets, via Open Database Connectivity (ODBC) connectors to third-party systems, etc.) and diverse structures (plain text, XML files, etc.). The challenge then is to find the most efficient way to collect, clean, organize, and consolidate this variety of data in order to gain visibility and insight from all these data points. 
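
As a rough sketch of what that consolidation step can look like in practice, the snippet below reads two hypothetical partner reports, one CSV and one XML, maps them to a common schema, and de-duplicates the result; the file names, column names, and target schema are assumptions for illustration and are not Zyme's actual data model.

```python
# A minimal sketch of consolidating channel data that arrives in different
# formats into one uniform table for later analysis.

import xml.etree.ElementTree as ET
import pandas as pd

TARGET_COLUMNS = ["partner", "sku", "quantity", "sale_date"]

def load_csv_report(path):
    """Load a distributor's spreadsheet-style report and map it to the target schema."""
    df = pd.read_csv(path)
    return df.rename(columns={"reseller": "partner", "product_code": "sku",
                              "qty": "quantity", "date": "sale_date"})[TARGET_COLUMNS]

def load_xml_report(path):
    """Load an XML point-of-sale feed and map it to the target schema."""
    root = ET.parse(path).getroot()
    rows = [{"partner": sale.findtext("partner"),
             "sku": sale.findtext("sku"),
             "quantity": int(sale.findtext("quantity")),
             "sale_date": sale.findtext("date")}
            for sale in root.iter("sale")]
    return pd.DataFrame(rows, columns=TARGET_COLUMNS)

# Consolidate, normalize dates, and drop exact duplicates before analysis.
combined = pd.concat([load_csv_report("distributor_a.csv"),
                      load_xml_report("retailer_b.xml")], ignore_index=True)
combined["sale_date"] = pd.to_datetime(combined["sale_date"])
combined = combined.drop_duplicates()
print(combined.head())
```
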
 Companies like Zyme offer CDM software solutions as a concrete means to address this challenge. But what is a CDM solution? Well, in the words of Zyme, CDM is:

a discipline concerned with the acquisition and use of data originating from the channel. It enables companies to significantly grow their business by offering transformative insights into the way business is conducted in the channel.

In other words, a CDM solution offers a series of tools that enable customers or users to efficiently manage the data coming from the channel. This includes the following:

  • Integration with third-party systems
  • Automated data collection
  • Data enrichment functionality
  • Advanced analytics and reporting capabilities

Zyme aims to achieve complete channel visibility through its cloud CDM platform, which collects the raw data originating from partners and pushes it to Zyme's proprietary technologies and content libraries, which then transform it into usable data for intelligence gathering. Once the data is ready, it can be processed and consumed for analysis and visualization through dashboards and/or specific third-party analytics systems.
 
The channel data management market has enormous potential for growth and evolution. And a company such as Zyme, with its combination of expertise and innovative technology, continues to develop this segment of the data management market.
 
Proof of this is Zyme's consistent growth; the company accounts for more than 70% of market share. Zyme expects to process more than $175 billion in channel revenues and more than 1 billion transactions this year, thanks to its roster of large customers, which includes Microsoft, VMware, and GE, to name just a few.

Figure 2. Zyme screen capture (Courtesy of Zyme)

Zyme adds power with version 3.0

On June 30th, Zyme announced the release of version 3.0 of its CDM solution. This release is in keeping with its mission to expand the platform and provide channel visibility to global enterprises. Zyme's new version has been enriched with several improvements, three of which are core to the new direction of the company:

  • The addition of zymeEcommerceSM to the platform. This new e-commerce offering gives companies more visibility into online shelf space, keeping track of metrics such as competitors' product positioning, pricing, and customer perception across e-commerce channels — and consequently delivering market intelligence.
  • The addition of the new zymeIncentives solution. This solution allows companies to perform incentives management, automatically calculating and validating rebates and credits earned by partners based on Zyme's existing decision-grade data (a simplified example follows this list). The solution can also communicate and facilitate incentive payments to channel partners quickly and seamlessly.
  • Zyme's approach to the Internet of Things (IoT), called zymeCDMSM. This enhances Zyme's existing functionality with capabilities for tracking connected devices down to individual serial numbers in real time. This in turn improves visibility into product movement, such as mapping out a product's complete route to a customer for a manufacturer, with the ultimate goal of closing the loop between manufacturers and end users.
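
As a purely hypothetical illustration of what automating such a calculation involves, the snippet below applies a tiered rebate rate to validated sell-through units per partner; the tiers, rates, and record layout are invented for this sketch and are not Zyme's actual incentive logic.

```python
# A simplified, hypothetical rebate calculation of the kind an incentives
# module might automate: validated sell-through units per partner are
# matched against rebate tiers and a per-unit rate is applied.

REBATE_TIERS = [      # (minimum units sold, rebate per unit in USD)
    (1000, 5.00),
    (500, 3.00),
    (0, 1.50),
]

def rebate_for(units_sold):
    """Return the total rebate earned for a partner's validated unit count."""
    for minimum, rate in REBATE_TIERS:
        if units_sold >= minimum:
            return round(units_sold * rate, 2)
    return 0.0

# Validated sell-through records, e.g. produced by a data consolidation step.
sales = [
    {"partner": "Distributor A", "units": 1200},
    {"partner": "Reseller B", "units": 640},
]

for record in sales:
    print(record["partner"], "earns a rebate of", rebate_for(record["units"]))
```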

In regard to its new version, Chandran Sankaran, Zyme's CEO, mentioned:

The Zyme cloud platform 3.0 makes our proprietary technologies and comprehensive content libraries, including more than 1.5 million channel partners and the largest directory of products and retailers, available to customers through a modern, scalable, SaaS platform. Global enterprises have immediate access to complete, accurate and timely data from resellers and distributors to unlock the enormous value that had previously been trapped in the channel due to inefficient and outdated reporting systems and processes.

On the customer side, Kevin Nusky, Director of Marketing and Sales Operations at Schneider Electric's IT Business Unit, had the following to say about Zyme's new release:

More than 65 percent of our sales go through a distribution system, so we can’t make informed business decisions without accurate data from channel partners. Zyme delivers unprecedented partner reporting accuracy, which has led to improved inventory management, reduced rebate overpayments, increased revenue through better partner development and accelerated channel growth and success.

Building on its core mission to deliver channel visibility to global enterprises, the company offers a targeted solution to that end. Zyme's cloud platform 3.0 aims to empower companies to obtain the maximum value from their channel sales.
 
Zyme in a blue sea
It appears that Zyme has found in channel data management a market with huge potential, where competitors are scarce and users are willing to consider these new types of software offerings. In this market, the IoT could provide companies like Zyme with the tools to improve the mechanisms driving complete channel visibility for their customers.
As with many other types of enterprise software applications, Zyme's success will depend on how efficiently it can integrate with the existing software stack (customer relationship management [CRM], enterprise resource planning [ERP], and other systems) in order to ensure data management agility and timeliness, as well as accurate visibility and natural interactivity with other business operations. Zyme appears to be on the right path to achieving these goals.


Originally published at www.jgdot.com


An Interview with Dataiku’s CEO: Florian Douetteau

Courtesy of Dataiku

As an increasing number of organizations look for ways to take their analytics platforms to higher ground, many are seriously considering the incorporation of new advanced analytics disciplines; this includes hiring data science specialists and adopting solutions that can deliver improved data analysis and insights. As a consequence, new companies and offerings are emerging in this area.

Dataiku is one of this new breed of companies. With its Data Science Studio (DSS) solution, Dataiku aims to offer a full data science platform for both experienced and non-experienced data science users.
 
 I recently had the chance to interview Florian Douetteau, Dataiku's CEO, and pick up some of his thoughts and interesting views regarding the data management industry and, of course, his company and software solution.
 
 A brief bio of Florian
 
In 2000, at age 20, he dropped out of the math program at the prestigious École Normale Supérieure and decided to look for the largest dataset he could find, and the hardest related problem he could solve.
 
 That's how he started working at Exalead, a search engine company that at the time was developing technologies in web mining, search, natural language processing (NLP), and distributed computing. At Exalead, Florian rose to be managing VP of Product and R&D. He stayed with the company until it was acquired in 2010 by Dassault Systèmes for $150M (a pretty large amount by French standards).
 
 In 2010, as the data deluge was pouring into new seas, Florian moved into the social gaming and online advertising industry, where machine learning was already being applied to petabytes of data. Between 2010 and 2013 he held several positions as consultant and CTO.
 
 In 2013, Florian, along with three other co-founders, created Dataiku with the goal of making advanced data technologies accessible to companies that are not digital giants. Since then, one of Florian's main goals as CEO of Dataiku has been to democratize access to data science.
 
 You can watch the video or listen to the podcast in which Florian shares some of his views on the fast evolution of data science, analytics, and big data, and, of course, his data science software solution.

Of course, please feel free to let us know your comments and questions.


Originally published at Data of Things on June 16, 2016.


Altiscale Delivers Improved Insight and Hindsight to Its Data Cloud Portfolio

Logo courtesy of Altiscale

Let me just say right off the bat that I consider Altiscale to be a really nice alternative to Big Data service providers such as Hortonworks, Cloudera, or MapR. The Palo Alto, California–based company offers a full cloud-based Big Data platform via its Altiscale Data Cloud offering. In my view, Altiscale has dramatically increased the appeal of its portfolio with the launch of the Altiscale Insight Cloud and a partnership with Tableau, which will bring enhanced versatility and power to Altiscale's set of Big Data services.
 
 The new Altiscale Insight Cloud
 
 On March 15th, Altiscale released its new Altiscale Insight Cloud solution. In the words of Altiscale, this is a “self-service analytics solution for Big Data.” Altiscale Insight Cloud aims to equip business analysts and information workers with the necessary tools for querying, analyzing, and getting answers from Big Data repositories using the tools that they are familiar with, such as Microsoft Excel and Tableau.
 
 According to the California-based company, with this new offering Altiscale will be able to provide its customers with a robust self-service tool and an accessible, easy-to-query data lake infrastructure. As such, companies will be able to avoid much of the complex and difficult preparation work involved in giving users easy and fast access to Big Data sources.
 
 To achieve simplicity and agility, Altiscale relies on having a converged architecture, so that on the one hand it can minimize the need for data movement and replication, especially across Big Data sources, and on the other hand, it can eliminate the need for separate relational data stores in order to reduce organizational costs and management efforts.
 
 According to Raymie Stata, chief executive officer (CEO) and founder of Altiscale, the Insight Cloud:

Solves the challenge of bringing Big Data to a broader range of users, so that enterprises can quickly develop new offerings, better target customers, and respond to shifting market or operational conditions. It’s a faster and easier way to get from Big Data infrastructure to insights that drive real business value.

Altiscale considers that its Insight Cloud will be able to replace many more complex and expensive alternatives, allowing organizations to get their hands on Big Data broadly and quickly, without heavy information technology (IT) involvement. As such, Altiscale Insight Cloud will have a significant impact on the speed and facility with which organizations will be able to access and analyze Big Data sources.
 
 As a high-performance, self-service analytics solution, some of the core features of the Altiscale Insight Cloud include:

  • interactive Structured Query Language (SQL) queries,
  • dynamic visualizations,
  • real-time dashboards, and
  • other reporting and analytics capabilities.

The big news is that with its Insight Cloud offering, Altiscale will be delivering not only a reliable Big Data platform but also an extension to its infrastructure that can simplify the connection between Big Data and the end user, which is currently a complex, slow, and expensive process for many organizations. This can also significantly reduce the need for expensive, proprietary solutions — not to mention that this new offering can give many business analysts easier and faster access to an organization's existing Hadoop data lake.
 
 Of course, organizations interested in this offering will need to consider a number of things, including Altiscale's ability to perform data preparation, cleaning, and profiling to ensure high-quality data. But without a doubt, this is a wise step from Altiscale: providing its customers with the next logical step in the Big Data infrastructure, which is the ability to perform fast and efficient analysis.
 
 Altiscale and Tableau: Business intelligent partnership?
 
 Within a few short weeks of the Altiscale Insight Cloud launch, Altiscale announced a partnership with data discovery and visualization powerhouse Tableau. The partnership with Tableau will, according to both vendors:

make it easier for business analysts, IT professionals, and data scientists to access, analyze, and visualize the massive volumes of data available in Hadoop.

Additionally, according to Dan Kogan, director of product marketing at Tableau:

Altiscale shares our mission to help people see and understand their data. Partnerships with leading Hadoop and Spark providers such as Altiscale help us to bring rich visual analytics to anyone within the enterprise looking to derive value from data.

Now users can use Tableau connected to the Altiscale Insight Cloud directly via Open Database Connectivity (ODBC), the standard application programming interface (API) for accessing database management systems (DBMSs). Once connected, Altiscale Insight Cloud will enable users to create visualizations and perform analysis similarly to working with other databases.
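
As a rough sketch of that ODBC access path, here is how an analyst's script (rather than Tableau) might query a Hadoop-backed SQL endpoint; the data source name, credentials, table, and columns are assumptions for illustration, and a real Altiscale Insight Cloud connection would use the driver and connection details supplied by the vendor.

```python
# A minimal sketch of querying a Hadoop-backed SQL endpoint over ODBC,
# the same access path Tableau uses for its connection.

import pyodbc

# Assumes an ODBC data source named "insight_cloud" has been configured
# in the local ODBC manager (e.g., via a Hive or Spark SQL ODBC driver).
conn = pyodbc.connect("DSN=insight_cloud;UID=analyst;PWD=secret")

query = """
    SELECT region, SUM(order_total) AS revenue
    FROM sales_events
    WHERE order_date >= '2016-01-01'
    GROUP BY region
    ORDER BY revenue DESC
"""

cursor = conn.cursor()
cursor.execute(query)
for region, revenue in cursor.fetchall():
    print(f"{region}: {revenue:,.2f}")

cursor.close()
conn.close()
```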
 
 Users will be able to use Tableau's easy drag-and-drop features to work with fields, filter data, analyze it, and derive insights, creating visualizations that can later be published to Tableau Server. Additionally, a noteworthy feature allows users to reuse intermediate solutions provided by Altiscale partners, so that data can first be aggregated and cataloged before visualizations are created with Tableau, providing extra flexibility and power to the Altiscale-Tableau connection.
 
 Of course, the first thing that stands out from this partnership is the opportunity for thousands of users on both ends, and from different disciplines, to use an appealing and easy-to-use tool such as Tableau on the one hand, and on the other to easily crack open the data residing in large and complex Hadoop repositories.
 
 This partnership shows how Big Data, analytics, and business intelligence (BI) providers are moving in an industry-wide manner to increasingly narrow the functional gaps between Big Data sources and their availability for analysis, while widening the number of options for incorporating Big Data within enterprise analytics strategies.
 
 While such a partnership is not at all surprising, it is relevant to the continuous evolution and maturity of new enterprise BI and analytics platforms.
 
 But what do you think? Of course, I look forward to hearing your comments and suggestions. Drop me a line, and I’ll respond as soon as possible.
 (Edited by my friends at Medit Global.)


Originally published at Data of things blog


Microsoft and the Revolution… Analytics

You say you want a revolution

Well, you know

We all want to change the world

You tell me that it’s evolution

Well, you know

We all want to change the world

(Revolution, Lennon & McCartney)

With a recent announcement, Microsoft took another of many steps toward what is now a clear internal and external revolution regarding the future of the company.
 
 By announcing the acquisition of Revolution Analytics, a company that in just a few years has become a leading provider of predictive analytics solutions, Microsoft looks not just to strengthen its already wide analytics portfolio but perhaps also to increase its presence in the open source and data science communities, the latter having huge future potential. An interesting move, no doubt, but was this acquisition one that Microsoft needed to boost its analytics strategy against its biggest competitors? Will this move really give Microsoft's revolution a better entrance to the open source space, especially within the data science community? Is Microsoft ready for open source, and vice versa?
 
 The Appeal of Revolution Analytics
 Without a doubt, Revolution Analytics is quite an interesting company. Founded less than 10 years ago (in 2007), it has become one of the most representative providers of predictive analytics software in the market. The formula has been, if not easy to achieve, simple and practical: Revolution R software has been created on top of the increasingly popular programming language called ‘R’.

As a programming language, R is designed especially for the development of statistical and predictive analytics applications. Because it emerged from the trenches of academia and because of its open source nature, it has grown and expanded into the business market along with a vibrant community that develops and maintains the Comprehensive R Archive Network (CRAN), R's extensive repository of packages.
 
 Revolution Analytics had the apparently simple yet pretty clever strategy of developing and enhancing its analytics platform on top of R in order to offer a debugged, commercial-ready R offering. It has also been clever in offering different flavors of its software, ranging from a free version to an enterprise-ready version.
 
 At the same time, Revolution Analytics has maintained close relations with both the R and open source communities and has developed a wide range of partnerships with important vendors such as Teradata, HP, IBM, and many others, increasing its market presence, adoption, and continued technical development.
 
 At first glance, of course, Revolution Analytics is quite an interesting bet, not just for Microsoft but for many other software providers eager to step big into the predictive analytics arena. But…
 
 Not so fast, Microsoft… Was it a good idea?
 In an article published recently on Forbes, Dan Woods states that Microsoft’s acquisition of Revolution Analytics is the wrong way to embrace R. He explains that the acquisition represents a step forward for the R language but will limit what R could bring to Microsoft’s own business. According to Mr. Woods:

It is vital to remember that R is not a piece of software created by software engineers. Like much of the open source world, R was created by those who wanted to use it — statisticians and data scientists. As a result, the architecture of the implementation has weaknesses that show up at scale and in other inconvenient ways. Fixing this architecture requires a major rewrite.

And,

While Microsoft will be able to make its Hadoop offering on Azure better with what Revolution has done, the open source model will inhibit the wider deployment of R throughout the rest of the Microsoft ecosystem.

Both points are absolutely valid, especially considering how the open source code would need to be accommodated within the Microsoft analytics portfolio. However, I would not be surprised if Microsoft had already taken this into account and had contemplated putting R on Azure as a short-term priority and the immersion of R into the rest of the portfolio as a medium-term priority, considering that it has acquired not just the software but also the expertise of the Revolution Analytics team. It will then be important to maintain cohesion on the team to pursue these major changes.
 
 Another interesting aspect is Mr. Woods' comparison of Microsoft's acquisition with TIBCO's approach, which took a radical posture and re-implemented R to make it suitable for high-performance tasks and highly compatible with TIBCO's complete set of analytics offerings, thus creating TERR.
 
 While TIBCO's approach is quite outstanding (it deserves its own post), it was somewhat more feasible for TIBCO due to its experience with Bell Labs' S, a precursor of and offering similar to R, and its longtime expertise in the predictive analytics field. Microsoft, on the contrary, needs to shorten the distance to IBM, SAS, and many others in order to enter the space with a strong foothold, one that R can certainly provide, and also to give the company some room to further develop an already stable product such as the one provided by Revolution Analytics.
 
 One thing to consider, though, is Microsoft's ability to enter and keep engaged a community that at times has proven to be hostile to the Seattle software giant and, of course, willing to turn its back on it. About this, David Smith, Chief Community Officer at Revolution Analytics, mentioned:

Microsoft might seem like a strange bedfellow for an open-source company, but the company continues to make great strides in the open-source arena recently. Microsoft has embraced Linux as a fully-supported operating system on its Azure cloud service.

While it's true that Microsoft has increased its presence in the open source community, whether by including Linux on Azure, contributing to the Linux kernel, or maintaining close partnerships with Hortonworks — one of big data's big names — convincing and winning over the huge R community could prove difficult, yet highly significant for increasing Microsoft's presence in a market with huge potential.

This is especially so considering that Microsoft has changed its strategy regarding its development platforms, opening them up to enable free development and community growth, as with .NET, Microsoft's now open source development platform.
 
 Embracing the revolution
 While the road to embracing R could be bumpy for Microsoft, it might still prove to be the way to go, if not the only one, toward a bright future in the predictive analytics market. Much work will perhaps need to be done, including rewriting and optimizing, but at the end of the day it might be a move that catapults Microsoft into better shape to compete in the predictive analytics market before it is too late.
 
 At this point, Microsoft seems to trust that the open source movement is mature enough to accept it as just another contributor, while Microsoft itself seems ready to take what appears to be a logical step to reposition itself in line with modern times, embracing new tech trends.
 
 As with any new relationship, adjustment and adaptation are needed. Microsoft's (R)evolution and transformation seems to be underway.
 Have a comment? Drop me a line below. I'll respond as soon as I can.


Originally published at dataofthings.blogspot.ca on February 15, 2016.
