Preface: Before you spend the next few minutes reading this post only to find out it’s not a tutorial, I'll tell you now: it is not! However, if you are a somewhat ‘non-technical’ person, or even somewhere in management, and need a good read on ‘What’ you can do with Data Science, come on in and welcome to our world.
Defining Data Science
I am often asked by students and colleagues alike to give my best definition of Data Science. You’d think that after working for years as a Senior Data Scientist and mentoring hundreds of professional development students in this area, this would be an easy task for me, but the more I think about it, the more complex my answer gets, and I think this is the central issue behind a great deal of confusion in the industry.
In my opinion, Data Science shouldn’t be defined around what it is, but more so around what we do with it.
In doing this, one can easily ascertain that there are many different types of Data Scientists in many different roles and, while the point could be made that data is central to all of these, I’d argue that is not the case. I’d argue value is the common denominator among all the different roles in Data Science.
Everyone and Every Role Should Bring Value
Bringing value to a company is what we all do in each and every role we play within an organization. Data Scientists are no different, except that we work in a different medium: data of all shapes, sizes, and forms, which has ushered in many new supporting roles as well.
Data Translators, Data Engineers, ML Engineers, AI Engineers and a whole host of other new and cool sounding job titles are currently rushing in to fill the void our newfound ability to analyze larger and new(er) types of data has created.
Let’s Look at a Typical Example of ‘How’ This Value is Obtained
Here at Hashmap, we often have clients that come to us with large amounts of data and simply no starting point from which to gain value from it. One of the common areas we work on with many of our clients is better understanding their customers, or creating a 360° view of the people that they value most.
The solutions we use most often here at Hashmap for Customer 360 projects are Snowflake, for fast, efficient, and cost-effective Cloud Data Warehousing (available in AWS, Azure, and soon to be GCP), and Databricks, a unified analytics platform available in both Azure and AWS.
We are a data consulting firm at our core with Data Science and AI verticals and we often use both of these solutions together because they meet our scalability and Data Science requirements.
Simply put: the more a business understands its customers, the better it can serve them.
We use Data Analytics, Machine Learning, and Big Data to satisfy many different areas when building our Customer 360 solutions, but most are centered around four main areas:
- Understanding customers, i.e. who is our typical customer?
- Predicting the performance or sales of an item or set of items.
- Identifying previously unknown cross-selling opportunities.
- Building “what if” scenarios for our clients that provide more detail on the best and most probable places to spend their valuable dollars with respect to return.
For the purposes of this post, I’ve simplified and combined all of these areas into Building Customer Profiles.
What’s In a Customer Profile?
Customers are the lifeblood of any business, and understanding everything we can about the ones we serve allows us not only to provide better services but also to build brand loyalty and even streamline supply chains. Simple demographics from historical purchases can provide a wealth of information, such as purchasing behavior and brand preference, and can even paint a much better picture of the specific reasons why customers may or may not buy products or services.
For example, we can pull apart (that is scientific speak for analyzing) simple datasets containing over 500,000 purchases along with simple demographics like sex, age, zip code, etc. and other variables that are typically gathered by having customers sign up for loyalty cards and other marketing vehicles. Often, we also know product details like price, features and sometimes even the other products that were bought during the transaction.
From this simple data, we can use Data Science to build a Customer 360 Purchasing Profile to provide a better understanding of who purchased what and when.
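To make the idea of a purchasing profile a little more concrete, here is a minimal sketch in Python. It is not the Hashmap solution: the records, field names, and the `age_band` and `build_profiles` helpers are all hypothetical, and a real project would run this kind of summary over hundreds of thousands of rows in a data warehouse rather than over a four-item list.

```python
from collections import Counter, defaultdict

# Hypothetical purchase records; field names and values are illustrative only.
purchases = [
    {"customer_id": 1, "age": 34, "zip": "78701", "product": "coffee", "price": 12.0},
    {"customer_id": 1, "age": 34, "zip": "78701", "product": "filters", "price": 4.0},
    {"customer_id": 2, "age": 52, "zip": "08901", "product": "tea", "price": 8.0},
    {"customer_id": 3, "age": 38, "zip": "78701", "product": "coffee", "price": 12.0},
]


def age_band(age):
    """Bucket an age into a decade-wide band, e.g. 34 -> '30-39'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def build_profiles(records):
    """Summarise purchases per age band: purchase count, total spend, top product."""
    spend = defaultdict(float)
    counts = defaultdict(int)
    products = defaultdict(Counter)
    for record in records:
        band = age_band(record["age"])
        spend[band] += record["price"]
        counts[band] += 1
        products[band][record["product"]] += 1
    return {
        band: {
            "purchases": counts[band],
            "total_spend": spend[band],
            "top_product": products[band].most_common(1)[0][0],
        }
        for band in counts
    }


profiles = build_profiles(purchases)
for band, summary in sorted(profiles.items()):
    print(band, summary)
```

Grouping by age band is only one possible cut. The same pattern works for zip code, sex, or any other demographic captured through a loyalty program.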
But how is this helpful if we already know it from the historical data?
Achieving Value with Data Science
Data Science methods allow us to uncover previously unknown relationships between customer-related variables. These relationships not only help us better understand who bought what, but can also surface correlations that paint a better picture of why. Explaining causation is not the ultimate goal of Data Science, though; as we already stated, value is.
Better yet, we can use these newly learned rules to ‘model’ our customers’ purchasing behavior and predict not only what they might buy next, but also what else they might purchase along with it. We can also better understand and model how well our marketing efforts are working within individual areas of concentration or spending. What does this mean? It means we can ‘model’ the performance of the individual areas where companies spend their dollars during a process.
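As a rough illustration of the "what else might they purchase along with it" idea, the sketch below counts how often products appear together in the same transaction. This is plain co-occurrence counting, a deliberately simple stand-in for the association-rule or Machine Learning models a real project might use, and the baskets and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set holds the products bought together.
baskets = [
    {"coffee", "filters", "mugs"},
    {"coffee", "filters"},
    {"coffee", "filters"},
    {"tea", "mugs"},
    {"coffee", "mugs"},
]


def co_purchase_counts(baskets):
    """Count how many baskets contain each unordered pair of products."""
    pair_counts = Counter()
    for basket in baskets:
        # Sorting gives every pair a single canonical (a, b) ordering.
        for a, b in combinations(sorted(basket), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def also_bought(product, baskets, top_n=2):
    """Rank other products by how often they share a basket with `product`."""
    scores = Counter()
    for (a, b), count in co_purchase_counts(baskets).items():
        if a == product:
            scores[b] += count
        elif b == product:
            scores[a] += count
    return scores.most_common(top_n)


print(also_bought("coffee", baskets))
```

Raw counts favor popular products, so in practice they would usually be normalized (for example by how often each product sells on its own) before being used for cross-selling recommendations.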
This is where Data Science really brings value: it allows us to better understand the past and apply what we’ve learned to the future in ways that are faster, more productive, more reliable, and more reproducible than the methods we relied on before.
Lastly, we are learning things about business and processes that we previously simply didn't have the capability of analyzing and in many cases, didn’t even have the data we needed to do so.
Rapid advancements in the cloud take the infrastructure effort out of the equation and provide a simpler, more streamlined approach to working with technology across a number of areas, including diverse datasets, elastic compute, low-cost storage, data sharing, ML/AI packaging, and easier-to-use visualization tools. These advancements are making discoveries easier to understand and easier to validate than at any time prior, and we have only seen the beginning.
The more data that is produced, the more impactful Data Science and its added value become when applied to Customer 360 challenges.
Upcoming Webinar: Implementing a Customer 360 Solution with Snowflake & JupyterHub
If you’d like to learn more about how we are applying cloud-native solutions to the Customer 360 challenge, please check out this upcoming webinar where we will discuss a reference architecture for Customer 360 using Snowflake Cloud Data Warehouse and JupyterHub:
Feel free to share on other channels and be sure to keep up with all new content from Hashmap on Medium; you can follow us here.
Dr. Benjamin Manning is the Lead Data Scientist at Hashmap specializing in growth across all industries and partners served by Hashmap. He has been a Machine Learning and Data Analytics consultant for seventeen years and specializes in engineering disciplines such as Solar Energy, Oil and Gas, and IoT/IIoT Systems.
Ben is also the Senior Data Science Mentor for Experiential Teaching Online and teaches Data Science and Cybersecurity online for the University of Texas-Austin and Rutgers University.