Why a Degree in Data Science?
Learn to tell a story with data
One of the top things I’m always looking to improve is my story-telling. Leaders from top organizations around the world agree that it is an important leadership trait and one which needs a lot of practice to master. There are many books, TED talks, and articles that offer examples of how this trait can be used to sell your vision to your team members, your bosses and your customers. But how do you tell a story using data?
Due to the importance of using data in making business decisions we need to be able to connect sound statistical analysis with an actionable strategy. This means that we must first be able to communicate the results of a technical analysis to colleagues in a way that is easy to understand. However, this is not enough. Simply communicating results should satisfy the logic requirement of your argument but it’s unlikely to get anyone really excited about your vision.
The “So What”
Business managers, executives and entrepreneurs must invoke emotion from their teams, their customers and potential investors. They must sell the grand vision for their solution that is the reason they get up in the morning and do what they do. Even though not everyone will have the same passion about the product or service, they must believe in it. This is where good technical training meets strong interpersonal skills. Being able to communicate at many levels is so important.
When I started my graduate studies in Data Science, one of the primary aspects I wanted to enhance was my applied math/stats skills. I wanted to be able to break down a problem and learn how to apply the right models to come up with a meaningful quantitative analysis. I wanted to learn the tools of the trade to supplement my now sufficiently dated engineering and development background. Most importantly however, I wanted to be able to apply all of this to my role in building product strategy to succeed in our business Objectives and Key Results (OKRs).
“All models are wrong, but some are useful” — George Box
As I progress through the program I’ve come to appreciate the question frequently asked by one of my accomplished professors: simply, “So What?” It turns out this is a useful check when building models for statistical or predictive analysis, but I also find it helpful when assessing how to apply academic concepts to real-world business problems. And just like the famous quote from statistician George Box, “All models are wrong, but some are useful,” I find that the same cautious optimism helps when seeking ways to apply my learning to actual problems. There are many things I could try that will likely not produce meaningful results for the product or the company, but a few may turn out to be very useful.
Some have asked me, what are some of the technical concepts I’ve learned in the Data Science program that have been immediately useful? I always break this down into two aspects. The first aspect is the surface-level knowledge and the tools of the trade. The second is the problem solving approach and knowing which questions should be analyzed.
1. Technical Knowledge & Tools of the Trade:
It is important to study the mathematical concepts and sound statistical methods that form the foundation of proper data analysis. It is important to review linear algebra, calculus and other concepts to keep knowledge of the core competencies fresh. It is also important for any technical practitioner of data science to fully understand the various types of data, the importance of normality in data distributions, variance, and the effects of outliers in various models, to name a few. We need to build at least a surface-level knowledge of the multitude of formulas and methods for cleaning, transforming and visualizing data. We also need to establish a library of resources, both digital and print, that we can refer back to as we encounter new problems in the future.
It is important to build our skills in using computational resources. Yes, we must be programmers. We must learn how to put tools like Excel, Tableau, R, and Python to use in acquiring, cleaning, manipulating, calculating, optimizing and visualizing the data. Make no mistake, this takes time and practice! Like learning any spoken language or becoming proficient in any programming language, we must practice! In the end, this gives us the valuable skill of performing our own analysis without relying on others to carry out the calculations for us.
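As a small illustration of the point about outliers, here is a minimal Python sketch using only the standard library. The order counts are made-up numbers and `summarize` is a hypothetical helper written for this example; it is not taken from any course material. It shows how a single extreme value pulls the mean and standard deviation far more than it moves the median.

```python
import statistics

# Hypothetical daily order counts (made-up data for illustration).
orders = [20, 22, 19, 21, 23, 20, 22]
# The same data with one made-up extreme value appended.
orders_with_outlier = orders + [200]


def summarize(values):
    """Return the mean, median and sample standard deviation of values."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


clean = summarize(orders)
skewed = summarize(orders_with_outlier)

print("without outlier:", clean)
print("with outlier:   ", skewed)
```

With these numbers the mean jumps from 21 to roughly 43.4, while the median only moves from 21 to 21.5. That gap is one quick hint that a summary statistic, or a model built on it, may be driven by a handful of unusual records.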
2. Deeper Knowledge of the Problem Solving Approach:
The second major part that needs to be learned is the process and approach to solving data problems. This is of course the more important one. This is where we learn how to apply the concepts to answer real-world questions and solve actual problems. This is the true value that data practitioners can provide to organizations. Regardless of role or position in the company, from entry-level data analysts to data-minded executives, those who can efficiently apply the methods to create useful solutions will be able to make a difference. This is about finding meaningful problems to solve given the information available. At its core this involves combining available information in novel and unique ways to create new insights or solutions. After all, isn’t that precisely the nature of intelligence? Again, this process takes practice, and it is a primary goal of mine to achieve a level of mastery in this domain. It is really impressive to see an experienced professional apply their process to a problem and clearly communicate their analysis with data and visuals.
Applying exploratory analysis techniques, choosing the right model and describing an actionable recommendation is an art form in the truest sense. It sometimes involves knowing when the problem is not worth addressing or when the data is not sufficient to support the original goal of the project. One of the reasons this topic interests me is its parallels with the engineering mindset. Both require an analytical mindset, a propensity for problem solving and an affinity for computational or algorithmic thinking. Engineering degrees are not supposed to simply teach you math and engineering concepts; they should teach you how to solve problems. In this way, Data Science degrees are in the same camp as engineering.
If mathematics has been the universal language of the past, then I believe structured computational thinking will be the universal language of the future. This means not just learning how to write basic code, but also building an understanding of how to perform tasks in a programmatic and replicable form. After all, mathematics and computational thinking are both just formalized representations of abstract concepts. While coding languages, frameworks and platforms may not be standardized, the ways in which applications are built to make computers perform tasks are all remarkably similar. Data structures, command execution control, searching and sorting, memory management, computation algorithms, object-oriented design, exception handling, and permissions are just some of the many standard concepts that are required. Some of these concepts are made easier by high-level languages, and not all of them are required for basic coding, especially the coding required in data science. However, the practice of breaking down a problem into smaller pieces and forming a programmatic way to solve it is a fundamental skill nonetheless. Learning this skill is one important part of any technology-focused graduate degree.
While the nature of higher education may be changing in favor of more specialized degrees and online courses, accredited credentials of higher learning are still highly regarded. Completing courses from established institutions and learning from some of the best professionals in the field contributes to that credibility. The commitment and time required to finish a certificate or degree demonstrate personal interest, long-term investment and dedication to the craft. What’s more, the opportunity to collaborate with and learn from other like-minded students from all types of backgrounds may be worth the effort alone. I am still amazed by the diversity of students in our graduate program; it seems we have the full spectrum of interests represented. There are students coming from backgrounds in business, finance, non-profits, legal, IT, engineering, education, and of course computer science. Some are early in their careers, some are reinventing their careers midway and some already have advanced degrees in other fields. These professional contacts and friendships will be a blessing in my future endeavors without a doubt.
So how will all this turn out? I don’t know for sure what the hallmark of my current graduate studies will be in terms of future direction, opportunity or impact; however, I am optimistic. At nearly halfway through the program I am beginning to “get it” when reading about the types of problems that can be solved using a variety of data science techniques and how to apply them. I am beginning to understand what contributes to a compelling data-driven business proposition or a worthwhile personal data project. There is still much to learn, but I firmly believe that telling a story in an engaging and actionable way relies on data. Organizations globally are looking for compelling solutions to save cost, increase revenue or find entirely new opportunities using their ever-increasing access to data and the changes it can reveal. My work must include compelling proposals that leverage my newly minted expertise and are supported by the numbers and sound analyses. When it comes to data and programming in just about any industry, you will either be “in the know”, or you will be left behind.