
Lessons learnt in Silicon Valley

Introduction

Sharad Jain · Jul 1, 2018

When I got an opportunity to study analytics at the heart of Silicon Valley, I quickly grabbed the chance to bridge the gap between my technical understanding and my business knowledge of technology. I was part of the charter MSBA program at UC Davis, co-founded by Hemant Bhargava, which required us to partner with companies based out of San Francisco on an industry practicum project running the full length of the program.

My Favorite TV Series : Silicon Valley

Today I am jotting down my overall experience of studying for the MSBA and the lessons I learnt the hard way after coming to San Francisco, both inside the classroom and outside it.

We kicked off our program with a practicum project at AutoDesk, which is pivoting its business model from licensed software to subscriptions. Like every other cloud company, the success of a subscription-based business (think Netflix) relies heavily on the robustness of its cloud infrastructure, and one of the few ways to achieve this is to reduce the number of failures occurring in the cloud.

Events, faults, successes, failures, and other activities are logged by the cloud system in the form of log data. Analyzing this kind of data not only reveals a lot about what happened in the past but can also help us predict the future state of the system.

The overarching objective of our project was to extract intelligence from the cloud log data and put it into one dashboard, giving AutoDesk a single-lens view of the overall cloud environment. This increases real-time visibility into each component of the cloud infrastructure, helping the team fix issues in real time and forestall failures that might occur in the future.

The project was fairly complex, dealing with large chunks of unstructured log files that required big-data processing tools such as AWS EMR (Elastic MapReduce) and Spark. We also used machine learning to find patterns in the sequential log data by training models on historical data, which allowed us to predict the nature of unseen logs. These metrics were then piped out to a dashboard so business stakeholders could take decisions in real time.
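
To make this concrete, here is a minimal PySpark sketch of the kind of processing a job like ours might run on EMR. The log format, field names, and S3 paths below are invented for illustration; the real pipeline parsed AutoDesk's own log structure.

```
# Minimal sketch: parse semi-structured log lines and aggregate failures.
# Log format, regex, and bucket paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("cloud-log-processing").getOrCreate()

# Raw log lines landed in S3 (hypothetical bucket/prefix).
raw = spark.read.text("s3://example-bucket/cloud-logs/2018/06/*")

# Pull out a timestamp, job id, and status with a regex
# (the pattern depends entirely on the actual log format).
pattern = r"^(\S+ \S+) job=(\S+) status=(\S+)"
logs = raw.select(
    F.regexp_extract("value", pattern, 1).alias("timestamp"),
    F.regexp_extract("value", pattern, 2).alias("job_id"),
    F.regexp_extract("value", pattern, 3).alias("status"),
)

# One example dashboard metric: failure counts per job.
failures = (
    logs.filter(F.col("status") == "FAILED")
        .groupBy("job_id")
        .count()
        .withColumnRenamed("count", "failure_count")
)

failures.write.mode("overwrite").parquet("s3://example-bucket/metrics/failures/")
```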

Because of the complexity of the project, there were unique learning opportunities, ranging from technical skills to telling the story of complex information in a simple yet comprehensive dashboard.

“Essentially, All models are wrong but some are useful” — George Box

Learning one : Value, not models

When I kick-started the MSBA program with UC Davis’ bookstore analytics competition, we were given a dataset and asked to predict the bookstore’s sales. Most of the students used complex analyses, for example comparing Amazon prices for the same books via web scraping. A few others used the most advanced models they could find to improve accuracy. Like most of the students, I also focused mainly on accuracy and model selection, which resulted in a big failure.
Though I was able to build a random forest model (with 94% accuracy) for forecasting demand, I failed to communicate the business insights that stakeholders (here, the UC Davis Bookstore) need in order to make decisions on top of the models. What I failed to ask was: given the accuracy of the models, what can the bookstore actually do with them? Which variables can it control?
In most analytics projects, the business stakeholders are looking for the value we bring to the table, using whichever of the methodologies available under the sun gets us there. They are little concerned with the accuracy of the models or whether you used the most cutting-edge machine learning to predict the outcome. I am glad I learned this early in my MSBA journey.

Learning two : Asking the right Questions
To maximize the value you add for the client, you need to take a step back, see the bigger picture, and imagine yourself as the user. For our project we came up with the following (non-exhaustive) questions:
“What kind of value are you trying to provide?”
“Are you saving time? Reducing cost? Increasing revenue?”
“How will the stakeholders benefit from that value?”
“How can the solution stimulate the stakeholders to ask the right questions?”
“Can the solution be implemented right now?”

Once we have answered these questions, we can tailor our solution effectively.

Learning three : Maximizing value, not just predictions but also recommendations

In our case, we forecasted different states of the system by predicting the failure or success of different logs using machine learning. The accuracy of a machine learning model is valuable from a technical point of view, but from a business perspective, accuracy alone doesn’t provide any value.

We talked iteratively with the MIP to understand the complete life-cycle of error handling, which allowed us to figure out how we could reduce the number of steps needed to resolve an error. Following this, we came up with the idea of recommending next steps alongside predicting the errors.

Forecasting the state of the system a little ahead of time adds value in itself, because the necessary steps can be taken to avoid the error. It is even better if we can pair that early prediction with the right recommendations to assist in debugging such errors. These recommendations can be produced as tags associated with that particular type of error, allowing an engineer to quickly raise a ServiceNow ticket; fewer steps to resolution translates into lower cost.
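
As a rough illustration of this idea, the sketch below pairs a predicted error category with recommendation tags. The categories, tag names, and classifier are all hypothetical; the real model was trained on historical log data.

```
# Minimal sketch: attach recommended next-step tags to a predicted error.
# Categories, tags, and the classifier interface are hypothetical.
RECOMMENDED_TAGS = {
    "timeout":       ["check-upstream-latency", "raise-servicenow-ticket"],
    "out_of_memory": ["increase-executor-memory", "raise-servicenow-ticket"],
    "permission":    ["verify-iam-role", "raise-servicenow-ticket"],
}

def recommend_next_steps(log_features, classifier):
    """Predict the likely error category for a job and attach
    the tags an engineer would use to act on it."""
    category = classifier.predict([log_features])[0]   # e.g. "timeout"
    return {
        "predicted_error": category,
        "recommended_tags": RECOMMENDED_TAGS.get(category, ["manual-triage"]),
    }
```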

Learning four : Dash-boarding

One of my key learnings from the last phase of the project came from storytelling and dash-boarding, as I owned these two parts completely.

Showing an insight is surprisingly easy with the right analysis and intention, but wrapping the same analysis in a persuasive story that the client can connect with is a completely different game.

Dash-boarding and storytelling essentially mean telling a story about the facts and insights discovered during the analysis. The two parts of this process are dashboards and slides.

The first part starts with intuitively wire-framing the dashboard, followed by preparing the datasets. Once the datasets are ready, metrics are created and represented as visualizations.

We had multiple dashboarding tools to choose from and hit the ground running, but we were primarily constrained by the streaming nature of the data, which knocked Tableau off the list. Since AutoDesk’s whole cloud infrastructure sits on AWS, QuickSight made a lot of sense; it also connects easily to data landing in S3. Hence we selected QuickSight as our primary dashboarding tool.
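
As a rough sketch of that plumbing, the snippet below pushes an aggregated metrics file to S3 together with the manifest file that QuickSight uses to locate S3 data. The bucket name and file paths are hypothetical, and the manifest follows the standard QuickSight S3 manifest layout.

```
# Minimal sketch: publish processed metrics to S3 for QuickSight to read.
# Bucket, keys, and file names are hypothetical.
import json
import boto3

s3 = boto3.client("s3")
bucket = "example-metrics-bucket"

# Upload the aggregated metrics produced by the Spark job (CSV here).
s3.upload_file("failed_jobs_by_category.csv", bucket,
               "dashboard/failed_jobs_by_category.csv")

# QuickSight reads S3 data through a manifest that points at the files.
manifest = {
    "fileLocations": [
        {"URIPrefixes": [f"s3://{bucket}/dashboard/"]}
    ],
    "globalUploadSettings": {"format": "CSV", "containsHeader": "true"},
}
s3.put_object(Bucket=bucket, Key="dashboard/manifest.json",
              Body=json.dumps(manifest).encode("utf-8"))
```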

Designing Dashboard

A poorly designed dashboard can fail to convey the right story and can prevent the value from coming across in a comprehensible way. To avoid this we followed four basic principles of dashboard design:

  1. Reducing Visual Clutter :
    The most valuable questions should be answerable in less than 10 seconds (a useful rule of thumb). Each chart should convey the most information in the least amount of space.
  2. Simple Design :
    Generally we have a tendency to provide as much information as possible, covering every metric related to every challenge the business is facing. The second principle says the opposite: the dashboard should be as simple as possible, which can be achieved by displaying as few metrics as possible. These few metrics can be selected by prioritizing the key objectives of the business.
  3. Inverted Pyramid
    Present the information in the form of an inverted pyramid[1], with the most important (read: big-picture) information at the top, followed by less important information as you go down the board.
    In our case the total number of failed jobs came at the top, followed by the error categorization of those failed jobs, followed by the average wait time in each of these categories.
  4. Choosing Visualization
    Selecting the right chart for the data is also critical. If we are presenting the relationship between two variables, a scatter plot, for example, is a natural choice; if we are presenting one variable’s share of the whole, a pie chart is a better alternative.

Selecting Metrics

Metrics can be selected by mapping the most important objectives of the business to the right measures to put on the dashboard.
In our case, the average run-time of failed jobs tells us how many resources are going down the drain. A job that failed after running for 4 hours is costlier than a job that failed after only an hour. Hence the average run-time of failed jobs allows decision makers to dig deeper into the system and ask the right questions about why jobs are failing.
There are different ways to approach debugging these errors, and they depend mainly on the type (category) of error. Hence the second most important metric is the average run-time of failed jobs by category.
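
To make these two metrics concrete, here is a small pandas illustration on a made-up jobs table; the column names, categories, and values are purely hypothetical.

```
# Minimal sketch of the two metrics described above, on hypothetical data.
import pandas as pd

jobs = pd.DataFrame({
    "job_id": [1, 2, 3, 4, 5],
    "status": ["FAILED", "FAILED", "SUCCEEDED", "FAILED", "FAILED"],
    "error_category": ["timeout", "out_of_memory", None, "timeout", "permission"],
    "runtime_hours": [4.0, 1.0, 2.5, 3.0, 0.5],
})

failed = jobs[jobs["status"] == "FAILED"]

# Metric 1: average run-time of failed jobs (resources going down the drain).
avg_runtime_failed = failed["runtime_hours"].mean()

# Metric 2: average run-time of failed jobs broken down by error category.
avg_runtime_by_category = (
    failed.groupby("error_category")["runtime_hours"]
          .mean()
          .sort_values(ascending=False)
)

print(avg_runtime_failed)
print(avg_runtime_by_category)
```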

This kind of prioritization and selection requires continuous engagement with the stakeholders to assess the department’s business objectives along with the overarching objectives of the whole organisation.

Conclusion

Through this project I have learnt skills that I can leverage to effectively bridge the gap between the technical and business aspects of most analytics projects.

References :

[1] https://www.sisense.com/blog/4-design-principles-creating-better-dashboards/
