How to Build a Data Science Department

Jim Sagar
Published in REHINGED.AI
5 min read · Apr 30, 2019

Data science is a hot field right now, so hot that we’re in a job bubble for data science positions. Salaries are escalating, and anybody remotely associated with data, from database administrators (DBAs) and analysts to business intelligence specialists and engineers, is adopting the label “data scientist.”

Data scientists can be incredibly valuable when used properly, but that isn’t the case in most companies; Gartner estimates that 85% of big data projects fail.

This is because most fields of data science are moving from what Gartner calls the “Innovation Trigger” phase, when industry interest builds rapidly, to a “Peak of Inflated Expectations.” According to Gartner, “During this phase of over enthusiasm and unrealistic projections, a flurry of well-publicized activity by technology leaders results in some successes, but more failures, as the technology is pushed to its limits. The only enterprises making money are conference organizers and magazine publishers.”

[Figure: Gartner Hype Cycle]

Why Many Data Science Projects Fail

Forward-thinking business leaders are willing to invest in new technologies and approaches, but at some point they must show a return to justify continued investment. The numbers behind many corporate investments in early 2019 paint a bleak picture:

Most corporate data science initiatives will fail and will become one of the worst investments a company can make.

One reason for failure is that data science is evolving too fast for top-down organizations to keep up. The tools, datasets, and relevant skill sets of data scientists are changing every 12 months.

Another reason these projects don’t succeed is that they are directed from the top down rather than originating with the data scientists. Organizational hierarchy can cause data scientists to spend months on projects that fail to yield a discernible outcome.

How to Make Your Data Science Succeed

For companies to beat the odds and succeed, we recommend the following four-step approach.

1. Define the end goal.

Data science can be a valuable tool if used properly. But instead of jumping on the bandwagon and adding data scientists to your organization, first determine how you plan to use data science to answer questions about your business. For example, do you already have large data sets of customer activity that could yield a deeper understanding of your customer interactions if properly analyzed by a data scientist? Do you wish to build predictive models? Or create algorithms to classify and rank data?

2. Design the department before making your first hire.

Some data scientists are excellent hires for specific types of work but not a great fit for other types of data science work. For example, a data collection specialist probably wouldn’t have the right skill set for an applied machine learning engineer role. So define the skills and experience you need for each role in your department and write a job description for each. Have these reviewed by a knowledgeable data scientist.

3. Have a solid hiring and evaluation plan.

As more developers and business intelligence analysts move into the data science field, it becomes more difficult to distinguish beginners from deeply skilled data scientists. Design an evaluation process for determining the true skills of each applicant. Use testing and specific technical questions when needed, and don’t rely on years of experience or accreditations.

For example, a trained physicist who learned SQL independently but hasn’t applied it in a business setting might be far more skilled than a data scientist who has deployed point-and-click models on the job for the past three years. The latter may not understand the underlying mathematics of the models and may be unable to deploy anything other than specific algorithms known to fit a specific use case. You can avoid hiring the wrong candidate by diving into the specifics. Consider hiring a data science consultant to assist you with the design of the questions and the candidate evaluation.
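One way to make such an evaluation process concrete is a weighted scoring rubric. The Python sketch below is only an illustration: the skill names, weights, 0 to 5 rating scale, and candidate ratings are all hypothetical assumptions, not a recommended standard. Years of experience and accreditations are deliberately left out of the rubric.

```python
# Hypothetical rubric: skill names and weights are illustrative assumptions.
# Years of experience and accreditations are intentionally not scored.
WEIGHTS = {
    "statistics_fundamentals": 0.35,
    "model_understanding": 0.30,
    "data_wrangling": 0.20,
    "communication": 0.15,
}


def weighted_score(ratings, weights=WEIGHTS):
    """Combine per-skill interview ratings (0 to 5) into one weighted score."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(weights[skill] * ratings[skill] for skill in weights)


# Made-up ratings for two made-up candidates, loosely following the example above.
candidates = {
    "physicist_self_taught_sql": {
        "statistics_fundamentals": 5,
        "model_understanding": 5,
        "data_wrangling": 3,
        "communication": 3,
    },
    "point_and_click_modeler": {
        "statistics_fundamentals": 2,
        "model_understanding": 1,
        "data_wrangling": 4,
        "communication": 4,
    },
}

# Rank candidates from highest to lowest weighted score.
ranked = sorted(
    candidates,
    key=lambda name: weighted_score(candidates[name]),
    reverse=True,
)

for name in ranked:
    print(f"{name}: {weighted_score(candidates[name]):.2f}")
```

The ratings themselves would come from your tests and technical questions; the rubric only keeps the comparison consistent from one applicant to the next.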

4. Commit to data engineering.

If you don’t already have massive sets of data, data engineering will likely consume more than 80% of your department’s time. What data will you need over the next 12 months? Building a solid data architecture can take a lot of work, from finding the data sources to defining the data pipelines and setting up the flows. Investing in this upfront can give you a greater return on your work in subsequent years.

What Makes a Good Data Scientist?

Put yourself in a new hiring manager’s shoes: you’ve done a bunch of reading and decided you need data mining, statistics, and machine learning skills for your project. You have the budget to hire three people and there are ten resumes on your desk with “data scientist” on them.

In his book, Building Data Science Teams, DJ Patil asks the question you are likely asking yourself in this moment: “What makes a good data scientist?” Here’s what Patil suggests:

• Technical expertise: the best data scientists typically have deep expertise in some scientific discipline.

• Curiosity: a desire to go beneath the surface and discover and distill a problem down into a very clear set of hypotheses that can be tested.

• Storytelling: the ability to use data to tell a story and to be able to communicate it effectively.

• Cleverness: the ability to look at a problem in different, creative ways.

Let’s think about the roles you’re looking to fill, starting with your data mining role. You’re not looking for a candidate here who is meticulous about precision and cautious decisions. You want a speed demon who will make sense of as much data as possible as quickly as possible. Among your data scientist applicants, look for the ones who are experienced at producing fast insights from large amounts of data.

Next, consider your statistician. This role will keep the big picture in mind while helping your team draw safe conclusions and make realistic decisions. This candidate should be pragmatic and practical, prepared to be a voice of reason when it comes to making decisions that reach beyond your team’s datasets. Look for a core skill set and previous experience that are relevant to the goals of your organization.

Finally, the machine learning expert you hire will need to be skilled at using algorithms, not building them. Their personality and background should reflect patience with trial and error, including a strong stomach for failure. You’re looking for someone who is experienced with A/B testing and wrangling algorithms to work with your datasets until their iteration eventually leads to solutions.
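When probing a candidate’s A/B testing experience, it can help to have a small concrete example on hand. The sketch below is one common approach, a two-proportion z-test, written with only the Python standard library. The function name and the conversion counts are hypothetical, and a real analysis would also need to consider sample-size planning and multiple comparisons.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / conv_b are conversion counts; n_a / n_b are visitor counts.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; rates are all 0% or all 100%")
    z = (p_b - p_a) / se
    # Standard normal CDF evaluated at |z|, via the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Made-up numbers: variant A converts 200 of 5000 visitors, variant B 250 of 5000.
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A candidate with real experience should be able to explain what the p-value does and does not tell you, and when a test like this is the wrong tool.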

Your Successful Data Science Department

Don’t assume that data scientists need a background in computer science. In fact, data scientists come from a wide variety of backgrounds, from neuroscience to marketing. What they have in common, however, is a deep curiosity which drives them to perform data-intensive work. They are driven to understand a business, industry, or technology in a new way.

When building your data science department, beyond determining the roles you need and the right candidates to bring on board, it’s important to ensure that the department has strong support from the rest of your organization. Ultimately, your data scientists won’t succeed simply because they’re qualified for the job you’ve hired them for; they’ll succeed because they are supported by a data-driven organization.
