Data Science teams — what they’re not!

“Try never to be the smartest person in the room. And if you are, I suggest you invite smarter people … or find a different room.” - Michael Dell

Data wrangling for fun and profit is something that you hopefully get to enjoy, and with a bit of luck, get well paid while you’re at it. There are typically a few people in each business who do this kind of work, and they are often labelled the “data science” or “analytics” group depending on the fashion of the times. Basically their raison d’être is to get customer data into a workable form, distill some insight out of it, and let the corporate decision makers know what they should be looking at and what actions to take.

This is the ideal world, and while there are lots of reasons this doesn’t often play out too smoothly, the data science team(s) often find themselves as a very square peg in an organisation of round holes, and can be hammered into all sorts of uncomfortable accommodations with corporate bureaucracy, IT governance, HR diktats and/or the whims of individual managers. Ultimately, it’s the quality of the data science work which suffers when data science teams are re-purposed into other roles for which they’re not suited.

Your boss does what now?

The simplest, clearest, most readily accessible signal of the value an organisation places on data science is always the boss. If the boss of the data science team hasn’t the foggiest idea about data, professes only minimal exposure to the underlying mechanics, or has no professional skin-in-the-game, how can they be expected to effectively lead a team of analysts? There are many facetious LinkedIn click-bait bullshit stories of why a boss should lead by example, how to be a thought-leader etc. etc., but people need to feel assured that even though the person heading a highly technical team might not necessarily know all the details, that person knows enough to know what’s going on. Herein lies the dichotomy: confident leadership requires an assured understanding of everything great and small… or a necessary ignorance of all of the subtleties.

Any data science team lead who hasn’t, directly, themselves, written SQL queries, parsed CSV files, written software professionally, or built statistical models outside of Excel has no place leading a team of data scientists. They will be unable to arbitrate between the demands of the leadership team and the capabilities of their team. The difficulties and intricacies of performing detailed analysis will be lost on any leader who equates doing a vlookup on 100 rows in Excel with crunching hundreds of millions of rows of semi-structured data.

Conversely, the manager can act as the canary-in-the-mine when trying to figure out what emphasis higher management places on data quality, forecasting and good data governance. As the manager of a data science team, they will also be responsible for keeping everyone in the loop on management priorities, and having an input into those priorities where need be. If your team are deemed too inconsequential to be kept in the loop, then you’re probably considered so unimportant that the smallest update is deemed above your pay-grade. If the team and its manager are the last to know about actions which directly affect them, then alarm bells should start dinging like crazy… your boss’s boss has disappeared off into the sunset and left yourself and the rest of the suckers holding the can.

Substitute for IT support

The data science, data engineering and analytics teams within most companies are probably the most capable employees within the office as a whole. These data-savvy girls and guys can take the most opaque mish-mash of data sources and magically weave it into some useable form that can drive company strategy for years to come. Either that, or the same people can spend their days trying to fix the printer drivers, projectors, email and all other assorted nonsense for senior management and their families. I’ve had colleagues go around trying to fix house alarms, Nintendo systems, TVs and all sorts of broadband routers in the homes of company execs of a weekend simply because they couldn’t be bothered to take the time and read a manual.

When you see data science teams being re-purposed as de facto IT support, it’s time to get worried. It basically means one of two things:

  1. Management doesn’t consider the expertise of the data science team to be worthy of doing data science. In other words, they view data science, insight and forecasting as low-valued baloney, yielding nothing more reliable than their gut-feeling on a topic (more of that later).
  2. Alternatively, they’ve under-funded the IT infrastructure, either deliberately or through mis-management, and have ended up with gormless dupes who cannot sort out the basics when called upon. Panic sets in and they turn to the people who seem most likely to sort things out in a pinch.

The key realisation here is that nothing you can do will be likely to change this behaviour. The determination has been made that data science is of secondary importance to the latest IT emergency, and that this slide is likely to continue unless fundamental changes are made at a management level. However, here’s the crux — real management change in any sufficiently large organisation is extremely unlikely until some disaster takes hold that forces the current incumbents to flee elsewhere.

Whatever their rationale, the signs are obvious, it’s time to go. Enough of the long hours getting Skype working through the corporate firewall for the boss, enough of being the flunky who backs up the database each night to some archaic tape-drive because nobody else can. When your role is being denigrated, ignored, or obviously mis-managed, it’s time to call it a day.

No training or investment

The professional data scientist should be mature enough to realise that the times they are a-changing every second week… and everyone needs to constantly keep up with developments to stay fresh. As we all know well, learning new things takes time, but so does working a full-time job. So, the old trade-off used to be that a company would invest in training up their capable employees to make them even more capable, and then reap the rewards over time as these smart people get better doing smarter things.

This seems, on paper, like a win-win scenario. The company invests in their staff, who both appreciate the development opportunity and use the learnings to improve themselves in their role. Staff retention rates sky-rocket and people get to do more interesting and valuable things for the company.

But say we don’t want to invest in this sort of thing. That we think anyone can do this job, because the people who do it now didn’t get any special provision in the first place for training. The old law of short term gains comes to the fore… we’ve no budget for X, the assumption is that the team will continue to produce at the same rate as last year, therefore by eliminating any training we’ve become a whole lot more cost-effective straight away. When a business won’t fork out a couple of hundred bucks for some analytics software while the execs take paid-for golf junkets on the company dime, then you get a very clear picture of what they deem worthy of investment.

The unknown jewel in the re-org

Re-orgs, for whatever reason, seem to be a monumental waste of time and headspace in most companies. The net result, more often than not, tends to be a more tangled web of reporting structures, usually with your previous internal customers still coming to you looking for analysis regardless of what group your team has been moved to. Data science and analytics teams are probably best served when they are left to their own devices. However, the common or garden business ape is not satisfied with a load of boffins squirrelled away in some cubicle real-estate and left to get on with it. They must conform!

Data science teams have never been a particularly neat fit with any business functions in most of the companies out there. There have been data science teams which have been squeezed into teams with non-technical people, and I’m sure there’s probably been one or two data science teams who’ve found themselves making soup in the company canteen after a particularly strange re-org. Needless to say, the re-org is where the business ape, with limited appreciation of the potential for data-based insight, trades their shiny corporate jewel to another business ape because the chief business ape said they should.

When you start in your role, count the number of times you need to change desks every year. Then divide this by the number of times your company physically moved office. If this ratio is greater than 2.0, you may consider yourself a mere green plastic cog in the game of re-org monopoly.
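For the spreadsheet-averse, the ratio above can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and the handling of a company that has never moved office (a division by zero the rule above does not cover) is an assumption.

```python
def reorg_ratio(desk_moves: int, office_moves: int) -> float:
    """Desk moves divided by office moves, as described in the text."""
    if office_moves == 0:
        # Assumption: the text does not cover this case. Any desk move with
        # no office move to excuse it is treated as an infinite ratio.
        return float("inf") if desk_moves > 0 else 0.0
    return desk_moves / office_moves


def is_reorg_cog(desk_moves: int, office_moves: int, threshold: float = 2.0) -> bool:
    """True when the ratio is strictly greater than the 2.0 threshold."""
    return reorg_ratio(desk_moves, office_moves) > threshold


if __name__ == "__main__":
    # Hypothetical year: five desk moves, two office moves.
    print(reorg_ratio(5, 2), is_reorg_cog(5, 2))
```

The threshold is left as a parameter in case your tolerance for re-orgs differs from 2.0.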

Outdated Tech

Data science people tend to become quite adept at doing all sorts of weird and wonderful data manipulation and analysis using nothing more than a terminal and a Bash script. Why? Well, to quote Shakespeare, this is commonly attributed to “necessity’s sharp pinch”. There have been companies where a 10-year-old version of Excel was the closest thing the place had to any sort of analytics software. There were companies spending millions per year on technology and infrastructure that still would not support R or Python on their corporate devices. There were companies using some highly modified niche programming language to develop and maintain their back-end data infrastructure despite there being better, cheaper and more reliable alternative technologies. It is a well-known fact that when Moses came down from Mount Sinai, he was a bit harried and pushed for time and had accidentally left behind the other stone tablet on which was inscribed the forgotten eleventh commandment of “thou shalt not continue to use outdated tech”.

“I am not ill. But do not worry, one day, I will certainly die.” - Charles de Gaulle, Feb. 1965.

Outdated tech is a false economy. It’s like a credit card bill you ignore, in the false hope that everything will just mysteriously resolve itself in the future if you go away and hide under a pile of coats. While the symptoms of outdated tech may not be immediately apparent, the effects are certain: an exponential decrease in capabilities over time compared to the market competitors.

However, maintaining outdated tech has its allure: the expense and pain of any change does not appear to management as a worthwhile investment. Why would it? If they get their Excel reports on a weekly basis giving them the customer numbers, and nobody disturbs them when they go for their mid-day nap, why should they care? The world may seem to be carefree and forever the same as it was previously.

The shortest of short-term gains

Say you can build a customer retention model, to predict who within the current customer base is most likely to buy your product again. And say this takes you three or four days to build, and will possibly save you 1000 customers per year, each giving you 100 bucks per year in fees. Congratulations, you’ve built something that should earn the company 100,000 bucks for four days of effort. Assuming your day-rate isn’t 25k then everything else is pure profit… yay!!!!

Reality, however, is quite different… it is more likely that your boss needs you to create a PowerPoint slide deck to show customer churn over the preceding 12 months with some commentary and segmentation. They can’t do this themselves because they have to pick the kids up from the creche etc. etc. And again, say this activity takes you three or four days to generate the data and complete the slides. And another half a day to present it and then do a quick follow-up. Well done, you’ve just spent the entire time it would have taken to build your customer retention model faffing around on something that will be forgotten about in a month. Your boss will get the kudos for some impressive slideware, while instead of the hypothetical 1000 customers you could have retained in this time you have zero, zilch, nada… in fact you have a negative spend for wasting analyst time.
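For anyone who prefers their opportunity cost in code, here is a rough back-of-envelope sketch of the comparison above. The customer numbers, fees and days come from the example in the text; the function name and the day-rate of 500 are purely hypothetical placeholders, so plug in your own.

```python
def net_value(customers_saved: int, fee_per_customer: float,
              days_spent: float, day_rate: float) -> float:
    """Yearly fees retained minus the cost of the analyst time spent."""
    revenue = customers_saved * fee_per_customer
    cost = days_spent * day_rate
    return revenue - cost


DAY_RATE = 500  # hypothetical figure, not from the text

# Retention model: 1000 customers at 100 bucks each, four days of effort.
model_value = net_value(1000, 100, 4, DAY_RATE)

# Slide deck: no customers retained, four days plus half a day presenting.
slides_value = net_value(0, 0, 4.5, DAY_RATE)

# Day-rate at which four days of model building would just break even.
break_even_day_rate = (1000 * 100) / 4

if __name__ == "__main__":
    print("retention model:", model_value)
    print("slide deck:", slides_value)
    print("break-even day-rate:", break_even_day_rate)
```

Whatever day-rate you assume, the slide deck comes out at or below zero, while the model only loses money once the rate passes the 25k mark mentioned earlier.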

If you have a boss who demands retroactive analysis to back up their ‘political’ posturing, then you need to consider whether your time is being wisely spent forecasting the past and not, you know, doing proper data science.

Grown-ups know better

Data-driven companies and marketing teams abound, that is, if you would lower your sceptical threshold and accept the sort of unfiltered guff that some of them come out with. In reality, nobody in any corporate decision-making position will readily accept the truth; it has a nasty tendency to interfere too much with personal bias and ‘gut feeling’.

Let me be blunt: management do not give a hoot in hell about how good your data science team may be, the expected lift from your model, or what ML approaches the team can use to improve business practices. Management only care how useful the team might be to them, right now, as they battle their way up the corporate ladder using you and your colleagues as some glorified ice-axe in that fight. So, when you’re asked to find out whether competitor X is driving churn, the real statement may be deduced as follows… I am using competitor X’s advantage as a wedge to ingratiate my way upwards in the corporate world, and now I need you to lie through your teeth and support my slimy progress! The old adage has proved correct: the truth is always the first casualty in any conflict.

Politics

I’ve had a former boss say that if I presented some truthful, yet commercially unsavoury facts about customer behaviour that they “couldn’t protect me politically”. Aside from thinking that this individual had seen one too many episodes of the West Wing, the real learning I had here was that certain individuals maintain roles in a company for and through overt political manoeuvring. This sort of behaviour is diametrically opposed to doing more mundane (and less stressful) things like adding value to the company through doing actual data science work.

Where you have politics, you have untalented but highly-networked sociopaths, usually willing to do what it takes to gain any marginal advantage. This does not make for a welcoming terrain for any fact-based data analysis work, especially one which may betray a contrary view. So, would you like to rely completely on having these people determine whether you continue to draw a salary or not? Hell no!!! The second you sense that overt ‘politics’ has some determination over the fate of you or your team members then you get the hell out of there quick. US presidents change every four to eight years. You’re potentially in a job for 40. It makes a whole lot more sense to differentiate your team for their abilities, rather than some political guff, because, when everything is weighed in the balance, capabilities are all that count.