What Data Scientists Do

Chris Orwa
Published in Brave
Sep 19, 2016

I’ve always found myself at odds when explaining what I do. I belong to a cult of men and women who have been mystified by algorithms and large amounts of data. In an ancient society, we would have been designated as wizards and witches of the first order. Today, we are exalted and put on the dais as conjurors of engrossing tales. This premise raises the question, “What does a data scientist do?”

The Longitude Problem

In the maritime-dominated economies of the European powers of the 18th century, a great problem had dogged merchant ships and naval commanders for centuries: the problem of determining longitude at sea, famously known as The Longitude Problem. As trade and conquests expanded through transoceanic voyages, it was of vital importance to know a ship’s location at any time.

In this period, only latitude could be measured using celestial bodies. Captains and merchants thus employed the services of a navigator who, through intuition, new methods and sometimes guesswork, estimated the position of the ship. Miscalculations often led to wandering into enemy territory and losing cargo to privateers or, worse, hitting coral reefs. Such was the Scilly Naval Disaster of 1707.

Estimating the position of a ship using a method known as Dead Reckoning

The Scilly Naval Disaster

Returning home from a victorious battle in Gibraltar, English Admiral Sir Cloudesley Shovell ran into heavy fog at sea. Fearing the ships might hit coastal rocks, the admiral summoned all his navigators to put their heads together. The consensus opinion placed the English fleet safely west of ‘Ile d’Ouessant’, an island outpost of the Brittany peninsula in France. But as the sailors continued north, they discovered to their horror that they had misgauged their longitude near the Scilly Isles. Four ships sank, taking almost two thousand of Sir Cloudesley’s troops with them. Only two men washed ashore alive, one of them Sir Cloudesley.

He reflected on the events of the previous twenty-four hours, when he had made what must have been the worst mistake in judgement of his naval career. He had been approached by a sailor aboard his ship who claimed to have kept his own record of the fleet’s location during the whole cloudy passage. Such subversive navigation by an inferior was forbidden in the Royal Navy, and Admiral Shovell had the man hanged for mutiny on the spot. It turned out the sailor was right: the fleet had been heading in the wrong direction.

Solving the Problem

The hanged sailor on Sir Cloudesley’s ship had been thinking differently about how to estimate position, using weather, wind speed and direction, water currents and other new factors such as a ship’s weight and the depth of the waters. It is astonishing that he had outdone the consensus of several experienced navigators aboard the ship.

The navigator’s primary responsibility was to enter daily records of the ship’s course, speed, and significant events. Hence navigators became masters of gathering data: every change in wind direction and speed was noted down, the latitude was recorded at noon every day, and the weather conditions were noted. These entries were made in a ship’s log. Using mathematical knowledge, the navigators used the entries in the ship’s log to estimate longitude.

Ship’s Log from the 18th Century
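To make the log-to-position step concrete, here is a minimal sketch of dead reckoning in Python. It is an illustration under simplifying assumptions, not a reconstruction of what 18th-century navigators actually computed: it uses the textbook "plane sailing" approximation, ignores currents and leeway, and the function name, parameters and log entries are all hypothetical.

```python
import math

def dead_reckon(lat_deg, lon_deg, course_deg, speed_knots, hours):
    """Advance a position using course, speed and elapsed time.

    Assumes the plane-sailing approximation: one nautical mile is one
    minute of latitude, and east-west distance is converted to longitude
    using the cosine of the mid-latitude. East and north are positive.
    """
    distance_nm = speed_knots * hours
    course = math.radians(course_deg)

    # North-south component changes latitude (60 nautical miles per degree).
    new_lat = lat_deg + distance_nm * math.cos(course) / 60.0

    # East-west component changes longitude; meridians converge away
    # from the equator, hence the division by cos(mid-latitude).
    mid_lat = math.radians((lat_deg + new_lat) / 2.0)
    new_lon = lon_deg + distance_nm * math.sin(course) / (60.0 * math.cos(mid_lat))
    return new_lat, new_lon

# Hypothetical log entries: (course in degrees, speed in knots, hours sailed).
log_entries = [(0.0, 6.0, 4.0), (45.0, 5.0, 8.0), (90.0, 7.0, 12.0)]

position = (36.0, -6.0)  # made-up starting latitude and longitude
for course_deg, speed_knots, hours in log_entries:
    position = dead_reckon(position[0], position[1], course_deg, speed_knots, hours)

print(position)
```

Every entry carries a small error in course or speed, and because each new position is built on the previous one, those errors accumulate over the voyage. That compounding is what made longitude by dead reckoning so unreliable.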

It was in 1761 that a self-taught lone genius named John Harrison solved the problem. Back in 1714, after the Scilly Naval Disaster, the British Parliament had set up the Board of Longitude to evaluate methods of establishing longitude at sea and to award prize money of £20,000 to anyone who could solve the problem.

Mr. Harrison, first hearing of the prize in 1725, single-mindedly constructed a timepiece whose accuracy would be unaffected by violent motion or changes in temperature. It allowed navigators to preserve throughout their voyage an accurate reference time, corresponding to the time used to record the coordinates of heavenly bodies, and hence to measure longitude.
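The reasoning behind the clock can be sketched in a few lines of Python. The Earth turns 360 degrees in 24 hours, so every hour of difference between local noon and the reference time on the clock corresponds to 15 degrees of longitude. The function names below are hypothetical, and the sketch assumes local noon is observed exactly and ignores refinements such as the equation of time.

```python
DEGREES_PER_HOUR = 360.0 / 24.0  # the Earth rotates 15 degrees per hour

def longitude_from_reference_time(reference_hours_at_local_noon):
    """Longitude in degrees (east positive) from the reference-clock
    reading, in decimal hours, at the moment of local noon.

    If the reference clock already reads later than 12:00 when the sun
    is highest overhead, the ship is west of the reference meridian.
    """
    return (12.0 - reference_hours_at_local_noon) * DEGREES_PER_HOUR

def longitude_error_from_clock_error(clock_error_seconds):
    """Longitude error in degrees caused by a clock that is off by the
    given number of seconds."""
    return clock_error_seconds / 3600.0 * DEGREES_PER_HOUR

# Local noon observed when the reference clock reads 14:00:
# the ship is two hours, or 30 degrees, west of the reference meridian.
print(longitude_from_reference_time(14.0))

# A clock that has drifted by one minute shifts the answer by a quarter
# of a degree, which is why the timepiece had to be so accurate.
print(longitude_error_from_clock_error(60.0))
```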

Back to Reality

What do Data Scientists do? They do what navigators did in the 18th century: provide reasoning under uncertainty. Like navigators, we collect data related to a situation because it gives us a better means to understand and reduce uncertainty. Probability is a way of summarizing the uncertainty of a situation. It provides a means of assigning degrees of belief in a conclusion, which we refer to as a confidence level. So when we make predictions based on collected data, we talk in probabilities to communicate how confident we are in the results.
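As a small illustration of talking in probabilities, the sketch below turns a handful of measurements into an estimate with a confidence interval. It is one simple way to do this, assuming a normal approximation (which is rough for small samples); the function name and the speed readings are made up for the example.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(samples, confidence=0.95):
    """Return (mean, lower, upper) for the sample mean.

    Uses a normal approximation: the margin is the z-score for the
    requested confidence level times the standard error of the mean.
    """
    centre = mean(samples)
    standard_error = stdev(samples) / math.sqrt(len(samples))
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    margin = z * standard_error
    return centre, centre - margin, centre + margin

# Hypothetical ship-speed readings in knots, taken through the day.
speed_readings = [5.8, 6.1, 6.4, 5.9, 6.3, 6.0, 6.2]

estimate, low, high = mean_confidence_interval(speed_readings)
print(f"speed is about {estimate:.2f} knots (95% interval {low:.2f} to {high:.2f})")
```

More readings shrink the interval, which is the numerical version of the point above: collecting data reduces uncertainty.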

In scenarios where the uncertainty is solved, much as it was by John Harrison’s clock, the procedure followed is canned for re-use, and we refer to the canned formula as an algorithm: a systematic way of solving a specific problem. Algorithms are therefore a set of rules to be followed to solve a problem. I like the comment on algorithms by Medium software engineer Jamie Talbot:

Algorithms don’t really have anything to do with computers, although computers turn out to be very handy algorithm processing machines.

There is something to be noted about the type of problems solved by data scientists. They tend to be of great importance, difficult to solve and highly rewarded. So, when I met a client who asked whether I could detect the emotions of a person in an audio recording, I tossed myself into a sea of thoughts and hoped for a longitude moment.

About Author

Chris Orwa is the former Head of Data Science for Brave Venture Labs. His exploration of the world through data can be found at BlackOrwa and on his Medium blog.

Ibanga Umanah is a Cofounder and the Head of Strategy for Brave Venture Labs. Brave is a people science company uncovering the drivers of performance for better recruiting and talent management.

Get in touch to hire, get hired, or join our team: brave.careers
