A day at Data Science Retreat: Ike Okonkwo interviews Jose Quesada, founder and director

Ike Okonkwo of ‘Yet Another Data Blog’ fame interviewed me a while ago about Data Science Retreat and Data Science education in general. This ended up being the most detailed description available of what doing DSR feels like, so posting it here on Medium (where there’s an audience already) made a lot of sense. Without further ado:


What’s your 1 minute bio / introduction?

I’m the founder and director of Data Science Retreat. I’ve been working on data topics for about 15 years, and I keep my knowledge up to date because I have to teach these topics to extremely eager, intelligent and motivated retreat participants. I also advise companies on their data problems.

I came from a rural background. My father grew apples, and expected me to do the same. Instead, I studied psychology and fine arts. Then I did a PhD with lots of machine learning. In it I developed a software system to teach pilots how to land commercial aircraft without the need of a senior instructor sitting next to them (which I didn’t patent; silly me).

Can you tell us a little bit about your background and role at Data Science Retreat?

I have a PhD in cognitive science. I worked in academia and in industry before; I’ve lived in Berlin for 6 years. Data Science Retreat is my third company. The two previous ones were in tech education (B2C) and computing customer lifetime value (B2B). At DSR I’m founder and director. I was a consultant working on CLV for about a year, then one of my clients hired me full time as their first data scientist. I experienced the pain that a company goes through when adapting to the ‘whole new world’ of being data-driven, and this knowledge helped when I started Data Science Retreat in 2014. I love to see how people develop into seriously effective data practitioners.

How did you get into Data Science Education / Training?

By accident. I like to learn new tech stacks and be surrounded by people who like that too (see the story below on Hacker Retreat). Nearly everything I know that is really useful, I taught myself. I think we live in a self-taught paradise. But after a certain level of excellence, it’s hard to make progress. This is something most aspiring data scientists find. No matter how many MOOCs you do, there’s a barrier that very few people ever break through.

This is why Data Science Retreat started. I think I know how to create an environment where you can go “faster than the average self-taught speed” and break the barrier of excellence that most people encounter. I asked myself: “What needs to exist for this to happen?” My answer was: you need to have access to ‘chief data scientist’-level people, contributors to leading open source packages, etc., and they need to be invested in your progress. You need to be surrounded by other people seeking excellence, too. DSR is the kind of setup that I wish I had when I started. Six batches later, all I can say is that I’m very proud of the result, as is everyone involved.

What is your definition of a Data Scientist?

I’m going for the ‘black box’ definition: data science is whatever activity produces business value out of data. It’s not even applied machine learning anymore: whatever works. There are plenty of non-ML things we can do with data nowadays that are very valuable, for example stream processing. Not much ML is happening on streams right now, but this is a very active part of data science. Things like Spark and Flink make streams fun again.

Data Science Retreat — Foundations

What’s the 1 minute bio / introduction on the Data Science Retreat bootcamp?

If you are a company, you either have existing employees that you want to upskill, or you want to hire new people with bleeding-edge skillsets.

Let me give you a quick intro to Data Science Retreat (DSR) and what we do. We train enterprise customers such as GfK and Zalando that have internal talent that they want to upskill. You can see a list of our courses, which we call masterclasses, here; they are tailor-made for busy technologists working for companies that need to get this knowledge fast. We run these courses in our retreat center, or we can bring them to your location.

We also run a 3-month course that we call a ‘retreat’. DSR takes participants from all over the world (average: 5 years of industry experience), trains them for 3 months on bleeding-edge technologies, and presents them to companies at what we call ‘hiring day’. This is the best course of action if you need to hire new people. These participants are highly filtered (about 5% acceptance rate) and then trained by very advanced practitioners, often at the CTO/chief data scientist level.

How did the idea for the Data Science Retreat bootcamp come about and what do you hope to achieve with it?

The first thing I tried (Hacker Retreat) didn’t really work. I wanted a ‘self-learner paradise’ where you are surrounded by extremely motivated people on the same quest as you to improve as much as possible. The first edition of Hacker Retreat was free (I paid for everything out of my own pocket) and had no structure. Participants could work on any problem, using any language. The hope was that they would pair up with each other and with mentors, and move faster than going at it alone. But this rarely happened. We had to almost force people to talk to each other. The likelihood of two people liking the same project enough to collaborate is tiny. Then someone emailed something like ‘you mentors are all pretty impressive machine learners; you should do something for data scientists!’ And I thought: ‘hmm, you are right’.

How do you screen and select students for your program?

We have a pretty involved screening. Long questionnaire, an interview, and then two mentors need to approve each person who passes the interview. Nowadays, many applicants come through our network, which helps pre-filtering.

Can you describe the typical background (academic / professional) you look for in your students?

They come from very diverse backgrounds. For the most part, they studied some sort of science or a technical field. Some have PhDs, many have Master’s degrees. They usually have a few years of work experience. Typical profiles include people who have been doing business intelligence or data analysis and want to learn more about predictive methods. The important part is that they can keep learning on their own to stay at the bleeding edge once they are out of DSR. That’s what we’re mainly looking for. Companies who hire from us are very happy and keep coming back for more. We make sure we take in extremely driven people, and we support them once they are out there by letting them come to the next batch for two days of training of their choice. Because our curriculum changes quite a bit to follow the market, there’s always lots of content that alumni want to pick up from the latest batch.

What type of skills do you look for in a prospective student?

During your three months here you’re going to do a lot of programming. If you can’t already program, at least some basic scripting, you won’t be able to keep up. We look for exceptional programmers, people who have spent years collaborating on large code bases. But these profiles are harder to get. And we have been positively surprised by some people who were not the strongest coders when they first came, but ended up doing quite well.

We expect that participants have had the motivation to learn some of this stuff on their own. There’s excellent free material on the web; if they have not used it at all, we simply don’t believe they will have the willpower to improve themselves dramatically at DSR. Communication skills are quite important. We also try to weed out the crazies (we have rejected all kinds of profiles that looked great on paper, but ended up failing the interview). We have a mentor to improve technical communication skills, but you need to be a solid storyteller to get in. Communication is a core topic at DSR, and it’s one of the few things that you cannot improve radically with focused practice in a few months unless the initial level was quite good to start with.

We look for creativity with data and at least some basic knowledge of machine learning. The interview focuses a lot on how data science creates value for companies, and how you communicate this value. That’s as much as I can tell without spoiling the interview. We have had participants tell us that this was the most useful interview they had in their career. It’s no accident: we have a single hour to determine whether someone will do well in their career, so we are ‘betting the farm’ on this interview.

Data Science Retreat — Placement

How many cohorts have you gone through?


What is your typical cohort size?

9 (some batches will be smaller; it all depends on the quality of applicants)

Can you share what your placement numbers look like for your most recent cohort?

The most recent cohort is about to graduate as I write this. The previous one is about 70% working, with several people still waiting to take an offer. Those who are not working already have had lots of interviews. It’s not uncommon to have multiple offers.

Can you also share your historical placement rate?

Sure. Counting at 3 months and then at 6 months:

batch 01: 86% (3 months), 100% (6 months)

batch 02: 86% (3 months), 100% (6 months)

batch 03: 70% (3 months), 100% (6 months)

batch 04: 60% (3 months), n/a (6 months)

Companies are pretty slow at hiring; this is why you see a difference between counting at 3 months and at 6 months. Some hire fast (one participant shook hands at demo day on batch 2!) but most take quite a few interviews; 3–5 is the norm. Do consider this: the EU is not like the US. Hires here have enormous power: once someone is hired, the company has ‘married’ the employee (after the test period), so most companies take their time with their decisions.

Multiple job offers are not unheard of. One former participant just came in for help on how to decide between three offers (!).

What percentage of your students / fellows eventually get Data Scientist vs Data Analyst vs Other technical jobs?

In Germany data science is not well understood. Many traditional companies throw business intelligence, analytics, statistical analysis, and data mining into the same bucket as data science. And having to educate companies about what’s possible with data science is a serious job.

Having said that, nearly everyone has gotten the kind of job they wanted after DSR, but the titles vary. It’s not unheard of to have someone here with a silly job title such as BI who is actually doing serious machine learning or big data pipelines, or both. I don’t think anyone from DSR went into BI, but I’ve seen this happening. What is a ‘delivery lead’? Yup, I don’t know either off the bat, but this is in fact a very senior data scientist at Zalando. Some DSR people (three) have landed senior data scientist positions straight out of DSR. Others (most) go into a mid-level position. I don’t think anyone got hired as a junior.

We are also seeing some of our more experienced participants go into product management and other business leadership positions that take advantage of their previous work experience combined with their new training.

Can you share with us where some of your graduates work?

Kreditech, Zalando, GfK, mobile.de, Ascribe, Fit Analytics, Catawiki, and UPDAY are good examples. There are some companies that don’t want to be mentioned (really large ones). Many companies hire more than one graduate, and they often come back for more when they are growing their teams. Some enterprises use DSR for internal training for topics that are too advanced for their internal training departments.

How do you prepare your fellows to be very competitive for Data Science jobs?

The most important part of our process is the portfolio project. Each participant has to find a real-world problem with business value, then find a data set and tools they can use to solve it. Most of our participants come to us after already having completed a few online courses, but for most people, that just doesn’t cut it. Because they operate at a large scale, MOOCs have to present the material in a way that gently guides you along and prevents you from getting stuck. But because you are never truly challenged and forced to take apart a difficult problem on your own, you’re unlikely to truly master the material.

A key element of our success is our mentor network. We have world-class mentors who share all kinds of hard-won experience, including horror stories, with our participants. They’re also very opinionated (they say things like “K-means sucks.”) and go into detail about what works well in practice, and which techniques and technologies aren’t worth using. These are people who (taken together) have built several companies on the strength of their proprietary machine learning algorithms, have won multiple money prizes at Kaggle, or pushed the state of the art of the field. Each teacher teaches at most 2–3 days on his topic, the one he knows best; you cannot provide this quality level with a single teacher for obvious reasons. To appreciate the extraordinary quality of our training you need to be pretty advanced yourself, and do quite a bit of homework: ‘who are these people?’ Feel free to check Twitter, blogs, LinkedIn pages, what happened to the companies they founded, etc. I fear that this Unique Selling Point for DSR is too hard to appreciate and often goes unnoticed. We are really happy when people have done their homework and tell us in the interview how impressed they are by what they found out about our mentors.

Being part of the retreat also gives you a kick-ass peer group: having this learning experience with 8–10 very interesting, highly motivated people with skills that are complementary to yours. It’s a very stimulating and supportive environment.

We also provide presentation training, coaching on how to create a good CV, and the opportunity to present your portfolio project to a batch of companies who are looking for data scientists.

For organizations looking to hire Data Scientists what should they look for: Ivy Degrees, PhDs, Extensive experience, Quantitative Background, grit and determination?

A fancy degree or a PhD is not a strong predictor of whether a data scientist will be able to bring value to an organization. Google has the most sophisticated hiring department money can buy, and they arrived at the same conclusion. Ideally you should be looking for someone with a blend of common sense and business experience, along with strong quantitative and computational skills. You want someone who can convincingly deliver bad news, if that’s what has to happen, and someone who can think outside the box when the wrong questions are being asked. When asked what part of communication skills they wanted to improve most, batch 01 voted ‘delivering bad news’ (!). This may qualify as ‘grit and determination’, with a layer of political know-how. We don’t mean you have to be ‘game-of-thrones-level’ to be a successful data scientist. But being comfortable in groups of people with complex, opposed interests is really necessary. Otherwise someone in the company will simply ‘use’ the data scientist as a weapon to further their interests. We are starting to run a test for each company that wants to hire from us: if they don’t pass, we won’t accept them. There are plenty of companies that are not ready for a data scientist just yet. We will send them back with a list of recommendations to improve rather than letting them hire someone they cannot use. I’m sure you have heard stories of pretty good data scientists wasting away doing ETL jobs (till they quit!) at a company that looked fantastic from the outside. This is really hard to know in advance unless you have insider knowledge.

Data Science Retreat — Administration

Can you give a short summary of a typical day in the life / week in the life for your fellows?

There are two types of typical days. In the early weeks, our fellows participate in very hands-on, interactive classes. After they start their portfolio projects, they start to do more independent work and we have fewer classes.

At that point they set up their personal workspace (we supply a desk and a large monitor), and geek out. They are supported by mentors, but the idea is that they work mostly alone. A mentor should only intervene if and when they get stuck in the implementation phase. A mentor should never have to write code. We follow the Meerkat method. In it, once a project is in full force, the mentor should intervene as little as possible, and only to remove roadblocks; this is part of the learning method. Mentors offer participants the chance to ‘leverage their brain’, and how exactly you do that is a skill like any other. Some participants are naturally better at this than others. Some even keep a relationship with mentors after they graduate.

Not to mention they form good friendships and get to explore Berlin. In the evenings they sometimes go to meetups together. But the last 2 or 3 weeks tend to be pretty stressful, as they scramble to finish their projects.

How do you improve your process and instruction from cohort to cohort at the Data Science Retreat bootcamp?

We’re in frequent contact with the companies and mentors in our network. With each batch we take out some course content that has become less relevant and add some new things. For example, in the past two batches we’ve reduced the time we spend on Hadoop and added a few days of streaming, to the point that we only teach approximate methods (top 10, counting, nearest neighbors) that work on streams. Batch processing is not as important for us. We know this is a serious bet, but this is where we are building our expertise. For example, we interviewed for months to get someone to teach machine learning on streams. We finally got someone who we think is at the top of this field: Albert Bifet, author of Apache SAMOA.
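To make ‘approximate methods that work on streams’ a bit more concrete, here is a minimal sketch of one well-known technique for approximate top-k counting, the Misra-Gries summary. It is only an illustration: it is not taken from the DSR curriculum, and the function names `misra_gries` and `top_items` are hypothetical. The idea is to process the stream in a single pass with a fixed number of counters, accepting that counts are underestimated by at most n/k for a stream of n items.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Summarize a stream with at most k - 1 counters (Misra-Gries).

    Every item whose true frequency exceeds n / k (n = stream length)
    is guaranteed to be among the returned keys. Each returned count
    underestimates the true count by at most n / k.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and
            # drop the ones that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


def top_items(counters: Dict[Hashable, int], n: int) -> List[Tuple[Hashable, int]]:
    """Return the n largest (item, approximate count) pairs."""
    ranked = sorted(counters.items(), key=lambda kv: (-kv[1], str(kv[0])))
    return ranked[:n]
```

For a ‘top 10’ query you would call `top_items(misra_gries(stream, k), 10)` with `k` somewhat larger than 10. Memory stays bounded by `k` no matter how long the stream runs, which is the trade-off that makes this kind of method attractive compared with exact batch counting.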

Recently we moved the format of the retreat to a collection of short courses taught by experts that you can attend and evaluate independently and this seems to be an improvement. Any teacher that receives less than 7 out of 10 just doesn’t come back. Ratings are very good, and we want to improve at least 30% each batch. We do this by renewing the curriculum at break-neck pace. It’s scary to talk to a participant from two batches ago and realize that much of what we taught then, we are not teaching anymore. Data science must be the fastest-moving trend in technology! This means that if you want to stay current, you will have to invest your own time after work.

What skills and tools do you think should be emphasized more in Data Science education?

Business. The whole point of data science is to generate business value, and this is extremely difficult to teach in an academic setting. I suspect universities are going to miss out big time on teaching data-scientists-to-be how to generate business value. We obsess over it here at DSR, write the curriculum around it, and I still think we are not nailing it. Everyone who graduates should have a data product that generates real business value for someone. Worryingly few of the graduates keep their project online after graduation, which to me feels like we failed. If the project was effective at generating business value, you should get complaints from your users when you switch it off. But this is just me; I have high expectations.

Given the very fast progression in the field, what skills do you think will be most important for Data Scientists in the next few years?

Knowing how to work with real-time streams. Being able to apply models in real time.

Engineering + devops. Solving the problem on a local computer, and then ‘throwing it over the wall and hoping that someone else will make it work at scale’ is not a good strategy. I think we are seeing the end of that era. Really good libraries have made almost anyone able to solve sophisticated problems on a single computer. But the added value is now somewhere else: making sure the solution works with streams and real-time data. This is a very different skill, and I anticipate many ‘otherwise up-to-date’ data scientists are going to feel they have missed the train.

The Data Science bootcamp space is getting quite crowded, how does the Data Science Retreat bootcamp differentiate itself?

1. Our mentors are at the ‘chief data scientist’-level, CTOs, or contributors to leading open source packages… and they are invested in your success. There’s nowhere else in the world you can get this today.

2. We focus on the question as much as on the technical details of the solution. We provide training on technical communication; you will present often, and get one-on-one feedback from a communication expert.

3. We prepare our participants for leadership positions. That is, either being the lead data scientist, or the only one in the company. This is far harder than preparing someone to join an existing group of data scientists and solve problems picked by someone else.

Data Science Retreat — Other

What do you feel is broken with Data Science education and how is the Data Science Retreat bootcamp trying to fix it?

It’s not so much about what is broken in educating future data scientists, but about educating companies on the value of being data-driven. Many companies in the EU are still not getting that the world has changed and that data-centric companies are eating their lunch. GAFA (Google, Amazon, Facebook, Apple) is feared, but large companies in Germany often have toxic environments for anyone who advocates change, including the data scientist. I hope Data Science Retreat can help demonstrate that change is possible.

What problems in Data Science keep you up at night?

I think about how to reach decision makers and help them be more data-driven. I find it dramatic that the EU, as a continent, is failing at obtaining predictive, actionable, valuable insight from complex data at scale. Companies whose core value is data are not so prevalent here, and these companies are driving innovation everywhere else. This is a big problem.

Have you faced any major challenges in running the Data Science Retreat bootcamp?

The biggest challenge was to realize that most companies are simply not ready for a data scientist. We find that C-level decision-makers are more than happy to bring a data scientist to the company, but then the battle starts with middle management. Middle managers feel threatened by the data scientist. Often for a good reason: the data scientist may prove that their past ‘gut feeling’ decisions were awful, or could replace half of their team, halving the manager’s political control.

Then any of these things can happen:

- Unrealistic expectations that set up the data scientist for failure

- Mismanagement. I have yet to see a manager who is not data-savvy pick up a good project for their flashy data scientist to work on. This is very dangerous in hierarchical environments

- Company culture that is averse to change hardens around their ‘old ways’. People campaign internally to keep the status quo, making anything new coming from data scientists extremely hard to put in practice

This is why communication (to non-technical stakeholders) is key, and we teach exactly that. It’s too easy for a data scientist to be politically incorrect. Engineers are not ‘hard to manage’; they don’t often advocate change. One thing we teach in our communication module is ‘how to give bad news to management’.

Lutz Finger has a good write-up on how to teach ‘data’ to managers.

What markets / verticals are you currently focused on?

We are aiming for the enterprise training market because training existing employees is a very good solution for European companies. It’s extremely costly for a company to fire employees in the EU, more so than in the US. And they often have a significant pool of in-house talent that only needs a nice prod in the right direction. We have just started in this direction, but results are promising.

How do you feel the European market differs from the US market?

There are two hotspots in the EU: Berlin and London.

Berlin has been doing really well in the last five to ten years with regard to Internet tech startups, whether you look at size, the number of VC-backed companies, or how much venture funding flows into those companies. As a result the tech scene in Berlin is huge; there’s an interesting meetup almost every day.

London is also very interesting. There’s definitely money floating around because of so many banks. But tech-wise, choices are more conservative. If you are a bank, losing information, even if it’s a single transaction, is a big no-no. You have to stick to tried-and-true technologies. Berlin companies can afford to pick riskier, newer technologies, because they often deal with consumer-level information, which is usually not as crucial. If Twitter loses a tweet, it is unlikely they will get sued, unlike a bank. I suspect Berlin is already ahead of London tech-wise, and with time this difference will only grow. This is a good thing for data science, because companies who can take risks will use data scientists sooner than conservative companies.

Are students ever asked to leave or kicked out midway through the course or at any time during the course?

It has not happened yet; but this is bound to happen as we grow. Our interview process is thorough, but anyone who has interviewed candidates for any job will tell you it’s really difficult to predict someone’s performance with interviews and tests.

We do have a condition in our contract: if two mentors agree that a person’s portfolio project is not up to standards, then the person will not present to the companies and may be asked to leave. We never had this situation though. On the contrary, we often hire our own graduates, and companies who hire from us commend us on the high performance of their new hires.

Do you have a hiring day and what percent of students are placed from a company they meet at hiring day?

Yes, there’s a hiring day. About half of the people get hired by our partner companies. If for some reason you want to work for a particular company that is not coming to hiring day, we help you any way we can. For example, some participants may have strong geography constraints, or strong preferences for a particular role or company that we just cannot satisfy with our partner list. In those cases, we use our network to help the person. Although this is of course harder, several people have gotten jobs through intros to companies that were not at hiring day. And sometimes companies not coming to hiring day give people offers during their time here. Two people who got a job offer while here even got their future employer to pay for their tuition as an extra perk!

How do you help students deal with burn-out?

We used to have weekly meditation sessions. About half of the participants attended them and picked up a tool to help with focus and burn-out. Our community manager organizes social events; sometimes these are mentally challenging and lots of fun! There are BBQs and parties, as you can expect.

Participants also do activities together outside class. Batch 02 ran a marathon together. Batch 03 did some crazy competitive team activity that our community manager organized.

But all in all, it’s not as stressful as you may think. This is the EU after all. We have managed to produce excellent portfolio projects (the ones that make attendees at hiring day shout ‘why do you want a job? you are sitting on a gold mine, commercialize what you have!’) without working inhuman hours. There’s a lot of know-how in how we achieve these results, which makes me very, very proud.

How do you work with your students to ensure they’re assimilating most of the material in a very short period of time?

We have ‘interview-like questions’ spread all over the program. There are interview simulations towards the end (although they are optional and not everyone takes them; at that point people are often trying to get the last few bugs in their portfolio project ironed out). We provide around 200 interview questions (growing with every batch) to take away after you graduate and practice interviewing on your own.

Who are the Data Scientists that inspire you?

Paco Nathan, Ted Dunning, Geoffrey Hinton

Data Science Retreat — Curriculum

Can you give us a sample of the tools, languages and techniques your fellows are exposed to during the course?

We provide two tracks: data science and data engineering.

The data science track focuses on:

- Asking the right questions

- Machine learning (including deep learning plus non-main-stream tricks)

Languages used: Python and R

The engineering track focuses on:

- Designing data products that scale gracefully, e.g. by using streams

- Spark

Languages used: Python and Scala

The techniques we teach are on our website; we have the most detailed online curriculum of any education shop in data science, and it’s all there for anyone to check. Some bootcamps don’t have their curriculum online, and some others have barely a topic list.

On a typical day, how much time do your fellows usually spend in formal lectures, working on problem sets, listening to guest speakers / networking , etc?

Formal lectures and problem sets combine 50%-50% for the first half of the program. Month 2 is about 60% of that lecture-problem combination, and 40% portfolio project. The last month is about 10% lectures, rest is portfolio time. People meet with the heads of each track 30 min weekly, and with the mentors ad hoc (often in unstructured time). During tuition days, people work 9:30 to 5:30, although often people stay longer. There are guest speakers after 5:30 a few times during the program, and plenty of networking opportunities when companies come visit or we go to meetups. We encourage our participants to give presentations at meetups.