Data Science for Social Good Summer Fellowship
A summer fellowship for people looking to make a positive change in society through data science
Data science has been one of the fastest-growing fields and is predicted to be one of the most in-demand skills in the coming decade. It has changed the way we access information and solve problems. It has created value for many businesses around the globe, but its benefits have not been distributed equally, because not all organizations have the resources or the expertise to harness the power of data for their initiatives. It is of great importance that government organizations and NGOs, both of which aim to improve people’s lives, are also able to achieve more with data to promote their social causes. This is where data science for social good comes in, and this is what I did over the summer.
What is the DSSG Fellowship?
The Data Science for Social Good (DSSG) Fellowship is a full-time summer program designed to train data scientists to work on real-world projects that have the potential to positively affect millions of lives. DSSG Fellows apply their data science skills while working closely with governments and non-profits, which are referred to as partner organizations. The fellows work towards finding solutions to complex problems in fields such as education, public policy, economic development, health and energy.
Through this programme, the partner organizations can unlock the potential of data to support their mission of improving people’s lives, while the fellows can hone and apply their analytical skills and learn from full-time mentors from industry and academia.
History of DSSG
The DSSG Foundation’s core Summer Fellowship was set up in 2013 by Rayid Ghani, former Chief Scientist for President Obama’s 2012 re-election campaign. The US Fellowship was started at the University of Chicago and is now based at Carnegie Mellon University. Since then, multiple similar programs have branched off from the core fellowship. One of them is the DSSGx UK Fellowship, which is the focus of this article. In 2019, the University of Warwick collaborated with The Alan Turing Institute to bring DSSG to the UK. It ran in person at the University of Warwick in 2019, and online in 2020 and 2021 due to the Covid-19 pandemic.
About the DSSGx UK Summer Fellowship
This summer I worked as a fellow for the Data Science for Social Good Fellowship (DSSGx) conducted by the University of Warwick in collaboration with Ludwig-Maximilians-Universität in Munich under the DSSGx UK chapter of the DSSG Foundation. It was a 12-week fellowship conducted between June and August, as an entirely online event due to the pandemic. It is a full-time program where the fellows are paid a stipend to support themselves throughout the duration of the fellowship.
In this article, I talk about my experience during the fellowship and provide a detailed walkthrough of the selection process.
The application process is a great way for you to dive deeper into your work and skills and really think about why Data Science for Social Good matters to you. The process is aimed at getting to know you better and seeing whether you would be a good fit for the program. The following section gives a detailed overview of the process and what to expect.
The application for the fellowship opens in early January and you can find the link to the application form on the program website. The form is quite comprehensive and requires the applicant to explain their past work in the field in as much detail as possible, along with their motivation to be a part of the programme. There are questions about your personal background, programming experience, data science experience, motivation to work towards social good using data science, and so on. It is important to answer these truthfully and comprehensively. The application also asks you to rate yourself on certain tools and skills. Here, as well, you should be honest but also confident in your ability. Do remember that you will be asked about what you wrote in your application form in later rounds. It might be helpful to write down all the questions and your answers in a Word file and review them before submitting. You are also required to submit your transcripts. I don’t know how much my grades mattered, but if you’re reading this in your first or second year, it might be helpful to work on your grades a little bit as well.
The application deadline was around the end of January and I remember submitting it almost a day before the deadline. It was probably because I reviewed the answers multiple times to fit the word limit but also convey as much information as possible. I think the multiple rounds of reviews really helped me sharpen the application.
Pro tip: Complete the first draft of all the answers, then set them aside for anywhere from a couple of days to a week before reviewing them. This will give you a fresh perspective and help you find places that can be improved.
In my opinion, the application process was a really great learning experience because it required me to really think about what matters to me, why I would be a good fit for the program and where I was lacking in my data science journey.
After almost 25 days, I received the shortlisting email. I was really happy (and a bit surprised) but the journey had just started.
As the shortlisting email explained, the next round was a technical round (or a code walkthrough) in which we were interviewed by a member of the organizing committee. I should mention right away that I was really nervous about the interview, mostly because of my experience with interviews in India, where they are all about breaking a candidate and finding faults in them, but this one was very different. My interviewer, Florian, was really nice, made sure that I was comfortable before the interview and at no point made me feel like he was evaluating me. There were no “trick questions” (read: LeetCode) that would push me into making a mistake and give him a reason to reject me. It was purely about knowing more about me, my work and my past projects. This was honestly one of the best parts of the process because I saw a new way interviews could be conducted, where both people learn from each other. Since it was a code walkthrough, I explained how I built my project and he kept asking me questions about why I used a certain model over another or how I solved a potential problem. There was an additional discussion on NLP (since it was related to my project) and counterfactual explanations (probably because both of us had done some research in explainable ML). Towards the end, I asked a few questions about the program and about the mlr3 package (I was relatively new to R).
My advice for this round is to be thorough with whatever project you have built. It doesn’t matter what frameworks or languages you use as long as you can explain why and how you built the project. Additionally, some of my colleagues talked about an algorithm implementation since they came from a more mathematical background. I showed a full-stack project because I had a strong programming background.
While I thought my interaction went well, I still wasn’t sure if I would make it to the next round. I was incredibly anxious for the next 20 days, waiting for the result. Finally, I got an email informing me that I made it to the final round!
I got the final interview selection email on the 22nd and I booked an interview slot for the 29th, which was the first free slot I had. This interview is more of a culture fit for the program. The interviewers want to see whether you would be a good addition to the program and how you can contribute to the goal of DSSG. There are questions about your motivation and past experience.
I got an email from one of the PhD students regarding the interview, but when I joined the call, I was a bit surprised. I was expecting a single interviewer, but there were two of them, one of them being the program director, Prof. Jürgen Branke, who is a very senior professor and researcher at the Warwick Business School. I think I was visibly surprised and nervous, but both the interviewers were really nice and wanted me to be comfortable. They introduced themselves and asked me to introduce myself and talk about my motivation to join the program. The next 30 minutes were spent on getting to know me better and explaining the program to me in more detail. Some of the interesting questions that my team and I remember are:
- Tell us about your past experience working with teams, especially in a remote setting. In a team, what sort of role do you like to play?
- If you were to pick your own team for a data science project, how would you go about it?
- Have you been involved with any social good projects in the past? Tell us a bit about any one of them.
- We would be having social sessions where fellows would present something interesting about their home countries. Do you have some idea of what you would like to present? (I talked about Holi, an Indian festival)
- You mentioned that you like law in the application form. Can you talk about an issue of law related to data science?
- What are some tools you like to use for remote working and why?
- You have some inter-disciplinary teamwork experience? What was your role and contribution?
Towards the end, I asked a few questions about the fellowship and the prospective projects. That was the end of the interview. I was quite satisfied with how my interview went and on 2nd April, I received my acceptance for the program! I was overjoyed and really excited to embark on a new journey for the summer of 2021.
I hope this walkthrough of the application process is helpful. Irrespective of the outcome, if you’re reading this article, you should apply for the program because the application process itself is a really great experience. It will help you reflect on your work and your motivation to pursue data science.
This blog would be remiss without me talking about my experience during the fellowship. The program commenced on 7th June with 16 fellows coming from a variety of backgrounds and many different countries. This was one of the best things about the fellowship: I was able to meet and become friends with people across the globe. The fellows were divided into teams of 4, working on 4 different projects. My team consisted of people from Israel, Taiwan and India, and we were working with the German Federal Ministry for Economic Affairs and Energy (BMWi), the main governmental body in Germany for economic affairs and economic policy. Our project partners wanted a solution to forecast the unemployment rate at the county level in Germany. You can read about the detailed problem statement and the solution here. In this section, I want to talk more about the experience and workflow than about the problem statement.
GitHub - DSSGxUK/bmwi: Unemployment Rate forecasting tool built for BMWi during the Data Science…
Data Science Workflow
A lot of people might think that because universities and research institutes are involved, the program would be a bit unstructured. However, that is far from true. As I mentioned earlier, we worked with people from industry and academia. Each team was assigned project managers and technical mentors. Our project manager made sure that we followed an agile workflow throughout the project, which means we followed all the standard practices that you would see at any tech company. Every Monday we would have ‘sprint planning’ to discuss and finalize the things we would do over the week and allocate story points to the tasks. It was highly organized, with each member being aware of what they would be working on. Apart from that, we met every morning (actually afternoon for me because we were working during UK work hours) to share updates on the previous day’s progress and the plan for that day. This allowed us to be accountable to the team and also discuss any blockers that we might be facing. Every member of my team was super helpful, which meant that if I was ever stuck on a problem, all I had to do was ask for help and someone or the other would work with me to solve it. If we were still stuck, we could reach out to the technical mentors, or discuss the problem with the partners or in the Friday Deep Dives (I will come to these soon). There were also weekly roundups called ‘retrospectives’ on Fridays, where we would objectively evaluate the progress of the sprint. This gave us insights into what went well and what could be done better during the next sprint.
So, if you think that this is a research internship where you wouldn’t get industry experience, you’d be surprised by how similar our work was to that at a tech company.
Working with project partners
However, it wasn’t exactly like a tech company because of the focus on the social good aspect. DSSGx aims to work on social good problems by collaborating with governments and NGOs, both of which have a direct positive impact on people’s lives. Like I mentioned, our project partner was the BMWi. The reason I stress the word “partner” is because of how we worked together. The organization wanted to solve a problem along with us and did not want us to solve a problem for them. This, I believe, is an important distinction from working for a company where the clients just give you a problem and expect you to come up with a presentation to solve everything for them (speaking from experience; this might not be completely generalizable but it is pretty common). We met with our partners every week to discuss our progress and explain our solution (or parts of it) to them. On the basis of the weekly discussions, we would plan the next week’s work and then present our findings again. This way we had constant feedback on our work, and I really appreciated the time they took out to work with us on the problem. I feel that this was integral to the success of our solution.
Since DSSG believes in research reproducibility and open science, our solution is publicly available here. Another positive aspect of our project with BMWi was that they experienced the impact that data science can have on their organization and are open to exploring more data-driven solutions to their other existing problems. This means that there would be a higher standard of database maintenance and application of data science methodology in this organization. I would call that a resounding success!
Public Lectures and Deep Dives
DSSGx was committed to our learning as well, which went beyond the projects that we were working on. This was achieved through the public lectures every Friday. These were talks given by eminent professors, researchers and industry stalwarts who had a lot of experience in data science. These talks were a great way to break away from the daily work and learn more about current challenges faced by the data science community. I learned so much through each of these talks and also interacted with the speakers, which would not have been possible otherwise! A list of public lectures which we attended can be found here.
After the public lectures, we had the most awaited event of the week: Deep Dives. This was an opportunity for all the teams to talk about their weekly progress and plans for the next week. I loved these sessions because we could put any and all of our questions to professors, technical mentors and visiting guests. Since everyone came from different backgrounds, we almost always got multiple ways of solving a particular problem that we were facing. This was truly very helpful. Additionally, we were able to interact with the other fellows and learn about their projects and experiences, which was very interesting.
Overall, it was a day of learning and collaborative problem solving. It did get hectic at times, because of multiple sessions one after the other but it was a great learning experience and we were able to make a lot of progress because of these sessions.
The public lectures and deep dives weren’t the only things that happened on Fridays though! After a week’s hard work, all the fellows relaxed and engaged in social activities. These were organized once a week by one of the teams and were a great way to talk to fellows outside of my team and get to know them better. I loved all these sessions!
We played many different games (whatever was possible online) including Skribbl, cryptic hunt, Olympic themed quizzes, group activities and so on. My personal favorite was Gartic Phone!
Friday was definitely my favorite day! It was filled with learning and fun, and it made me look forward to Mondays so that we could start working again. I don’t think I had ever looked forward to Mondays before the fellowship.
My fellowship experience wouldn’t even be 1/10th complete if I didn’t talk about my amazing team! I could talk for hours about what a great experience I had with them but let me first introduce them:
- Amit Sasson - Masters graduate in Statistics from Tel Aviv University,
- Cinny Lin - Data Science, NYU ’22,
- Vighnesh N Ganesh - junior at BITS Pilani, India majoring in computer science engineering,
- and finally myself, Prakhar - Computer Science, SNU ’22.
A good team is the most essential aspect of a successful project and I had the privilege of working with some of the best people. Our very diverse backgrounds proved to be one of the major factors in the success of our project. Our team had people who had a vast knowledge of machine learning algorithms, people who could write complicated pipelines like it was nothing, people who built the tool which would be used by BMWi and people who could explain even the most complicated results through effective visualization. This, combined with the fact that each of them was always ready to help out one another, made us work together very productively. I can confidently call this one of my best work experiences with some of the best team members.
An example of our teamwork can be seen in a unicorn drawing competition, which was conducted by one of the technical mentors as a fun team-building exercise. We had to make a unicorn by coloring in the cells of a Google spreadsheet in limited time. It was no Van Gogh but we were still proud of it. Our unicorn was declared the best unicorn by the organizers!
While we worked really hard, it wasn’t the only thing we did. I think we were definitely the most chatty team as well. We were always willing to learn about each other’s culture, language and academic background. I now know so much about Israel, Taiwan and the south of India that I could probably win a trivia competition.
Like I said, I could talk for hours about how much fun we had as a team, but the blog is already too long, and I am glad that you have read this far.
This summer has been an enlightening journey for me. I grew as a data scientist and as a person, all thanks to the fellowship. I have picked up a lot of new skills over these last months which I will be able to apply to other projects. I was also able to connect with academics and people in the industry which was also one of my goals.
I recommend this fellowship to budding data scientists and academics who’re looking to apply their skills to problems that matter and learn a lot in the process.
This article could not have been completed without the helpful comments from Cinny Lin, one of my team members from the fellowship. She also writes about Data Science and Technology. Check out her blog here! I also received really helpful comments from Prof. Jürgen Branke and Prof. Colm Connaughton at the University of Warwick, who worked closely with us during the fellowship, and from my teammates Amit and Vighnesh.