I’m a Data Scientist — Here’s why I work at Facebook

Veronika Belokhvostova
7 min read · Oct 7, 2021


I have never posted on Medium and let my membership lapse for a bit during the pandemic. However, I felt that it would be remiss of me to stay silent during what has been a very hard three weeks for my team.

I have worked at Facebook for six years. Two reasons convinced me to come to Facebook after 15 years at other high-tech and consulting firms. First, I was offered an opportunity to work on the Social Impact team — a team that enables billions of people to do good for their community through products like Charitable Giving, which helped raise billions of dollars over the last few years. Do good at scale? Who could pass up such an opportunity! Second, friends I consulted for the inside scoop emphasized how much influence Data and Research teams had at Facebook, even compared to the other high-tech firms where they had worked.

For nearly five of my six years at Facebook, I led the Central Integrity Data Science team. I have seen it grow more than sevenfold over the last five years. We grew the Integrity team as fast as we could hire.

We partnered with Product, Operations, and Policy teams to build sophisticated violation-detection AI solutions and support tools for people on our platforms, covering many different types of Community Standards violations. We published the Community Standards Enforcement Report as soon as we had reliable enforcement numbers, and we subsequently added prevalence to it, which is now the main metric by which we measure our success. We were adamant about prevalence even when no one was asking for it externally, because we knew it was the right way to measure impact on our users. We have seen the prevalence of harm go down for some of the violations (such as Hate). One of the major leaps in our progress came from the partnership with the ranking team: as our detection technology improved, we could use its output to inform content-ranking decisions.

There was not much we could boast about five years ago. But the progress over the last five years has been pretty incredible, and I am proud of all we have accomplished. The more we learn, the more we continue to build and expand these efforts. It is rewarding to see insights of our work translate into product roadmap items every quarter and to see them launch.

Every day we are also keenly aware of the inevitable tradeoffs that have to be weighed in our analysis and recommendations. Every AI solution at every threshold has a level of false positives (good content or people incorrectly flagged) and false negatives (violating content and abusive accounts incorrectly missed). If you raise the score threshold, you flag less content: fewer false positives, but more false negatives. If you lower it, you take down more violating content, but also block more legitimate content and users. This is akin to the tradeoffs many industries face. In banking, teams weigh reducing bad loans against turning away good people and businesses in the process.
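The threshold tradeoff above can be sketched in a few lines. This is a purely illustrative toy example (the function, scores, and labels are hypothetical, not Facebook's actual system): a classifier assigns each item a score, anything above the threshold is flagged, and moving the threshold converts false positives into false negatives or vice versa.

```python
# Toy illustration of the false-positive / false-negative tradeoff.
# Scores near 1.0 mean "likely violating"; we act on anything at or
# above the chosen threshold.

def confusion_counts(scores, labels, threshold):
    """Count errors when flagging every item whose score meets the threshold.

    labels: 1 = actually violating, 0 = actually benign.
    Returns (false_positives, false_negatives).
    """
    false_positives = sum(
        1 for s, y in zip(scores, labels) if s >= threshold and y == 0
    )
    false_negatives = sum(
        1 for s, y in zip(scores, labels) if s < threshold and y == 1
    )
    return false_positives, false_negatives

# Hypothetical (score, true label) pairs.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1,    1,    0,    1,    0,    1,    0,    0]

strict = confusion_counts(scores, labels, 0.85)   # flag only high-confidence items
lenient = confusion_counts(scores, labels, 0.35)  # flag almost everything

print("strict threshold  -> FP, FN:", strict)   # no good content blocked, misses harm
print("lenient threshold -> FP, FN:", lenient)  # catches all harm, blocks good content
```

No single threshold eliminates both error types; teams pick an operating point based on the relative cost of each mistake, exactly like the lending example.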

This nuanced and complex work wouldn’t be possible without the effort and contributions of so many people. Here are some of my colleagues sharing why they work in this space.

Michelle, Data Science

When I received my Facebook data science offer I was flattered (how did I make it through the selection process?!) but extremely hesitant. I had just made a bunch of big life decisions — my now husband and I had just moved into our first apartment together in NYC, we had adopted a puppy, and I finally felt like I had settled on a career path that I was excited about. I had recently started a job as a data scientist at a hospital and was beginning to shape my career around data science for good. This new focus was years in the making, as I had grown interested in the intersection of the healthcare industry and data since my consulting days. However, if I was honest with myself, I wasn't 100% happy at work and felt stuck with the limited support that I had around me as a new data scientist. So I started to think that maybe this Facebook offer was worth considering. I could always join Facebook for a couple of years to learn from the best and then refocus on my career ambitions of data science for good.

Fast forward to today — it's been five years, I am still at Facebook, and I have achieved my career ambitions of data science for good. I've spent these years working on our integrity efforts to reduce harm across our family of apps. I've been able to work on a wide range of challenging and nuanced problems, from removing terrorism propaganda to cybersecurity. One project that particularly stands out was when I was focused on URL-based harm (e.g. phishing, malware, cloaking, etc.). Traditionally in Integrity we tried to measure how many views there were on harmful content, but we quickly realized that views on content do not capture the harm when a user clicks on a harmful link from Facebook. As a result, we started measuring how many outbound clicks led to a deceptive or abusive offsite experience for our users. Soon after launching this measurement, we realized how large an opportunity we had to reduce harm. Ultimately, this resulted in updated priorities for the team, leading to a 50% reduction in this metric in six months. I like highlighting this project because I think it showcases the power that DS can have in building effective measurement and leveraging it to scale a team of engineers, ultimately reducing user harm.
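The shift described above — from counting views of a post to counting clicks that reach a harmful destination — can be sketched as a tiny metric. Everything here is hypothetical (the blocklist, domain names, and function are illustrative, not Facebook's actual pipeline); it only shows the shape of a clicks-based harm measurement.

```python
# Hypothetical clicks-based harm metric: instead of counting views of posts
# that carry a link, count the share of outbound clicks whose destination
# is flagged as deceptive or abusive.

# Assumed blocklist of known-harmful destinations (illustrative only).
HARMFUL_DOMAINS = {"phish.example", "malware.example"}

def harmful_click_share(click_destinations):
    """Fraction of outbound clicks that landed on a flagged domain."""
    if not click_destinations:
        return 0.0
    harmful = sum(1 for domain in click_destinations if domain in HARMFUL_DOMAINS)
    return harmful / len(click_destinations)

# Toy log of outbound click destinations.
clicks = ["news.example", "phish.example", "shop.example",
          "malware.example", "news.example"]
print(f"{harmful_click_share(clicks):.0%} of outbound clicks were harmful")
```

The point of such a metric is that it tracks the moment harm actually occurs (the click-through), so a team optimizing it is pulled toward interventions that protect users, not just ones that reduce impressions.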

I am proud of the work that my team does and impressed with the progress that we have made over the years, but I also know that we are working to solve some of the hardest problems impacting our society and there is much more work to be done.

Ryan, Data Science

I’ve always enjoyed statistics and quantitative methods. I learned a lot in graduate school but felt that the work was too theoretical; I wanted my work to have beneficial applications that improve the lives of others. Working on Integrity at Facebook has given me that. I get to use statistical methods to measure the prevalence of hate speech on the platform, and I leverage product analytics to identify opportunities to reduce hate speech prevalence.

One of the projects I’m most proud of over the last two years was getting these prevalence estimates reported externally in Facebook’s Community Standards Enforcement Report last year. When I started on the team in 2018, we were uncertain how prevalent hate speech was on the platform. Anecdotes made it seem like Facebook was overflowing with hateful content. It’s fulfilling that my work allowed us to finally quantify this problem and show that 5 out of 10,000 views are hate speech. Furthermore, my analyses of prevalence have been used to improve our systems and classifiers, resulting in meaningful reductions in harm to our users. Equally rewarding is that Facebook decided to report these metrics externally so that our users could hold us accountable for progress. I hope that our users recognize the progress we continue to make: Within a year of reporting our prevalence metric, we’ve already reduced hate speech prevalence by 50% from 0.10%-0.11% to about 0.05%.
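A prevalence figure like "5 out of 10,000 views" comes from labeling a random sample of content views and estimating the violating share. The sketch below is an assumption about the general statistical shape of such an estimate (a binomial proportion with a normal-approximation confidence interval), not Facebook's actual methodology; the sample numbers are illustrative.

```python
# Illustrative prevalence estimate: share of sampled views that were
# labeled as violating, with an approximate 95% confidence interval.
import math

def prevalence_estimate(violating_views, sampled_views, z=1.96):
    """Point estimate and normal-approximation CI for a binomial proportion."""
    p = violating_views / sampled_views
    margin = z * math.sqrt(p * (1 - p) / sampled_views)
    return p, (max(0.0, p - margin), p + margin)

# 5 violating views found in a random sample of 10,000 views.
p, (low, high) = prevalence_estimate(violating_views=5, sampled_views=10_000)
print(f"prevalence ~ {p:.2%} (approx. 95% CI {low:.3%} to {high:.3%})")
```

One design note: because prevalence is so small, real measurement systems need far larger samples (or better interval methods, such as Wilson intervals) than this toy sample size to get a tight bound; the normal approximation is shown only for simplicity.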

Alice, Data Science

When I got the job at Facebook I planned to stick it out for two years to get the "box ticked" on my CV. I wanted to learn how a company like Facebook "does Data" but didn't think a HUGE organisation would fulfil me in the long term. After joining Facebook, you choose the team you want to work on, and I was drawn to Integrity. Integrity teams are addressing issues that have never been solved before, issues that affect the future of technology and have large societal implications — and I wanted to be a part of that. The team I chose to join was "Compromised Accounts": stopping attackers from accessing people's accounts without their permission.

Compromise comes in many shapes and sizes, usually with the aim of profiting from the network someone has spent years creating: This can range from small highly targeted attacks on high-value individuals and organisations, to large scale attacks trying to get into the account of anyone and everyone (like you or me).

One challenge we have is balancing the accounts we protect from compromise against the number of good users we accidentally block from accessing their account in the process. This balancing act is made more complex by the increasing sophistication of attackers — it's not always clear in real time whether an account is being compromised or not. It's an exercise comparable to balancing weighing scales while blindfolded.

Tipping the scales too far in one direction could result in thousands of users being hacked and permanently losing access to their private messages, photos, or businesses; in the other direction, we might stop millions of users from communicating with friends and family and running their businesses as normal.

The work has surpassed my expectations in being challenging and interesting. I learn from intelligent and motivated people every day, and I’m proud to be working on problems that are helping real people across the globe. It’s been almost four years since I joined, doubling my planned tenure, and I haven’t considered going anywhere else.
