Using data science to explore gender and race gaps in equity compensation

Erin Boehmer
Published in Building Carta
7 min read · Apr 14, 2022

I sat in the audience with a cup of coffee in hand and several freshly printed resumes in my bag, waiting for the next conference speaker to take the stage. It was 2018, and I was looking for a new opportunity — something that offered career growth while not compromising on purpose. For the past three years, I had been working in East Africa as a data scientist for a solar fintech startup, making off-grid solar more affordable through pay-as-you-go financing. My colleagues and our mission provided daily inspiration, but as a one-woman data team, I was itching to learn how larger organizations structure and scale data projects. A conference on data science in New York City’s burgeoning tech scene seemed like the perfect place to find my next challenge, but thus far I had been disappointed. Were all fancy data science jobs about click rate optimizations and promoting luxury consumerism?

I took a breath and resolved to keep an open mind as the moderator introduced our next speaker: Jerry Talton, a director at Carta, presenting a report titled “Equity, Inequity, and Machine Learning.” I listened as he explained how Carta had recently analyzed its equity dataset and discovered a huge gap in the value of equity held by women versus men. Only 35% of equity-holding employees were women, and they held a mere 20% of the value. I knew he wasn’t just talking about who will get to own a private jet. The successful founders and investors of today will fund the unicorns of tomorrow. In other words, those with the most equity will decide our common future.

Data from Carta’s first report on the gender equity gap in 2018. Visit this link for the full report.
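To make the size of that gap concrete, the two headline figures can be turned into a per-person comparison. The short sketch below uses only the rounded numbers quoted above (35% of equity holders, 20% of value) and assumes, for illustration, that the remaining holders and value belong to men. The variable names are my own and the result is an approximation implied by those rounded figures, not a number taken from the report.

```python
# Rounded headline figures quoted in the text.
women_share_of_holders = 0.35
women_share_of_value = 0.20

# Assumption for illustration: everyone else in the dataset is classified as male.
men_share_of_holders = 1 - women_share_of_holders
men_share_of_value = 1 - women_share_of_value

# Value held per holder, relative to an even split (1.0 would mean parity).
women_relative_value = women_share_of_value / women_share_of_holders
men_relative_value = men_share_of_value / men_share_of_holders

# Average equity value held by a woman for every dollar held by a man.
ratio = women_relative_value / men_relative_value

print(f"Women hold about ${ratio:.2f} in equity value per $1.00 held by men")
```

Under these assumptions, the average woman in the dataset holds a little under half the equity value of the average man.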

Carta had the platform, data, and mission to mitigate wealth inequality by broadening access to equity ownership. I had found the inspiration I was looking for on a team that knew how to structure and scale data projects. I pulled out a resume, determined to start the conversation about how I could join Jerry at Carta.

Carta is about creating more owners. Mapping and expanding the ownership graph, democratizing ownership in the process. Reducing income inequality by expanding the ownership of productive assets. Pulling more wage-earners out of the debt stack and into the equity stack. Making seven billion people part of the land-owning class.

— Henry Ward, CEO, Ownership Management

Two months later, I was a data scientist in Carta’s NYC office, building data products and learning the nuances of private equity. When I was told I would be responsible for producing the 2019 Equity Report, I was ready for the challenge. The study was my opportunity to use the knowledge I had gathered to advance a conversation I cared about deeply.

A few months before the event, I began polling Carta’s internal equity, total compensation, and diversity, equity, and inclusion (DEI) experts to help me expand and prioritize my research into cap table diversity. They had no shortage of curiosity! Could we talk about whether underrepresented groups hire more diverse teams? What about vesting and exercise patterns for people from underrepresented backgrounds holding in-the-money grants? Oh, we’d also love to see how founders compare in their future investments!

So many great questions! Unfortunately, as I set up each analysis, I noticed a worrisome trend: Although Carta’s equity data was reliable, contextual data used for cross-sectional analysis was sparse at best. Most notably, Carta didn’t routinely collect the most critical data we needed for a cross-sectional report on cap table diversity: race, gender, job level, or role.

I wasn’t completely surprised. In 2019, Carta was actively prioritizing data model redesign. In fact, helping with that long-term strategy had been a major motivation for me to join the company. But refactoring is slow and methodical; I had two months to deliver an analysis on cap table diversity. Hitting our deadline for the 2019 report was going to require some short-term creativity. So we found ways to be creative.

Our team fortuitously learned that another business unit at Carta had collected 120,000 self-reported job title records. The data was messy and still relatively sparse, but it was enough to infer useful job level and role categories. We also found that our approach to gender classification from 2018, which relied on U.S. Census data, failed frequently for international names, removing diverse subpopulations from our analysis. To address this in 2019, we integrated third-party APIs that boosted our gender classification coverage from 86% to 94%. With these scrappy approaches, our 2019 report showed that the gender equity gap stems primarily from men sitting in higher-earning positions than women.

Our 2019 report showed that the gender equity gap stems primarily from men sitting in higher-earning positions than women. Visit this link for the full 2019 report.
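The coverage improvement described above comes from a simple pattern: try a primary name lookup first and fall back to other sources only when it returns nothing. Below is a minimal sketch of that pattern, not our production code. The lookup tables, function names, and sample names are all hypothetical stand-ins; a real pipeline would replace the toy dictionaries with Census-derived tables and calls to whichever third-party service is in use.

```python
from typing import Callable, Dict, List, Optional

# A classifier takes a first name and returns a label, or None if it cannot decide.
Classifier = Callable[[str], Optional[str]]


def make_lookup_classifier(table: Dict[str, str]) -> Classifier:
    """Build a classifier from a name -> label table (hypothetical stand-in data)."""
    def classify(first_name: str) -> Optional[str]:
        return table.get(first_name.strip().lower())
    return classify


def classify_with_fallback(first_name: str, classifiers: List[Classifier]) -> Optional[str]:
    """Return the first non-None label, trying the classifiers in order."""
    for classifier in classifiers:
        label = classifier(first_name)
        if label is not None:
            return label
    return None


def coverage(names: List[str], classifiers: List[Classifier]) -> float:
    """Fraction of names that receive any label at all."""
    if not names:
        return 0.0
    labeled = sum(
        1 for name in names if classify_with_fallback(name, classifiers) is not None
    )
    return labeled / len(names)


# Toy tables standing in for a Census-based lookup and a third-party source.
census_like = make_lookup_classifier({"mary": "female", "james": "male"})
third_party_like = make_lookup_classifier({"wanjiru": "female", "arjun": "male"})

sample_names = ["Mary", "James", "Wanjiru", "Arjun", "Xq"]

primary_only = coverage(sample_names, [census_like])
with_fallback = coverage(sample_names, [census_like, third_party_like])

print(f"Coverage with primary lookup only: {primary_only:.0%}")
print(f"Coverage with fallback added:      {with_fallback:.0%}")
```

Measuring coverage separately from accuracy matters here: a fallback can raise the share of names that receive a label without telling you whether those labels are right, and any lookup of this kind can only return the categories its tables contain.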

We hit our two-month deadline, but the flaws in our approach were dissatisfying. For starters, our gender classification method completely failed to recognize gender identities other than male and female. Job titles were self-reported, and the rules for mapping them to roles and levels were hacky at best. And what about race and ethnicity? We had no strategy for getting that data.

We had checked the box: Our report reflected the data we had and it was slightly better than the previous year. But we weren’t satisfied with being less bad; we needed a data foundation that would support nuanced insights on equity gaps. With the 2020 report already feeling like a tight deadline, I began strategizing how to collect the dataset Carta needed.

As a data team, we began advocating for the curation of a demographic dataset, voluntarily provided by Carta users. I drafted a list of questions we would ask any user who wanted to support the equity gap study; this included self-described gender, race, and job title, as well as self-sorted level and role classifications. Carta’s leadership was supportive and recommended that we test data collection via a lightweight third-party survey. Within three weeks, the questions were approved to go live and data began flowing through a bespoke pipeline into the warehouse. To boost the response rate further, I worked with our marketing team to launch in-app banners and tailored email campaigns.

Some colleagues doubted that we would collect enough data, and I can’t blame them. Why would users voluntarily provide sensitive demographic data for a nascent analytics effort? Our results erased all doubt. We had released the survey in the wake of 2020’s grim racial equality conversations and, after about three months of data collection, 30,000 Carta users had volunteered to be a part of our study of equity gaps. In sharp contrast to 2019, the data they provided was structured, clean, and tailored to the analysis we wanted to deliver. In 2020, Carta users paved the way for us to finally study race, and across a much wider range of job classifications.

With help from more than 30,000 Carta customers, our 2020 report included a representation of people of color across a wide range of levels. See this link for the full 2020 report.

Having proven the viability of a demographics survey, Carta released the data collection prompt as a permanent feature of carta.com. The guarantee of ongoing data collection now enables a wider, cross-functional team to add context, find new insights, create stories, and bring people together to discuss diversity on cap tables.

We have since expanded the survey to include compensation-relevant information, such as whether an employee works full time, manages direct reports, or has been promoted. More than 150,000 Carta users have become part of the equity gap conversation since 2019 — and with their help, we were able to dive even deeper into race and gender intersectionality in 2021.

The 2021 Carta Equity Report included information on race and gender intersectionality volunteered by over 120,000 Carta customers. See this link for the full 2021 Carta Equity Report.

Three years after sitting in the audience of that New York data conference, I’m proud to now be on stage at Carta’s annual Equity Summit presenting our research on equity gaps. I came to Carta wanting to lead the development of data products that impact important conversations, and my work on Carta’s Equity Report has given me that satisfaction. I’ve helped grow a one-off blog post into a respected and anticipated forum, attracting keynote speakers like Tristan Walker, Serena Williams, and Kirsten Green. Our diverse audience of founders, investors, policy makers, board members, and employees can enjoy the Equity Summit as a community-building platform for a shared ambition: democratizing access to equity ownership.

An analysis of cap table diversity will never be complete or perfect, but I’m proud of our team for bringing nuance to the conversation. To the 150,000 Carta users who have contributed data, thank you so much for your support. We look forward to continuing these conversations with you.

If you’re interested in working with me on Carta’s data team to “pull more wage-earners out of the debt stack and into the equity stack,” we’re hiring!
