Race and Gender Among Computer Science Majors at Stanford


Last week, my friend Winnie Wu wrote a great post about the demographics of race and gender within the Computer Science concentration at Harvard. I’m currently studying Computer Science at Stanford, and Winnie’s post got me wondering if there was anywhere online where I could find such data for my school. In particular, since I am Hispanic, I wanted to see what percentage of Computer Science majors at Stanford belong to my demographic. Based on my day-to-day experience going to lecture and office hours, it was clear that the numbers were low. But were they as low as the numbers that Winnie found? (Only 5% of Computer Science concentrators at Harvard are Hispanic.) And how did the numbers compare to the demographics of major tech companies in Silicon Valley? I tried searching for the demographic data on Google, but I was not able to find the numbers that I was looking for. I decided to follow Winnie’s lead and try my hand at collecting the demographic data for Computer Science majors at Stanford.

I think that it is important to make this demographic data readily available because Stanford’s Computer Science Department is among the top in the world, and Stanford Computer Science students have a track record of helping to shape the trajectory of Silicon Valley and the tech industry as a whole. As more and more people start to realize that cultural diversity can bring with it many benefits in the classroom and in the workplace, students, faculty, and employers alike may all be interested in developing a better understanding of the gender and racial breakdown of the Computer Science major at Stanford.

Before I proceed to explain how I collected the gender and race demographic data for this post, it is important to note that my findings have been neither officially verified nor endorsed by Stanford. This is just a small personal project that I decided to undertake out of sheer curiosity, but nonetheless I hope that others will find it insightful.

Data Collection

I obtained the current list of all Computer Science undergraduate majors at Stanford from the Computer Science Department’s online student directory. As of this posting, there are 707 Computer Science majors in total, from the Classes of 2015, 2016, 2017, and 2018.

The online student directory itself does not provide any gender or race data for students, so I had to figure out a way to collect and organize that information on my own. I sorted students by gender (male or female) and by race (White, Asian, Black, and Hispanic). The two gender categories of male and female and the four race categories of White, Asian, Black, and Hispanic were selected because these are the categories that are commonly used in industry diversity reports. To allow for comparisons to be made to the Harvard data from Winnie’s post, I also note the distinction between South Asian and East Asian. Similarly, I counted Middle Eastern students (with family name origins west of Iran, inclusive) as White, and I counted South East Asians (such as Vietnamese) and Pacific Islanders (such as Filipinos) as East Asian. In the case of mixed race students (which I deduced either from personal interactions with these students or from a combination of analyzing physical traits and family name origins), for each particular type of two-race mix, I assigned half of the students in that mix to one group and the remaining half to the other. For example, if there were 4 potentially White-Asian mixed students, I assigned 2 to the White category and 2 to the Asian category.
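The half-and-half rule for mixed-race students can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the actual workflow used for this post. The function name `split_mixed` is hypothetical, and the handling of an odd count (the extra student goes to the first-named group) is an assumption, since the post does not say how odd counts were resolved.

```python
def split_mixed(count, first, second):
    """Split `count` students of one two-race mix between the two groups."""
    half = count // 2
    # Assumption (not stated in the post): with an odd count,
    # the extra student is assigned to the first-named group.
    return {first: count - half, second: half}


# The example from the text: 4 potentially White-Asian mixed students.
example = split_mixed(4, "White", "Asian")
print(example)
```

For the example in the text, this assigns 2 students to the White category and 2 to the Asian category.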

To collect the gender and race demographic data, I went down the list of 707 students, name by name, and looked up each person on Facebook and searched for the origin of their family name using this Surname Origin Tool. Since many of these Computer Science students are either my friends, friends of friends, or people whom I have taken classes with in the past, I had access to a lot of information in the form of Facebook profile pictures, status posts, and hometown names — in addition to what I knew from interacting with these individuals face-to-face at school — to help me infer each student’s gender and race.

In inferring gender, I was limited by the information that I could gather from students’ Facebook profile pictures. I used conventional (at least by U.S. standards) visual characteristics of cisgender males and cisgender females to infer gender. In a few instances in which this approach was not sufficient to classify a student as male or female, I looked at the gender pronouns used in their timeline posts to help me make a final decision.

In inferring race, I tried to make my decision based on a combination of a person’s Facebook profile picture and family name origin. Looking only at someone’s Facebook profile picture may not be enough in some cases. For example, there are many Hispanics of Argentinian descent who have Caucasian features. Looking only at the origin of someone’s family name may also not suffice. For example, there are many Filipinos who have family names that may be considered Hispanic in origin. Taking both profile picture and family name origin into account can help to more accurately classify people. Someone with Asian features and a Hispanic family name that is popular in the Philippines, for instance, can be accurately identified as Filipino and sorted into the East Asian category.

There were some students who did not have profile pictures, had profile pictures that did not show their face (a picture of a capybara was a particularly cute one), or did not have a Facebook account at all. For these students, I had to turn to alternative sources to find profile pictures, such as Google Image Search or LinkedIn profiles.

A few students required an additional level of analysis beyond just profile picture and family name origin. For these students, I looked at information such as hometown, the language used in status posts, and pictures with family members (such as parents and grandparents) to help me infer their race.

Overall, I did my best to remain consistent in how I made my sorting decisions and to classify each student as accurately as possible.

Findings

The following are my findings, accompanied by graphs that I created with Google Sheets using the data that I collected:

OVERALL

Gender

If we look at gender overall, we see that there are many more male Computer Science majors than female ones. There are 493 male students (69.7%) compared to 214 female students (30.3%).

Considering that Stanford’s undergraduate student body is 52.8% male and 47.2% female, this Computer Science-specific gender ratio is severely imbalanced. And yet, this 69.7% male to 30.3% female ratio is still an improvement over numbers from 2012, when only about 21% of Computer Science majors were women (yes, it was even worse 3 years ago).

How does this ratio compare to that of major tech companies in Silicon Valley? Well, as imbalanced as Stanford’s ratio is, the situation at tech giants such as Google and Facebook is even more severe. For instance, in its June 2015 workforce demographics report, Google revealed that only 18% of its tech employees are women. Similarly, in its own June 2015 diversity report, Facebook revealed that only 16% of its tech workforce consisted of women. What I wonder is — shouldn’t these numbers be at least a little bit closer to the 30.3% at Stanford? After all, Stanford is one of the largest — if not the largest — feeder schools for these companies. This discrepancy appears to suggest that, while more women are receiving Computer Science degrees, either these tech companies are not hiring them or the women are choosing to work elsewhere. Whatever the reason may be, whether the established hiring system is failing to identify existing female talent or whether these companies are falling short of providing an environment in which women feel welcome, this is something that definitely demands more attention.

Race

Focusing on race, we see that most Stanford Computer Science majors are Asian (46.4%), followed by White (38%), Hispanic (9.5%) and Black (6.1%).

For reference, Stanford’s undergraduate student body as a whole is 22.6% Asian, 42.8% White, 13.1% Hispanic, and 7.5% Black. This adds up to 86% — the remaining portion consists of Native American, Native Hawaiian / Pacific Islander, students who declined to state their race or ethnicity, and “International” students (don’t ask me why “International” is listed as a race or ethnicity; that’s simply how the statistics are presented in this report).
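As a quick check of the arithmetic above, the four listed shares can be summed to see how much of the undergraduate student body falls outside these categories. This is a minimal sketch using only the percentages quoted in this section. The variable names are illustrative.

```python
# Stanford undergraduate shares quoted above, in percent.
shares = {"Asian": 22.6, "White": 42.8, "Hispanic": 13.1, "Black": 7.5}

# Round to one decimal place to avoid floating-point noise.
listed = round(sum(shares.values()), 1)
remaining = round(100 - listed, 1)

print(f"Listed groups: {listed}%")
print(f"All other categories combined: {remaining}%")
```

The remainder is the combined share of the other categories named above, such as Native American, Native Hawaiian / Pacific Islander, declined to state, and “International.”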

While the 9.5% and 6.1% figures for Hispanic and Black Computer Science majors at Stanford are slightly better than the average at peer institutions (6.5% and 4.5%, respectively, according to an analysis conducted by USA Today), these findings are for the most part consistent with the widely known fact that Silicon Valley’s workforce is primarily made up of White and Asian men. In that sense, these findings are not too surprising. What is surprising to me is that the 9.5% Hispanic and 6.1% Black rates at Stanford are so much higher than the rates for these demographics at major tech companies. At Google, the tech workforce is only 2% Hispanic and 1% Black, while at Facebook, the tech workforce is only 3% Hispanic and 1% Black. Just as we observed when analyzing the data for women, there appears to be a significant disparity between the number of Hispanics and Blacks pursuing Computer Science degrees and the number of Hispanics and Blacks getting hired by (and choosing to stay at) major Silicon Valley tech companies.

WITHIN RACE

I found that the female-to-male ratio was the best among Asians (35.7% to 64.3%), followed by Blacks (34.9% to 65.1%), Whites (24.9% to 75.1%), and Hispanics (22.4% to 77.6%).

It’s interesting to see that among the two best ratios, we have a well-represented group (Asians) and an underrepresented group (Blacks), and among the two worst ratios, we also have a well-represented group (Whites), and an underrepresented group (Hispanics). I would have expected the two best-represented racial groups to also have the two best gender ratios.

Anyway, if we further break down the Asian category into East Asian and South Asian, we see that South Asians (36.1% female to 63.9% male) ever-so-slightly edge out East Asians (35.5% female to 64.5% male) in terms of gender balance.

WITHIN GENDER

Looking at the breakdown of race within each gender, we see that the number of Asian and White males is roughly equal, with 42.8% of all males being Asian (33.5% East Asian and 9.3% South Asian) and 41% of all males being White. Trailing behind, 10.5% of all males are Hispanic and 5.7% are Black.

Shifting our focus to female Computer Science majors, we see that Asians far outnumber other groups. Of all females, an astounding 54.6% are Asian (consisting of 42.5% East Asian and 12.1% South Asian), 31.3% are White, 7% are Black, and 7% are Hispanic. The large number of Asian females majoring in Computer Science at Stanford contributes significantly to Asians being the best-represented racial group overall.

If you are interested in seeing how the raw numbers for each demographic group stack up to each other, check out the chart below. Once again, the data reflects the dominance of Asian and White males in the Computer Science field, accompanied by significantly lower numbers for women and for Hispanics and Blacks, with 211 Computer Science majors at Stanford being Asian Male, 202 White Male, 117 Asian Female, 67 White Female, 52 Hispanic Male, 28 Black Male, 15 Hispanic Female, and 15 Black Female.
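The percentages reported throughout the findings can be reproduced from the eight raw counts listed above. The following is a minimal Python sketch rather than the original workflow, since the graphs for this post were made in Google Sheets. The helper `pct` and all variable names are illustrative.

```python
# Raw counts as listed above, keyed by (race, gender).
counts = {
    ("Asian", "Male"): 211,
    ("White", "Male"): 202,
    ("Asian", "Female"): 117,
    ("White", "Female"): 67,
    ("Hispanic", "Male"): 52,
    ("Black", "Male"): 28,
    ("Hispanic", "Female"): 15,
    ("Black", "Female"): 15,
}

total = sum(counts.values())


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Tally totals per gender and per race.
by_gender = {}
by_race = {}
for (race, gender), n in counts.items():
    by_gender[gender] = by_gender.get(gender, 0) + n
    by_race[race] = by_race.get(race, 0) + n

# Overall shares of all Computer Science majors.
gender_share = {g: pct(n, total) for g, n in by_gender.items()}
race_share = {r: pct(n, total) for r, n in by_race.items()}

# Share of women within each racial group.
female_within_race = {
    r: pct(counts[(r, "Female")], by_race[r]) for r in by_race
}

print(total)
print(gender_share)
print(race_share)
print(female_within_race)
```

The East Asian and South Asian splits cannot be recomputed this way, because the list above gives only the combined Asian counts.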

Closing Thoughts

While the race and gender demographic data that I obtained for Computer Science majors at Stanford appears to be slightly more balanced compared to peer institutions such as Harvard, these numbers still reflect a notable disparity in the field along racial and gender lines. Moreover, that disparity grows even larger when we start looking at the demographic data for major Silicon Valley tech companies such as Google and Facebook.

Also worth pointing out is that — while I did not record any country-of-origin data for students — as I was browsing Facebook profiles searching for race and gender data, I noticed that many of the Hispanic and Black Computer Science majors were international students. It would be interesting to see what percentage of Hispanic and Black Computer Science majors at Stanford are from the United States compared to those who come from abroad.

Furthermore, as a low-income and first-generation college student myself, I think it would also be valuable to obtain data on how many Computer Science majors are first-generation college students or come from low-income families. I have a strong feeling that the numbers are very low, but this hypothesis is difficult to verify, given that family financial and educational information are not publicly available.

I hope that you learned something new from this informal study, and I encourage students from other schools to also follow Winnie’s lead and collect race and gender demographic data for Computer Science majors at their respective home institutions. The more data we have, the better equipped we will be to bring about positive change in the tech industry. Gathering the data for this report, creating graphs, and typing everything up on Medium took just one day of work (and I think that Stanford is probably near the upper bound in terms of the number of Computer Science majors in a single school), so this is something that can be done pretty easily and has the potential to be very valuable and insightful. If you do decide to conduct a demographics study for your school and want to share it with others on social media, use the #CSAtMySchool hashtag so that it’s easy to find. I think it would be really interesting to see how race and gender demographics compare across different schools.

Lastly, don’t be a passive bystander — educate yourself, educate others, and play an active role in bringing about positive change! Encourage your sisters and brothers to study Computer Science, and continue to support them throughout their journey. Start a Lean In Circle in your community. If you’re a professor, don’t make an offhand joke comparing the programming language C to a “crack mom” — one of your students may be struggling with parental drug abuse at home. If you work at a tech company, don’t assume that the female coworker you just met is a secretary, or that your Hispanic coworker is the facilities repair guy, or that your Black coworker is a security guard. If you’re interning at a tech company during the summer (or if you’re a full-time employee looking for a more impactful break from work than playing ping pong or Xbox), try to work with the recruiting team to organize outreach events to connect women and underrepresented minorities to the opportunities that exist at your company — while interning at Google this summer, for instance, I’m helping to organize visits from Girl Code, Black Girls Code, and Hack the Hood — and there are many other organizations out there making strides in this area.

There’s still a lot of work to be done to build a more welcoming and inclusive tech community, and it’s going to take a combined effort from all of us — students, professors, interns, employees, and employers alike.

Thank you for taking the time to read this. If you liked this, click the 💚 below so other people will see this here on Medium.