A Data Linguist on a Software Team
Hannah Lindsley, Data Linguist, @hlindsley
From undergrads in music and social work, to PhDs in philosophy, to those who never graduated high school — I’ve had quite a variety of co-workers during the ten years I’ve been in tech. Of course, in every software company you’ll find your traditional computer science and engineering graduates as well. However, there’s a significant and growing population of developers who took a different path to learn to code and are building a profession out of it.
I hold a degree in linguistics, which in most universities is not a computational program, but an anthropological one. Required coursework includes topics such as historical and cultural language studies. If you need to analyze tribal dialects or piece together an ancient scroll, a linguist may be optimal for the task.
I found my technical home in natural language processing (NLP), a field where computational approaches to machine understanding of text have plateaued in their usefulness. Experts in language structure can be valuable assets to companies that want to rise above the scalability issues of clunky rule-based systems and the limited utility of mass heuristics.
Let’s take this tweet for example.
Can you imagine if you had to write a set of rules to understand everything said on Weird Twitter? The technical writers and grammarians out there are probably cringing at this tweet, thinking, “She didn’t capitalize properly! She didn’t use punctuation!” And she isn’t even telling the truth, for that matter.
But it’s a linguist’s job to describe how people do communicate, not how they should communicate, and that’s where I step in. How do you correlate enormous, disparate data sets, especially when people try to communicate using natural language? My teammates and I are solving this problem using special structures to maintain all the possible meanings of a chunk of text.
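To make the idea of keeping every possible meaning concrete, here is a minimal Python sketch of one way to hold several candidate readings of a chunk of text side by side instead of committing to a single one up front. It is an illustration only, not the structure used at CognitiveScale: the class names (`Reading`, `AmbiguousSpan`), the `interpretations` function, the example phrase, and the weights are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List, Tuple


@dataclass(frozen=True)
class Reading:
    """One candidate meaning of a span (hypothetical structure)."""
    sense: str
    weight: float  # made-up plausibility score between 0 and 1


@dataclass
class AmbiguousSpan:
    """A chunk of text together with every reading we still consider possible."""
    text: str
    readings: List[Reading] = field(default_factory=list)


def interpretations(spans: List[AmbiguousSpan]) -> List[Tuple[float, List[str]]]:
    """Enumerate every combination of readings, most plausible first."""
    combos = []
    for combo in product(*(span.readings for span in spans)):
        score = 1.0
        for reading in combo:
            score *= reading.weight
        combos.append((score, [reading.sense for reading in combo]))
    return sorted(combos, key=lambda item: item[0], reverse=True)


# Invented informal phrase: "im dead" (someone laughing, or a literal report).
spans = [
    AmbiguousSpan("im", [Reading("I am", 1.0)]),
    AmbiguousSpan("dead", [Reading("amused", 0.9), Reading("deceased", 0.1)]),
]

ranked = interpretations(spans)
for score, senses in ranked:
    print(f"{score:.2f}", " ".join(senses))
```

The point of the sketch is the design choice rather than the scoring: no reading is thrown away early, so later context (a doctor's note versus a joke on Twitter) can still promote a meaning that initially looked unlikely.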
Companies are increasingly realizing the potential locked away in tweets, doctor’s notes, reviews, and a multitude of other data that is not easily understood by traditional models. Harnessing the value of this unstructured data is a perfect place for a linguist in tech.
Aside from fostering a totally awesome and unique work environment, the loosening of educational requirements in programming is of special interest to minority groups, who for many reasons have historically faced barriers to tech jobs. Women in particular stand to benefit from the increasingly inclusive nature of programming. Last year, I was introduced to Holly Gibson, the director of Austin’s chapter of Women Who Code (WWC). The organization’s mission is to empower women and girls in tech through sponsorship, opportunity advancement, mentorship, and ongoing education.
In my first call with Holly, I confessed that I didn’t really know how to be involved with the organization. I felt out of my depth trying to talk to a group of women about how to succeed in tech. Though I have my dream job and feel both valued and valuable as a woman in tech, I’ve gone through what a lot of people have experienced in their careers: fight or flight when somebody reviews my code, impostor syndrome, frustration upon failure, and an unwillingness to acknowledge my own successes. Holly explained that these stories are what motivate many women to continue in a field that has only recently become accepting of them.
Last month, we hosted a WWC event at CognitiveScale to discuss the challenges and goals of understanding text in the era of cognitive computing. My teammates and I formed a panel to describe our unusual paths to programming, and how we’ve applied our backgrounds and experiences in unique ways to the problems we solve in cognitive computing, as well as the broader issues in machine understanding of structured and unstructured data.
Developer teams composed of individuals with different backgrounds are a burgeoning trend in tech. Inclusivity not only furthers the quest to get computers to speak like humans, instead of humans speaking like computers, but also adds a much-needed pathway for advocating diversity in tech.
#WomenWhoCode #WomenInTech #NLP #Linguistics #AI #CognitiveComputing #MachineLearning #KnowledgeRepresentation #Culture