WordMapper Part One: My JavaScript Journey into the Etymological and Geographical Origins of the English Language

Nov 22, 2019

During the last few weeks of my software engineering program at Flatiron School, when it was finally time to pitch my final project, I had an idea in mind but figured there was no way it was gonna work. I wanted to use React, JavaScript, and Rails to create an app where users could look up any word in the English language and see its history, such as its first known use, with the countries it originated in plotted on Google Maps. Not at all impossible, but based on the dictionary API research I’d done, it wasn’t going to be easy. I had so many questions. Like, “How am I going to connect a complicated string consisting of country names and special characters to said countries’ root languages, and then to said countries’ geographical coordinates?” and, “Literally, will anyone else care about this?”

In this blog mini-series, I’m going to talk about the code it took to go from looking up a word to rendering its origin countries on a map. In this first installment, I’m going to talk about how I parsed origin languages from a word’s etymology so I could use those languages to get more info.

I plan for this series to serve not only as an explanation of my process, but also as a chance to apply the more advanced techniques and theories I’ve been studying to refactoring and cleaning up some nasty “make it work” code from the two-week crunch I was in while building this.

After lots of research on data sources, I went with the Merriam-Webster Dictionary API (shoutout to Merriam-Webster). It’s free, easy to use, and has great documentation — would recommend. Here’s what the API returns for the etymology of “coffee”:

"Italian & Turkish; Italian {it}caffè{/it}, from Turkish {it}kahve{/it}, from Arabic {it}qahwa{/it}"

Cool, got some languages in there! Now we need to extract those languages so we can associate them with an origin country or region. Here comes the fun part!

Note: this is a muddy area since languages can be traced back to so many different areas and I was unable to find an intuitive data source that associated languages with countries. I did my best research to find which regions certain languages originated in so I could associate them with modern-day countries, but it’s certainly a work in progress!

Screenshot: a snippet from my extraction function (JavaScript)

So here I’m grabbing that string from my Redux store, splitting it into an array, and then using the ES6 Set to prevent duplicates from entering our new array (recall the multiple instances of “Italian” and “Turkish” in the string above).
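Since the screenshot only shows part of the picture, here is a minimal sketch of the split-and-dedupe step. It is not the exact code from the app: the `uniqueWords` helper is a hypothetical name, the etymology string is hard-coded instead of coming from the Redux store, and stripping `;`, `,`, and `&` is my assumption about how to keep “Turkish;” and “Turkish” from counting as two different words.

```javascript
// Stand-in for the etymology string that would come from the Redux store.
const etymology =
  "Italian & Turkish; Italian {it}caffè{/it}, from Turkish {it}kahve{/it}, from Arabic {it}qahwa{/it}";

// Hypothetical helper: split the string into words and drop duplicates.
function uniqueWords(text) {
  const words = text
    .split(/\s+/) // break the string on whitespace
    .map((word) => word.replace(/[;,&]/g, "")) // assumption: strip these separators
    .filter((word) => word.length > 0); // drop tokens that were only punctuation

  // A Set keeps one copy of each value; spreading turns it back into an array.
  return [...new Set(words)];
}

const currentLanguages = uniqueWords(etymology);
console.log(currentLanguages);
```

The result still contains non-language tokens such as “from” and the `{it}…{/it}` words. That is fine at this stage, because the comparison against the list of known languages filters them out.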

From there, we’re going to use some good old-fashioned for loops. On the 5th line, we iterate through our new currentLanguages array, while simultaneously iterating through our big, gross, disgusting allTheLanguages array. I haven’t included a screenshot of said array, but picture an array containing a list of (as many as I could find) known languages throughout the course of human history!

temp is going to split each current iteration of allTheLanguages (or j) into its own little array so we can properly compare arrays, and then finally we’ll push whatever’s a match into a new array called matchedLanguages. Now we have a brand new array of all the root languages associated with our word. What was that again? It was so long ago…
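To make the nested loops concrete, here is a rough sketch of the matching step. Treat it as an illustration rather than the app’s actual code: the `matchLanguages` function name is hypothetical, `allTheLanguages` is a tiny made-up sample of the real list, and I’m assuming a match means a single-word language name that equals the current word exactly (multi-word names like “Old English” are out of scope here).

```javascript
// Tiny hypothetical sample; the real allTheLanguages array is far longer.
const allTheLanguages = ["Arabic", "Italian", "Old English", "Turkish"];

// Example input: the de-duplicated words from the "coffee" etymology.
const currentLanguages = [
  "Italian", "Turkish", "{it}caffè{/it}", "from",
  "{it}kahve{/it}", "Arabic", "{it}qahwa{/it}",
];

// Hypothetical function: collect every word that is a known language.
function matchLanguages(words, knownLanguages) {
  const matchedLanguages = [];
  for (let i = 0; i < words.length; i++) {
    for (let j = 0; j < knownLanguages.length; j++) {
      // Split the known language into its own little array of words.
      const temp = knownLanguages[j].split(" ");
      // Assumption: only single-word names that match exactly count.
      if (temp.length === 1 && temp[0] === words[i]) {
        matchedLanguages.push(words[i]);
      }
    }
  }
  return matchedLanguages;
}

console.log(matchLanguages(currentLanguages, allTheLanguages));
// → [ 'Italian', 'Turkish', 'Arabic' ]
```

The nested loops check every word against every language. One possible refactor would be to put the language list in a Set and use `has` for the lookup, which removes the inner loop entirely.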

Oh yes, coffee! There’s a method to all this madness. Tune in next week for the second iteration (sorry), where we’ll see what to do with our matchedLanguages array in our quest to find the geographical origins of “coffee”!

Thanks for reading! And feel free to play around with WordMapper on your own at https://wordmapper.surge.sh/!

Written by Sean Padden