The three years I spent building a language learning platform on my own

What started out as a concept from my parents’ couch quickly became a pursuit to build a product that could rival the biggest in the business. Direct Dialect was conceived out of pure curiosity.

One day in January of 2012, while sitting in my parents’ Florida room (a small room in the back of southern houses), I saw a commercial for Rosetta Stone. Intrigued by the notion of learning a new language, I went to their website, where the $400+ price tag quickly blew my dreams out of the water. So I did what most 19-year-olds do when they want software that they can’t afford: I got a copy of the torrent, and boy, was I disappointed.

The program itself seemed to be the same Rosetta Stone that I grew up seeing on TV: the same “immersive” learning method, the same images and the same sounds. But it felt as though it wasn’t really geared towards fully acquiring a language; you’d be given words and phrases that wouldn’t really fit what you’d say on a daily basis, in any language. So I had the notion that “all you really need to know to learn a language is the essential cognates, words, verbs, phrases and sentences, and then you’re set.”

I was first introduced to the concept of cognates in my high school Spanish class, where my teacher explained how some words in Spanish were the exact same words in English. Later in college, my Italian and Arabic teachers both brought up the same idea of cognates: words that are the same or extremely similar in one language as they are in another.

Then my mind went back to my own language acquisition experience as a kid in Haiti on summer vacation. I was born in the United States and both of my brothers were born in Haiti, so they knew the language (Creole) fluently. When I was 4, my family took us on a trip to their homeland for the first time. I didn’t speak the language, but that all changed once I came across the small store that my grandmother owned.

Once I arrived, I was hungry for some of the snacks that were on display, but since I couldn’t talk to her because of our language barrier, I was stuck asking my parents or my brothers to ask her for me. For an American kid who hadn’t had any familiar sweets, the small shop was a dream come true. Cheetos and cookies flowed from shelves to ceilings, next to large glass bottles of Coca-Cola, not cans, the way that they appeared in all of the TV ads in America but were never for sale (at that time). I wanted in on the ground floor of the action, without having to wait for someone to translate. Here is what I did to learn what to say to my grandma to get what I wanted: I would wait for one of my brothers to ask for something or do something that I also wanted to do. In this case, it was asking for a soda.

I would then remember the phrase and swap out words in it to ask for what I wanted. So once I knew that “Can I please have a soda” in English was “Sil vou ple sim ka pwa yo cola” in Creole, I knew that I could finish the same “Can I get a …” with whatever I wanted. This was all before I even learned how to read and write. Two months and thirty pounds later, I had acquired the Creole language at the age of four.

Playing games with my cousins helped to further enrich my language learning experience by allowing me to quickly recall and respond to actions as they were happening. For example, when someone kicked a ball, I’d pay attention to what they said and connect it to whatever words sounded the same in English and to words that I had already acquired in Creole.

Using my real-world knowledge of language acquisition, along with countless years of taking foreign language classes, I knew that in order to speak a language both fluently and concisely (as I stated before), all one needs to do is understand those essential cognates, words, verbs, phrases and sentences in order to get across what you need to say. Now it came time to structure a unique program that could allow a user not only to speak words and phrases, but to acquire the entire depth of speaking a new tongue altogether. Had I known it’d take me three years to complete, and all on my own no less, I might not have set out to build Direct Dialect, but I still went forward with the project.

— — — — — — — — — — Part 2 — — — — — — — — — —

The only programming knowledge I had was some Java and HTML from middle and high school. Only thing is, I cheated most of the time, not to mention all of the classes I skipped in the hallways, so I didn’t really know how that would translate into making Direct Dialect. Armed with a wireframe, my HP Mini, and a lot of ambition and naivety, I went out to find an investor for a technology company in Miami, Florida.

(Version one Direct Dialect wireframe)

Although the climate is shifting in favor of tech companies in the area, Miami still has a lot of work to do in order to be in the conversation about top tech communities. It was even worse back in the summer of 2012. I went to a local meetup called the Micro Venture Club in Miami Beach. “It’s a great idea, but I don’t think you can get it done on your own” was the most frequent comment at the event. When I would ask “would you be interested in potentially investing in the idea?”, I was always met with resistance.

Ambition aside, I was a 19-year-old African American asking for $25K from people I had just been introduced to. But what really got to me was when those same individuals would change their tone when another startup idea was presented by a pretty Hispanic girl and a business-professional-looking white guy, with the same amount of wireframes as I had, but better laptops. That actually started my motor to get this done, no matter what.

Don’t get me wrong, some of the people there that I pitched my company to were also black and also thought I wouldn’t be able to build out the product, so I knew it was less about race and more about first appearances. I was a one-man band with a wireframe on a laptop that couldn’t even run a full version of Microsoft Word, and I had no money to assemble a language learning team, programmers, analytics or data science. It became apparent later that everyone I got in contact with that night who showed interest in investing didn’t have the means to do so when I followed up. So I decided that I had to take on all of those jobs myself in order to breathe life into my concept, almost the technological living embodiment of Foreigner’s “Juke Box Hero”.

PHP tutorials on W3Schools served as my teacher, along with “googling” for solutions whenever I was stuck, until I found out about the programming deity that is Stack Overflow. I would learn from my programming mistakes (like using the deprecated mysql_connect they taught on W3Schools at that time, instead of PDO), improve on scripts, and work for other startup companies and app ideas, sometimes for free, sometimes to generate revenue, until finally I felt comfortable enough to code my own work.

I also entered a few hackathons. Some were very pleasant experiences, others were nightmare horror shows. One of my better experiences was UHack, a local, annual hackathon held by the University of Miami. Although I didn’t win that hackathon, I spent the time given to us building out some of the first code for Direct Dialect and getting a solid idea of how to design the platform.

(Direct Dialect Working Demo Built at UHack Hackathon)

In October of 2013, I entered a hackathon hosted by Lincoln Labs at Venture Hive, a local startup accelerator/incubator in Miami. I asked one of the Lincoln Labs directors if it was OK for me to enter my concept, Direct Dialect, which I’d already been working on for more than a year at that point. He OK’d my presenting the project, and I gave my best pitch to the judges, so good that they voted me into first place.

(Lincoln Labs hackathon photo. Me in the back row, far left, with glasses.)

But then someone from one of the losing teams complained to the sponsors providing the prize money that I’d been working on my startup for a while. This was something I had not only disclosed to the hackathon before the competition, but had made no effort to hide: the UHack hackathon made all entries known online when we submitted them, so Direct Dialect was logged from a previous hackathon and came up on all Google searches. I was disqualified and stripped of my first place position. That group would end up placing in a winning spot as a result of my disqualification, which is why they informed the sponsors of the hackathon after I won.

I had deposited the $3,000 check into my Bank of America account the following day and took out the $300 the bank had allotted, only to see on Monday that my bank account had a negative balance of $300, since the check had been cancelled the night of the hackathon without my knowledge. This was the only time in this whole three-plus-year experience of building Direct Dialect in which I almost gave up; no one would invest in me or my product, and the burn of that hackathon is still with me to this day. But I used that fire to complete the product in its entirety.

— — — — — — — — — — Part 3 — — — — — — — — — —

(The final build of Direct Dialect Spanish ~ English)

The task was arduous to say the least. Not only did I have to build a fully functioning and scalable language learning platform, but I also had to build a mobile app in order to capture the largest amount of traffic that Direct Dialect will see.

The concept of nodes has intrigued me since I learned the word playing Metal Gear Solid 2 on the PS2. In the game, it was a fancy word for a computer, but in real life it represents the grouping of a smaller piece or pieces of phenomena into a larger array that represents one concept (I learned that from Webster’s dictionary after pausing the game). I took this approach to the content structure of Direct Dialect, which I call Language Nodes.

The first node is “Cognate Connection”. This node represents cognates: words that are the same as or similar to words in the language that you already speak. I compiled the lists of cognates for each language program that I planned on launching.

Second was “Words”. I compiled a list of the most common words that are used in the English language, and for comparative purposes, I also added the antonyms of those words (good / bad, right / wrong).

Third was “Past Present Future”. This would contain all of the verbs that are most commonly used by regular people every day. I compiled those verbs, along with each language’s specific conjugation rules. Once I completed one tense, I would move on to the next until I had finished compiling all of the tenses.

Fourth, “Phrases”. I was conflicted on whether to use idiomatic expressions or to rely on the logic of what is common in speech and conversation, as I did with words. I settled on a fixed set of phrases instead of idiomatic expressions by imagining myself being dropped in a foreign country. I then thought of what would be necessary for me to convey as many thoughts with as few phrases as possible. It boiled down to greetings, common questions, common responses and common commands.

Fifth, and finally, “Sentences”. These sentences were constructed using the program’s prepositions (in, on, etc.) and verbs as a frame for the sentence. For example, I would take the preposition “in” and use it in a sentence such as “The dog is in the pool”. Once the user has a sound comprehension of the Cognate Connection, Words, Past Present Future, and Phrases nodes, they have a base from which they can manipulate the sentence for a similar situation. So if the user learns the word “fridge” as well as the word “bottle” from the Words node, the user is then able to change the above sentence from “The [dog] is in the [pool]” to “The [bottle] is in the [fridge]” by switching out the nouns. With this node-based method of learning, users are able to manipulate known information for their specific purposes.
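As a rough illustration of the noun swap described above, here is a minimal Python sketch. It is not the actual Direct Dialect code (the platform itself is HTML5); the `FRAME` string, the slot names `subject` and `place`, and the `fill_frame` helper are all hypothetical names made up for this example.

```python
# Hypothetical sentence frame: the preposition "in" is fixed,
# and the two nouns are slots the learner can swap out.
FRAME = "The {subject} is in the {place}"


def fill_frame(frame: str, **nouns: str) -> str:
    """Fill the noun slots of a sentence frame with words the learner knows."""
    return frame.format(**nouns)


# The sentence as taught in the Sentences node.
original = fill_frame(FRAME, subject="dog", place="pool")

# The same frame reused with nouns learned in the Words node.
swapped = fill_frame(FRAME, subject="bottle", place="fridge")

print(original)
print(swapped)
```

The frame stays the same and only the nouns change, which is the same trick as reusing “Can I get a …” with a different snack.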

For the translations, my budget was tight, so I had to use Fiverr to get them done. I looked for anyone who had “professional” credentials (e.g. university professors). The translations looked good enough for production, with a few logical hiccups that I’ll address once they’re made known (2000+ words in a program is a lot to maintain), but fair warning on their accuracy and validity, since they were $5 translations from strangers.

(Direct Dialect workspace with Redvining and Rootnunciation)

Since the content is represented in digital form, I knew it was important to focus the user’s attention on the new content being absorbed. Usually when we read, we tend to look at the content once, and as soon as it’s understood, it gets ignored, which is why I created Redvining.

Redvining uses the color red to focus your attention on the targeted content. It follows an oscillating pattern: it starts with normal color text, turns the first and last letters of a word, phrase or sentence red, then makes all of the text red, and finally goes back to normal color text. This forces the user’s attention onto the content at hand, increasing the amount of time spent looking at the word, phrase or sentence. (Adjustable Redvining speed and color blind support are on the way.)
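To make the four-step cycle concrete, here is a small Python sketch of one pass of the pattern. It is only an assumption about how the phases could be modeled, not the real Redvining implementation: the function names `redvining_phases` and `render` are invented, and square brackets stand in for red text so the result can be printed in a terminal.

```python
from typing import List


def redvining_phases(text: str) -> List[List[bool]]:
    """Return one red/not-red mask per phase of a single Redvining cycle.

    Phases: normal text, first and last letters red, all red, normal text.
    """
    n = len(text)
    normal = [False] * n
    edges = [i in (0, n - 1) for i in range(n)]
    full = [True] * n
    return [normal, edges, full, normal]


def render(text: str, mask: List[bool]) -> str:
    """Mark red characters with square brackets (a stand-in for the color red)."""
    return "".join(f"[{ch}]" if red else ch for ch, red in zip(text, mask))


# One full cycle for a short word.
frames = [render("sol", mask) for mask in redvining_phases("sol")]
for frame in frames:
    print(frame)
```

In a browser, each mask would presumably be applied on a timer, which is also where an adjustable speed setting would plug in.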

Rootnunciation is the perfect pronunciation of text in the target language (the language you want to speak), written in your root language (the language you know). Usually, whenever a more phonetic rendering is required for deciphering new text, the International Phonetic Alphabet (IPA) is used. The IPA is hard to read and decipher if you’re not familiar with its character set. For example, the above Spanish word “Curioso” would be written “kuˈɾjoso” in IPA. An English speaker with no working knowledge of Spanish would pronounce the word “Cure-E-Oh-Soh”, reading the Spanish consonants and vowels as if they were English ones. The Rootnunciation pronunciation, “Cooh-rioh-soh”, gives the best pronunciation of the Spanish word using English characters. This serves as the actual “Dialect” portion of Direct Dialect: with new Rootnunciation pronunciations by region, users can learn pronunciations of words, phrases and sentences geared to the specific geographical region that they’re visiting (for example, Spanish from Spain is similar to Latin American Spanish, but has its own set of rules and pronunciation, and each Latin American country has its own rules and pronunciations as well).

Live teaching still plays an important role; human interaction is crucial in learning a new language. One of the main reasons you pick up a new tongue is to talk with others, so it only makes sense that you’re able to do so. I see Direct Dialect integrating perfectly with the normal flow of a classroom, where students are able to go through the program instead of drilling words from a book, and see what they’ve retained from the lesson with the retention qualifier.

(Rate of retention qualifier)

The retention qualifier works as follows: the user matches the image and word with what they’ve learned from the Direct Dialect lesson (kind of like a multiple choice test), picking from the correct answer and three distracter answers that are chosen at random. The goal is a rate of retention of 100%, which means you remembered all of the content (or you’re really good at guessing). If you fail to score 100%, no worries; either review the lesson again or retake the rate of retention qualifier until you get a score of 100% for that lesson.
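The mechanics above can be sketched in a few lines of Python. This is an illustration under assumptions rather than the production code: `build_choices`, `retention_rate`, the sample Spanish word pool and the seeded random generator are all made up for the example.

```python
import random
from typing import List, Sequence


def build_choices(correct: str, pool: Sequence[str],
                  rng: random.Random, distracters: int = 3) -> List[str]:
    """Return the correct answer mixed in with randomly chosen distracters."""
    candidates = [word for word in pool if word != correct]
    choices = rng.sample(candidates, distracters) + [correct]
    rng.shuffle(choices)
    return choices


def retention_rate(answers: Sequence[str], expected: Sequence[str]) -> float:
    """Percentage of questions answered correctly (0.0 if there are none)."""
    if not expected:
        return 0.0
    right = sum(a == e for a, e in zip(answers, expected))
    return 100.0 * right / len(expected)


# Hypothetical lesson vocabulary used as the distracter pool.
pool = ["perro", "gato", "casa", "agua", "libro"]
choices = build_choices("perro", pool, random.Random(0))
print(choices)

# Three of four answers correct means the lesson should be reviewed or retaken.
rate = retention_rate(["perro", "gato", "casa", "libro"],
                      ["perro", "gato", "casa", "agua"])
print(rate)
```

With four options per question, pure guessing gets a single question right a quarter of the time, which is why the target is 100% across the whole lesson.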

For the sound, I used a custom iteration of the Speech Synthesis API, which uses the same voices as Google Translate. I built an application that would generate the sounds in their target languages and save the MP3s.

I had to compile all of the images by hand, since each image corresponded to a specific word, verb or phrase. I relied on CC0 license images from Pixabay, Morguefile and freeimages.com for pretty much all of the images used in the program (since they were free).

One drawback of V1 of the program is the hyphenation. I couldn’t find a good hyphenation API, so I ended up settling on Hyphenator.js. It’s accurate about 90% of the time, which means it won’t be that big of an issue while using Direct Dialect.

My plan was to launch in May of 2015 (this was the 7th launch date I gave myself) with six different languages, at a price point of $49.95, worlds cheaper than Rosetta Stone. Upon reading an article about Duolingo, its free model and its growth, I knew that it was time to pivot towards a free model, but I didn’t want that to cheapen the value of my product.

I’m actually a longtime fan of Duolingo. It seems like they got the mix right between gamified learning and actually being a fun game for learning a language. I met with Luis von Ahn at The LAB Miami, another local startup hub in the area. We shared similar woes about the industry, and he gave me great feedback on my product, which actually helped me to keep going.

(Luis von Ahn in the green shirt. Me, second from right)

A few months earlier, I had the notion of integrating advertising as a facilitator of content retention. We see ads every day with popular logos, slogans and characters that have been enshrined in our conscious and subconscious minds. What if we could use that as a point of reference in learning new ideas and concepts? For example, if the word / phrase is “I drink”, the actual image being used would be a can of a well-known soda brand, such as Coke (I am not affiliated with nor being paid by Coke for this placement or any other Coke placement in this article).

Coke’s worldwide brand recognition, along with the new content being absorbed, would build a bridge with Coke as the point of reference to the word; whenever the user attempts to recall the word “drink” in Spanish, they will think of Coca-Cola. Using advertising within the educational content is a win-win-win: brands get introduced to new buyers and countries; users are able to draw parallels between words that they need to remember and the brands and logos that they will see when traveling, visiting and shopping in new countries; and Direct Dialect gets an extremely viable business model, one that shows so much potential that I pivoted away completely from a paid model for the program’s main content.

(Example of how integrating a well known brand with new word content can help to build stronger bridges with recall and retention)

When I built Direct Dialect out, I built it for scale; I built it knowing that it would have to be able to take multiple languages and update relatively fast in order to keep its appeal. This is why I went with HTML5, and I’m glad I did. What this means is that, combined with my advertising revenue model, I could implement an API that uses Direct Dialect in place of a banner ad, one that not only teaches a new language, but also displays a product that the user will home in and focus on more than if it were a simple, plain image. It will also be able to work on any platform that has a web browser or a web interface, with absolutely no hassle. I knew that this could be a game changer in the education and advertising space, forever.

Programming helped a lot in the development of the program as well; other than the coding that went into it, the logical convergences between the program’s structure and how the code structure would work made a lot of new concepts come together with ease. The only other time I had ever programmed in my life was in school, and I always copied off of my friends. Having to actually make a real-world product that people will use ups the ante of what is acceptable and what needs to be changed. I completely pivoted my business model away from a fully paid program, towards a model that allows for displaying products in images that are relevant to the word, which means high quality content and advertising that the user will be fully focused on and engaged with. And unlike me pirating Rosetta Stone, it makes no sense to steal something that’s free.

Even though I was able to build out the product and come up with unique revenue models, it took a lot out of me. Spending three and a half years focused on a product has its drawbacks. One of the cons of spending so much time in front of a computer screen is that you begin to forget the concept of time. Years feel like months, months feel like days, and days feel like hours that you use to get work done.

It’s not like I didn’t go to people and ask them to join my cause; I even used my last $25 at one point to make a Craigslist post for a Creative Director to help me out with design and formatting content. Rather, the only individuals interested in my product simply wanted a part of it once it was working while adding little to no value at this point, or they’re waiting on it to start making revenue (need I remind you that this is South Florida, not Silicon Valley). I find myself forgetting my age at times, thinking that I’m either 19 or 20 (I’m 23), from years of overworking and programming until the sun comes up; most birthdays and holidays get spent with your laptop. But I knew deep down in my heart that if I ruined this God-given opportunity to build something while I was young, I might have proverbially blown my one shot and never gotten a chance to do anything this fun, stressful or cool again. So I went in all guns blazing and got it done, no matter how little money I had or what obstacles were in my way. I hope that my story can inspire entrepreneurs who are thinking of a startup idea and don’t have the means to get it done; all you really need is the right mindset, the right amount of Google searches, and the right people around you for motivation to get the job done.

Like what you read? Give Thelson Richardson a round of applause.
