In Part 1 of this 2-part series, I introduced the task of fine-tuning BERT for named entity recognition, covered the relevant prerequisites and prior knowledge, and walked through the fine-tuning process step by step.
Here, I’ll discuss the interesting practical challenges that came up while building out the project, as well as what I plan to do differently next time I work on a similar project. These will be important to keep in mind if you’re interested in fine-tuning BERT yourself, especially on a deadline.
I read all of the blog posts and papers I linked above, and took copious notes…
Bidirectional Encoder Representations from Transformers (BERT) is an extremely powerful general-purpose model that can be leveraged for nearly every text-based machine learning task. Rather than training a model from scratch, the new paradigm in natural language processing (NLP) is to select an off-the-shelf model that has been pretrained on the task of “language modeling” (predicting which words belong in a sentence), then “fine-tune” that model with data from your specific task. …
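To make the paradigm concrete, here’s a minimal sketch of a single fine-tuning step using Hugging Face’s transformers library (one common way to do this; the post doesn’t prescribe a specific stack). The checkpoint name, label count, and learning rate are illustrative assumptions, not the exact setup from the series.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Load a pretrained BERT checkpoint with a fresh token-classification head.
# "bert-base-cased" and num_labels=9 (e.g. CoNLL-2003 BIO tags) are
# illustrative choices, not necessarily what the post used.
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-cased", num_labels=9
)

# Tokenize one toy sentence; a real run would iterate over a labeled dataset.
inputs = tokenizer("Alice lives in Paris", return_tensors="pt")
labels = torch.zeros_like(inputs["input_ids"])  # dummy all-"O" labels

# One gradient step: the loss from the new task head backpropagates into the
# pretrained weights, which is what "fine-tuning" means in practice.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
outputs = model(**inputs, labels=labels)
outputs.loss.backward()
optimizer.step()
```

In a real project you’d typically wrap this loop in something like transformers’ `Trainer` and align your entity labels to BERT’s subword tokens, but the core idea is exactly this gradient step.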
seaborn to produce digestible insights from dirty data
If you work in data at a D2C startup, there’s a good chance you will be asked to look at survey data at least once. And since SurveyMonkey is one of the most popular survey platforms out there, there’s a good chance it’ll be SurveyMonkey data.
SurveyMonkey’s exports aren’t necessarily ready for analysis right out of the box, but they’re pretty close. Here I’ll demonstrate a few examples of questions you might want to ask of your survey data, and how to extract those answers quickly. …
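As a rough sketch of what that looks like with pandas and seaborn: SurveyMonkey CSV exports (in my experience) put the question text on the first row and the answer choices on the second, so both rows need to be read as a header. The file name and question text below are hypothetical.

```python
import pandas as pd
import seaborn as sns

# Read both header rows of the export into a two-level column index.
df = pd.read_csv("survey_export.csv", header=[0, 1])

# Flatten the two-level header into single, readable column names;
# pandas fills blank second-row cells with "Unnamed: ..." placeholders.
df.columns = [
    q if str(a).startswith("Unnamed") else f"{q}: {a}"
    for q, a in df.columns
]

# Example question: how did responses to one question break down?
question = "How satisfied are you with our product?"  # hypothetical column
print(df[question].value_counts(normalize=True))

# And a quick visual of the same breakdown.
sns.countplot(y=df[question])
```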
Word embeddings are a powerful way to represent the latent information contained within words, as well as within documents (collections of words). Using a dataset of news article titles, which included features on source, sentiment, topic, and popularity (# shares), I set out to see what we could learn about articles’ relationships to one another through their respective embeddings.
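The core move is embedding each title as a vector and comparing vectors pairwise. The post doesn’t specify which embedding model was used; the sketch below stands in spaCy’s medium English model (averaged word vectors) and made-up titles for illustration.

```python
import spacy
from sklearn.metrics.pairwise import cosine_similarity

# Requires: python -m spacy download en_core_web_md
nlp = spacy.load("en_core_web_md")

# Hypothetical article titles, not from the actual dataset.
titles = [
    "Stocks rally as markets rebound",
    "Markets surge after strong earnings",
    "New recipe ideas for weeknight dinners",
]

# doc.vector averages the word vectors of the tokens in each title.
vectors = [nlp(t).vector for t in titles]

# Pairwise cosine similarity: semantically related titles score higher,
# so the two market stories should be closer to each other than to the recipe one.
sims = cosine_similarity(vectors)
print(sims.round(2))
```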
The goals of the project were: