Roxane Bois
Grooming Detection Part. 3: A BERT Multiclass Model Experiment
A BERT model can classify chat messages according to whether or not they contain grooming risks. Here we share background on BERT, our methodology, and our results.
Jan 12, 2023

Roxane Bois
Grooming Detection Part. 2: A Rule-base Matching Solution
This blog post presents a way to detect grooming in chat data thanks to a spaCy Matcher filter. We share useful methodology and results.
Jan 11, 2023

Roxane Bois
Grooming Detection Part. 1: How to detect grooming in chat data?
This blog post is a state-of-the-art (SOTA) review of grooming detection in texts. We share open-access datasets, models, features, and information on groomers' behavior.
Jan 10, 2023

Mohamed Bamouh
NLP Attacks, Part 1 — Why you shouldn’t trust your text classification models
This blog post series is about a vast and vital field that combines Artificial Intelligence and Linguistics: NLP Attacks.
Dec 20, 2022

Jade Moillic
spaCy’s lemmatizer: lowercase limitations
Why are uppercases a problem?
Nov 28, 2022

Jade Moillic
spaCy matchers guidelines
In this post, we look at the matchers available in spaCy for creating semantic and/or syntactic filters.
Jul 11, 2022

Jade Moillic
Language Identification for very short texts: a review
The goal of this blog post is to benchmark different Language Identification models on datasets containing short texts.
May 25, 2022

Mohamed Bamouh
Text Summarization, Part 4 — Twitter bot for Automatic Summarization of Paper Abstracts
This chapter shows a concrete application of Automatic Text Summarization for a specific purpose: Scientific Paper Summarization.
Mar 24, 2022

Mohamed Bamouh
Text Summarization, Part 3 — Data Pipeline and Results
We’re going to present the data pipeline used to train our models, as well as showcase the results of the training procedure on our…
Mar 15, 2022

Mohamed Bamouh
Text Summarization, Part 2 — State Of the Art
In this section, we’re going to enumerate the most interesting methods and datasets for automatic text summarization.
Mar 9, 2022