Daulet Nurmanbetov in Towards Data Science · BERT Model Embeddings aren’t as good as you think · Toward multilingual sentence embeddings · 6 min read · May 31, 2020
Daulet Nurmanbetov in Towards Data Science · SQL-like window functions in Pandas · A Single Place for all Pandas Window Functions · 3 min read · May 14, 2020
Daulet Nurmanbetov in Towards Data Science · Cutting edge semantic search and sentence similarity · Semantic search is a hard problem worth solving in NLP. · 9 min read · May 4, 2020
Daulet Nurmanbetov in Towards Data Science · Summarization has gotten commoditized thanks to BERT · State of the art Summarization available for anyone · 6 min read · Mar 12, 2020
Daulet Nurmanbetov in Towards Data Science · Crowd-Sourced Data Labeling · How to increase the robustness of crowd-labelers · 5 min read · Mar 12, 2020
Daulet Nurmanbetov in Towards Data Science · Bootstrapping cutting-edge NLP models · How to get up and running with XLNet and Pytorch in 5 mins · 4 min read · Feb 20, 2020
Daulet Nurmanbetov in The Startup · Weak Supervision, Future of Data Labeling · Overview of data labeling for AI, new paradigms, and size of the growing data labeling market. · 4 min read · Feb 9, 2020
Daulet Nurmanbetov in Towards Data Science · Extracting Data from Financial PDFs · How to quickly extract text and data from Municipal Bond CAFR Reports · 5 min read · Nov 23, 2019
Daulet Nurmanbetov in Towards Data Science · Guide on AWS Textract set-up · On how to accurately process PDF files with OCR-as-a-service · 3 min read · Nov 2, 2019
Daulet Nurmanbetov in Towards Data Science · Multilingual Sentence Models in NLP · Overview of two major multilingual sentence embedding models · 4 min read · Oct 22, 2019