Data Science Collective
Advice, insights, and ideas from the Medium data science community
Latest stories
I’m the Founding Editor of Data Science Collective — Here’s What’s Coming
Big News: I’m the Founding Editor of Data Science Collective — the New Data Science Publication on Medium.
Paolo Perrone
Feb 11
Data Science Collective: Submission Guidelines
Welcome to Data Science Collective, a community-driven publication dedicated to exploring data science through writing. We’re here to highlight…
Data Science Collective Editors
Feb 9
LOESS
Smoothing data using local regression
João Paulo Figueira
May 24, 2019
A new community home for data science writing on Medium
Join Data Science Collective
Data Science Collective Editors
Feb 10
Latest
Benchmarking Our Path to AGI: Measuring AI Progress in 2025
What is the state of play with AI in early 2025? Are we in an S-curve of diminishing returns, or actually at the early stages of an…
Aki Ranin
Mar 24
Structural Distillation for Cross-Dataset Uplift Modeling with Reinforcement Learning
A Novel Approach to Transfer Partial Teacher Model Knowledge from Control to Treatment Data for Rapid AB Testing and Campaign Optimization
Shenggang Li
Mar 24
What I Learned from 3 Years Working with Chinese Tech Teams
Busting 3 Myths and Confirming 1
Jose Parreño
Mar 24
How Qubits Are Rewriting the Rules of Computation
From Classical Certainty to Quantum Possibility: Exploring the Science, Math, and Magic Behind the Future of Computing
Cristian Leo
Mar 24
CLIP: The Multimodal Powerhouse Transforming Computer Vision
When I first encountered CLIP, its performance impressed me so much that whenever I start a new project, I take CLIP’s performance as a…
Yağmur Çiğdem Aktaş
Mar 24
SmolDocling: A New Era in Document Processing — OCR
A model that outperforms competitors 27 times its size using the DocTags format
Buse Köseoğlu
Mar 24
Space Travel for Language Models: How SuperBPE Revolutionizes Tokenization
A new tokenization approach challenges everything we thought we knew about language processing
MKWriteshere
Mar 24
Is It Possible to “Unlearn” Dangerous Knowledge?
Introducing the WMDP Benchmark
Andreas Maier
Mar 23
Building a RAG System with MMR for Safaricom’s Smart Assistant
In this post, we continue to explore the implementation of a Retrieval-Augmented Generation (RAG) system designed for Safaricom — a Telkom…
Herman Wandabwa
Mar 23
Prompt Version Control: Why It’s Essential and How to Implement It Effectively
The rise of large language models (LLMs) has turned prompt engineering into a critical skill for developing LLM-based apps. But as…
Edgar
Mar 23
AI Engineering (3/3): Dataset Engineering, Inference Optimization, and Architecture and User…
Summary of Chip Huyen’s awesome new book.
Marina Wyss - Gratitude Driven
Mar 22
Building an LLM Agent with N8N and Open-WebUI
A Hands-On Perspective from a Machine Learning Developer
Yu-Cheng Tsai
Mar 22
Integrating LlamaIndex and DeepSeek-R1 for reasoning_content and Function Call Features
Empowering AgentWorkflow with the strong boost from DeepSeek-R1
Peng Qian
Mar 22
What I Wish I Knew Before Becoming a Solutions Architect in Data
The Human Side of Data Solution Consulting
Han HELOIR, Ph.D. ☕️
Mar 22