Dataminded
Getting Data Done
Latest blogs
Portable by design: Rethinking data platforms in the age of digital sovereignty
Recent geopolitical developments and legal rulings prompted us to investigate how data platforms can be designed for portability across providers.
Niels Claeys
Jul 1
Cloud Independence: Testing a European Cloud Provider Against the Giants
Somewhere in Europe. You might be running a small online business, managing a mid-sized automotive supplier, or leading a global…
Thorsten Foltz
Jun 18
A 5-step approach to improve data platform experience
A guide to turning user feedback into continuous platform improvement.
Frederic Vanderveken
May 13
Why the ‘Private’ API Gateway of AWS Might Not Be as Secure as You Think
Designing secure applications is a challenge for everyone. A big part of this is based on who can access what. In this blog I want to dig…
Crochelet Pierre
Apr 22
The Data Engineer’s guide to optimizing Kubernetes
By default Kubernetes is not optimal for running batch workloads. I tackle spot instances, tweaks in autoscaling and node efficiency…
Niels Claeys
Apr 18
Integrating MegaLinter to Automate Linting Across Multiple Codebases. A Technical Description.
If you’re not familiar with linters, or specifically with MegaLinter, please take a look at my previous article on the topic. In contrast…
Thorsten Foltz
Apr 8
Stop loading bad quality data
Rule number one of having good quality data: Stop loading bad quality data. Really, it’s that simple. I see so many companies make this…
Kris Peeters
Apr 1
What Is Data Product Thinking?
Data product thinking is gaining momentum in the data world right now so we recently organised a live online learning session for Conveyor…
Miruna Suru
Mar 12
Trending blogs from Dataminded
Prompt Engineering for a Better SQL Code Generation With LLMs
Picture yourself as a marketing executive tasked with optimising advertising strategies to target different customer segments effectively…
Raghid Bsat
May 29, 2024
Debugging Running Pods on Kubernetes
Exploring Kubernetes’s debugging feature, kubectl debug, and extending kubectl debug to support volume mounts
Jonathan Merlevede
Oct 25, 2023
The Mystery of Folders on AWS S3
The difference between objects and files, and what the AWS Console really does when you press the “create folder” button.
Jonathan Merlevede
Apr 20, 2022
Use dbt and Duckdb instead of Spark in data pipelines
dbt has become very popular for transformation on top of your data warehouse. We see potential to use dbt with DuckDB on top of a data…
Niels Claeys
Apr 11, 2023
Upserting Data using Spark and Iceberg
Use Spark and Iceberg’s MERGE INTO syntax to efficiently store daily, incremental snapshots of a mutable source table.
Jonathan Merlevede
May 25, 2023
Are your AKS logging costs too high? Here’s how to reduce them
Explore how to reduce the cost of logging on Azure by analyzing your applications and investigating Basic Logs tables in Log Analytics.
Niels Claeys
Mar 10
How to Effectively Structure Data for Self-Service Data Teams
For years, data platforms — particularly data lakes and lakehouses — have relied on the medallion architecture. This tiered system…
Kristof Martens
Oct 6, 2024
You can use a supercomputer to send an email but should you?
Discover the next evolution in data processing with DuckDB and Polars
Niels Claeys
Mar 12, 2024
Run Spark Jobs on Azure Batch using Azure Container Registry and Blob storage
This is another one of those “how to” blogs that can hopefully help people get up-and-running quickly because it took me a while to figure…
Kris Peeters
Nov 9, 2018
Porting a data platform from AWS to Azure
We recently created an Azure version of our data platform and this blogpost elaborates on our learnings/issues to support a new cloud…
Niels Claeys
Mar 31, 2022