Vladimir Prus · GDPR is a mistake · In my book, GDPR is a mistaken law, a failed experiment. But many, even among software engineers, don’t quite distinguish between its good… · Feb 111
Vladimir Prus · Chaos and Order in Software Development · In this post, I want to share one of the most important ideas I heard in the last few years: that most companies are too well-run. · Oct 1, 2023
Vladimir Prus · Next-Gen Web Authentication · How to use WebAuthn, hardware keys, and passkeys · May 14, 2023
Vladimir Prus · Announcing: Spark Performance Advisor · The first post of 2023 is a bit different: it’s a product announcement. With my colleagues at Joom, we made public a tool that was very… · Feb 28, 2023
Vladimir Prus · Reliable and Fast Spark Tables · Tables are fundamental to Spark jobs, and the documentation makes them look simple. As we shall see, they are full of perils. We’ll go deep… · Aug 21, 2022
Vladimir Prus · Time series trend detection with Bayesian methods · In this post, we’ll create a do-it-yourself procedure to detect trend changes in time series data. We’ll take ideas from the well-known… · Jun 19, 2022
Vladimir Prus · Spark on Kubernetes in 2022 · How we run hundreds of jobs, and how we migrated from EMR · Jan 30, 2022
Vladimir Prus · Spark partitioning: full control · In this post, we’ll learn how to explicitly control partitioning in Spark, deciding exactly where each row should go. It is an important… · Oct 25, 2021
Vladimir Prus · Advanced custom operators in Spark · Creating custom DataFrame transformation and efficient user-defined functions. · Oct 11, 2021