Published in Towards Data Science | Faster Spark Queries with the best of both worlds: Python and Scala | How to use Scala with PySpark for advanced Spark use-cases when Spark SQL and Python UDFs are too slow | Jun 9, 2021
Published in Trainline’s Blog | Meet Kronos: Trainline’s real-time data product platform and team | Making our data run on time | Jan 26, 2021
Published in Towards Data Science | Customer Preferences in the Age of the Platform Business with AI | How to use Deep Learning to discover your customer’s preferences and understand your product inventory when you run a platform business | Aug 10, 2020
Published in Towards Data Science | Every Data Scientist needs some Spark Magic | How to improve your data exploration and advanced analytics with the help of SparkMagic | May 20, 2020
Published in Towards Data Science | PEX — The secret sauce for the perfect PySpark deployment of AWS EMR workloads | How to use PEX to speed up deployment of PySpark applications on ephemeral AWS EMR clusters | Apr 23, 2020
Published in Towards Data Science | Jan Teichmann — dataIQ 100 — Interview | The most influential people in data 2020 | Mar 26, 2020
Published in Towards Data Science | How to embed a Spark ML Model as a Kafka Real-Time Streaming Application for Production Deployment | The 2 Types of Model Deployment for Scoring in Production | Feb 26, 2020
Published in Towards Data Science | Advanced Use-Cases for Recommendation Engines | What to do when quick and simple Collaborative Filtering is no longer good enough. | Dec 9, 2019
Published in Towards Data Science | Complete Data Science Project Template with Mlflow for Non-Dummies. | Data science project best practices for everyone working either locally or in the cloud, from start-up ninja to big enterprise teams. | Nov 18, 2019
Published in Towards Data Science | Bias and Algorithmic Fairness | The modern business leader’s new responsibility in a brave new world ruled by data. | Oct 3, 2019