Bogdan Cojocar: "How to read data from s3 using PySpark and IAM roles". In this tutorial we will go over the steps to read data from S3 using an IAM role in AWS. (Nov 7, 2022)
Bogdan Cojocar: "PySpark integration with the native python package of XGBoost". In this tutorial we will highlight how to use the latest XGBoost library version 1.7.0 that works natively with PySpark. (Oct 21, 2022)
Bogdan Cojocar: "How to read data from AWS S3 and Athena in pandas with column validation". This is a step by step tutorial on reading data from AWS S3 and Athena into a pandas DataFrame and doing column validation to assess the… (Oct 5, 2022)
Bogdan Cojocar: "PySpark ML and XGBoost setup using a docker image". In this tutorial we will build and test a docker image where we will be able to run a jupyter notebook with xgboost fully integrated. (Oct 3, 2022)
Bogdan Cojocar: "Predicting similar political donors for UK parties using graph data". In this tutorial we will train a ML graph algorithm that will find similar likely political donors based on their UK companies donations to… (Sep 16, 2022)
Bogdan Cojocar in Towards Data Science: "Building a Health Entity labelling service using Azure Kubernetes Service, Seldon Core and Azure…". In this tutorial we will build an inference service entirely in Kubernetes in the Azure ecosystem. (Jun 16, 2022)
Bogdan Cojocar in Towards Data Science: "Building a Serverless Azure ML Service Using Cognitive and CDKTF". In this tutorial we will go over using cloud services such as Azure Functions and Cognitive to build a sentiment analysis service. (May 26, 2022)
Bogdan Cojocar in Towards Data Science: "Building a Credit Card Fraud Detection Online Training Pipeline with River ML and Apache Flink". In this tutorial, we will go over writing real time python Apache Flink applications to train an online model. (Apr 30, 2022)
Bogdan Cojocar: "How to read parquet data from S3 using the S3A protocol and temporary credentials in PySpark". When we access AWS, sometimes, for security reasons, we might need to use temporary credentials, using AWS STS instead of the same AWS… (Jul 21, 2020)
Bogdan Cojocar in Towards Data Science: "How to run a PySpark job in Kubernetes (AWS EKS)". A complete tutorial on deploying an EKS cluster with Terraform and running a PySpark job using the Spark Operator. (Jul 16, 2020)