Ivan Vasquez · A “deterministic” procedure to configure an NVIDIA GPU for data science on Ubuntu · This article offers a structured approach to configure an NVIDIA GPU for data science. · Jul 7, 2019
Ivan Vasquez · Don’t let Dr.Who hijack your EMR cluster · Opening port 8088 to the world will get your EMR cluster taken over by an external exploit in about 3 minutes. · Apr 10, 2019
Ivan Vasquez · Real-world Python workloads on Spark: EMR clusters · Second article of a series, discusses alternatives to run fully developed Python applications on EMR Spark clusters. · Feb 27, 2019
Ivan Vasquez · Real-world Python workloads on Spark: Standalone clusters · This article describes what it takes to deploy and efficiently run fully developed Python applications on Apache Spark. · Feb 26, 2019
Ivan Vasquez · Exporting Cassandra time series data to S3 for data analysis with Spark · This article describes a way to periodically move time series data from Cassandra to AWS S3 for analysis using Apache Spark. · Nov 29, 2018
Ivan Vasquez · Setting up a scalable data exploration environment with Spark and Jupyter Lab · Whether you are a data scientist interested in training a model with a large feature data set, or a data engineer creating features out of… · May 2, 2018
Ivan Vasquez · 1200.aero: An IoT platform for general aviation safety · 1200.aero is a free flight monitoring service created specifically for general aviation by a small team of pilots and enthusiasts… · Apr 19, 2018