Apache Flink on YARN with Kerberos Authentication
Ana Suzuki · 2 min read · Sep 17, 2019
Setting up Flink on YARN is described as fairly straightforward in the documentation. But what if what you need to do is much more…
Apache Spark Custom Logging
Ana Suzuki · 1 min read · Jun 11, 2019
The example below applies only if your Spark job runs on YARN in client deploy mode.
Scala: List and Iterator Simulation
Ana Suzuki · 1 min read · May 27, 2019
A simple simulation of the difference between List and Iterator. We all know that in the Scala collections lineage, Iterable is very much the…
AVRO vs Parquet — what to use?
Ana Suzuki · 2 min read · Jan 16, 2019
I won’t say one is better than the other, since it depends entirely on where they are going to be used.
Who’s this Spark Listener?
Ana Suzuki · 2 min read · Jun 13, 2018
Jacek Laskowski wrote good documentation on Spark listeners. I made this page because we keep encountering this ERROR:
Relocating Classes using Apache Maven Shade Plugin
Ana Suzuki · 2 min read · Oct 24, 2017
I just had a weird encounter while running a Spark job.
Import Mysql data to HDFS using Sqoop
Ana Suzuki · 6 min read · Oct 4, 2017
The trending topic in big data is now AI. It’s a bit odd that I am posting something that’s not new in the big data world. This…
Streaming Kafka Receiver-Less Approach
Ana Suzuki · 2 min read · Jul 5, 2017
Kindly read https://spark.apache.org/docs/latest/streaming-kafka-0-8-integration.html for more information about Receiver-Less Approach.