Crunching Parquet Files with Apache Flink
Nezih Yigitbasi

Hi, I was wondering if you know of a way to do this:

1. Let’s say the left side of a join is filtered and has just a few tuples.

2. When I do an inner join with the right side, I do not want the entire right-side datastream to be materialized or collected, partitioned, and then joined with the left side.

3. Is there a way to filter the right side by paging all the filtered ids from the left side into the right side’s source *before* fetching it?

Essentially, I am looking for a way to do what is described in this Presto user group thread: https://groups.google.com/forum/#!topic/presto-users/Ns0q4pvHwfo
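To make the question concrete, here is a minimal plain-Python sketch of the behaviour I have in mind. It is not Flink code: `RIGHT_TABLE`, `fetch_right`, `pages`, and `filtered_join` are hypothetical names I made up for illustration, the right side is just an in-memory dict, and I am assuming the left-side ids are unique.

```python
from typing import Dict, Iterator, List, Sequence, Set, Tuple

# Hypothetical stand-in for the large right-side source, keyed by id.
RIGHT_TABLE: Dict[int, str] = {i: f"right-{i}" for i in range(1000)}


def fetch_right(ids: Set[int], fetched: List[int]) -> Iterator[Tuple[int, str]]:
    """Hypothetical source that accepts a pushed-down id filter.

    Only rows whose id is in `ids` are read; every id actually read is
    appended to `fetched` so the caller can see what was touched.
    """
    for key in sorted(ids):
        if key in RIGHT_TABLE:
            fetched.append(key)
            yield key, RIGHT_TABLE[key]


def pages(items: Sequence[int], size: int) -> Iterator[Sequence[int]]:
    """Split `items` into consecutive chunks of at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def filtered_join(
    left: List[Tuple[int, str]], page_size: int, fetched: List[int]
) -> List[Tuple[int, str, str]]:
    """Inner join that pages the left-side ids into the right-side source."""
    left_by_id = dict(left)  # assumes left-side ids are unique
    result = []
    for page in pages(sorted(left_by_id), page_size):
        # The right side is filtered by id *before* anything is fetched.
        for key, right_value in fetch_right(set(page), fetched):
            result.append((key, left_by_id[key], right_value))
    return result
```

With a left side of three tuples, only the matching right-side rows are ever read, instead of all 1000.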

Thanks.
