Crunching Parquet Files with Apache Flink
Nezih Yigitbasi

Hi, I was wondering if you know of a way to do this:

# Let’s say the left side of a join is filtered and has just a few tuples.

# When I do an inner join with the right side, I do not want the entire right-side datastream to be materialized (or collected), partitioned, and only then joined with the left side.

# Is there a way to filter the right side by paging all the filtered ids from the left side into the right side’s source *before* fetching it?

Essentially, a way to do what is discussed in this Presto user group thread: https://groups.google.com/forum/#!topic/presto-users/Ns0q4pvHwfo
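
To make the question concrete, here is a rough sketch of the behaviour I have in mind, written in plain Python rather than against the Flink API. Everything named here (`RIGHT_TABLE`, `fetch_right_rows`, `semi_join_pushdown`, `page_size`) is hypothetical and only stands in for a right-side source that can accept a list of ids as a filter; it is not something Flink or Presto provides under these names.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical stand-in for the large right-side source, keyed by id.
RIGHT_TABLE: Dict[int, str] = {i: f"right-{i}" for i in range(1_000)}


def fetch_right_rows(ids: Iterable[int]) -> Iterator[Tuple[int, str]]:
    """Hypothetical source that accepts a pushed-down id filter,
    so only the requested rows are ever read."""
    for key in ids:
        if key in RIGHT_TABLE:
            yield key, RIGHT_TABLE[key]


def batches(items: List[int], size: int) -> Iterator[List[int]]:
    """Split the id list into pages of at most `size` ids."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def semi_join_pushdown(
    left: Iterable[Tuple[int, str]], page_size: int = 2
) -> List[Tuple[int, str, str]]:
    """Inner join that pages the left-side ids into the right-side source
    instead of materializing the whole right side first."""
    # The left side is assumed to be small after filtering, so index it by id.
    left_by_id: Dict[int, List[str]] = {}
    for key, value in left:
        left_by_id.setdefault(key, []).append(value)

    joined: List[Tuple[int, str, str]] = []
    for page in batches(sorted(left_by_id), page_size):
        # Only rows whose id is in the current page are fetched.
        for key, right_value in fetch_right_rows(page):
            for left_value in left_by_id[key]:
                joined.append((key, left_value, right_value))
    return joined


left_filtered = [(3, "a"), (42, "b"), (5000, "c")]
result = semi_join_pushdown(left_filtered)
```

With the three left-side tuples above, only ids 3, 42 and 5000 are ever requested from the right side, and id 5000 simply produces no match.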

Thanks.

ashwin.jayaprakash

From a quick cheer to a standing ovation, clap to show how much you enjoyed this story.