SQL optimizations
UNION vs UNION ALL
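As a minimal sketch of the queries this section compares, assume two hypothetical tables, `orders_2023` and `orders_2024`, with the same columns. The table, column, and view names are illustrative only, and the materialized view is just one assumed way of keeping a result set around for later querying.

```sql
-- Hypothetical tables with identical column lists.

-- UNION: the combined rows are deduplicated, which costs extra work.
SELECT user_id, item_id FROM orders_2023
UNION
SELECT user_id, item_id FROM orders_2024;

-- UNION ALL: rows are concatenated as-is; duplicates, if any, are kept.
SELECT user_id, item_id FROM orders_2023
UNION ALL
SELECT user_id, item_id FROM orders_2024;

-- If only a few duplicates are expected, one option is to store the
-- UNION ALL result and deduplicate only when reading it.
CREATE MATERIALIZED VIEW all_orders AS
SELECT user_id, item_id FROM orders_2023
UNION ALL
SELECT user_id, item_id FROM orders_2024;

SELECT DISTINCT user_id, item_id FROM all_orders;
```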
UNION removes duplicates, while UNION ALL does not. If you know that your data does not contain duplicates, use UNION ALL to avoid the overhead of duplicate removal. Alternatively, if the duplicate set is small, you can also filter out duplicates when querying the result set.
OverWindow vs GroupTopN
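OverWindow is a streaming operator that maintains the state of the window and computes the row number for each row in the partition. It typically appears in queries that compute ROW_NUMBER() over a partitioned, ordered window. If you only need the top N rows of each partition, use the GroupTopN operator instead.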
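The two query shapes can be sketched as follows. This is an illustration under assumptions, not the original sample from this page: the table `t` and its columns `col1` and `col2` are hypothetical, and the second query uses the common Top-N pattern (filtering on the row number in an outer query), which is assumed here to be the form that gets planned as GroupTopN.

```sql
-- Hypothetical table: t(col1, col2)

-- Row number for every row in each partition. A query of this shape
-- keeps window state for the whole partition (OverWindow).
SELECT col1, col2,
       ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2) AS row_num
FROM t;

-- Only the first 3 rows of each partition are needed, so the row number
-- is filtered in an outer query (assumed Top-N pattern for GroupTopN).
SELECT col1, col2
FROM (
    SELECT col1, col2,
           ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2) AS row_num
    FROM t
) AS ranked
WHERE row_num <= 3;
```

Whether the second form is actually planned as GroupTopN depends on the engine; inspecting the plan with EXPLAIN is one way to confirm it.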
See Converting StreamOverWindow to StreamGroupTopN for more details.