Joining Two Streaming Sources

Analytics increasingly relies on real-time streams. From gaming interactions to digital advertising to IoT devices, a growing number of use cases require analytics on streaming data. However, continuously joining two streaming data sources is tricky. Timing is the major obstacle: matching events from the two streams rarely arrive at the same time.

Upsolver addresses this problem by providing SQL extensions for streaming data. In the following example, we illustrate how to join an ad impressions stream with an ad clicks stream. Since impressions happen before clicks, we first select from the impressions data source and join it to the clicks data source. We are only interested in whether a click happened, and when an impression has multiple clicks, we want to record only the most recent one.

Using the Upsolver SQL extension “WAIT” function, we wait for a specified amount of time after the impression to give any click events adequate time to arrive. We also use the Upsolver “WINDOW” function to limit how much data we process by looking only at a specific time range of data within the stream.
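
The join logic described above can be illustrated outside of Upsolver with a small Python sketch. This is not Upsolver SQL and does not reproduce the actual behavior of WAIT or WINDOW; it only models the idea on in-memory lists. The names `Impression`, `Click`, `JoinedRow`, `join_impressions_with_clicks`, and the `wait_seconds` parameter are hypothetical, and a single time bound stands in for both the waiting period and the time range being examined.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Impression:
    impression_id: str
    ts: int  # event time in seconds (assumed unit)


@dataclass
class Click:
    impression_id: str
    ts: int  # event time in seconds (assumed unit)


@dataclass
class JoinedRow:
    impression_id: str
    impression_ts: int
    clicked: bool
    last_click_ts: Optional[int]


def join_impressions_with_clicks(
    impressions: List[Impression],
    clicks: List[Click],
    wait_seconds: int,
) -> List[JoinedRow]:
    """Emit one row per impression, keeping only the most recent click
    that falls within wait_seconds after the impression."""
    # Group click timestamps by the impression they refer to.
    clicks_by_id = {}
    for click in clicks:
        clicks_by_id.setdefault(click.impression_id, []).append(click.ts)

    rows = []
    for imp in impressions:
        deadline = imp.ts + wait_seconds
        # Only clicks between the impression and the deadline are considered.
        candidates = [
            ts for ts in clicks_by_id.get(imp.impression_id, [])
            if imp.ts <= ts <= deadline
        ]
        last_click = max(candidates) if candidates else None
        rows.append(
            JoinedRow(imp.impression_id, imp.ts, last_click is not None, last_click)
        )
    return rows


if __name__ == "__main__":
    impressions = [Impression("a", 100), Impression("b", 110)]
    clicks = [Click("a", 105), Click("a", 130), Click("a", 500)]
    for row in join_impressions_with_clicks(impressions, clicks, wait_seconds=60):
        print(row)
```

In this sketch every impression produces a row whether or not it was clicked, which mirrors selecting from impressions first and joining clicks onto them. A click that arrives after the deadline (the one at `ts=500` here) is ignored, which is the trade-off of any fixed waiting period.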
