What’s New in Upsolver – December 2018 Edition

Eran Levy
Upsolver News
December 18, 2018

We’re closing the year strong with some great new features that can help improve the breadth and versatility of your work with Upsolver. Highlights include:

Output to Qubole
Reference data
Deduplicate events

Not using Upsolver yet? Never too late to start. Get your free trial now!

Output to Qubole

What: Qubole is a popular platform used to query and process large datasets in cloud and on-premises data lakes. This new feature lets you use Upsolver’s ETL and data management capabilities to ingest streaming data into your data lake and prepare it for further analysis in Qubole.

If you’ve already created an output to Amazon Athena in Upsolver, you can send the same stream to Qubole in just a few clicks, without having to duplicate it.

Why: Organizations that use Qubole to analyze very large datasets with SQL can continue to do so while enjoying the benefits of a self-service, streaming-first architecture powered by Upsolver. Qubole can be used alongside other SQL engines such as Amazon Athena and databases such as Redshift and Elastic.

Using Upsolver with Qubole can provide significant performance improvements, because Upsolver applies best practices around compression, file formats, and small-file compaction to the data it writes.

How: Sending data from Upsolver to Qubole is quick and painless. Simply follow the step-by-step process outlined in our documentation.

Reference data

What: Upsolver now enables you to look up an external dataset as a reference source for querying or enriching a stream.

Why: When performing various enrichments and calculations against streaming data, you might need to reference sources that are either completely static or rarely change. For example, an advertising company might want to enrich clickstream data with information such as country codes or user attributes, which do not appear in the original stream. The new feature will enable you to:

Use a static data source as a dimension table when analyzing streaming data
Include data that rarely changes without having to alter source events
Use static sources within your Upsolver queries without having to rely on workarounds

How: You can include reference data from the Enrichments tab by following the instructions in our documentation.
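To make the idea concrete, here is a minimal Python sketch of enriching clickstream events from a static lookup table. It is an illustration of the concept only, not Upsolver's implementation or API: the function `enrich`, the `COUNTRY_NAMES` table, and the field names `country_code` and `country_name` are all hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator, Optional

# Hypothetical static reference data: country code -> country name.
# In practice this would be loaded from an external, rarely changing source.
COUNTRY_NAMES: Dict[str, str] = {
    "US": "United States",
    "DE": "Germany",
}


def enrich(
    events: Iterable[Dict[str, Any]],
    reference: Dict[str, str],
    key: str = "country_code",
    target: str = "country_name",
    default: Optional[str] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield a copy of each event with one extra field looked up in `reference`."""
    for event in events:
        enriched = dict(event)  # copy, so the source event is left unaltered
        enriched[target] = reference.get(event.get(key), default)
        yield enriched


# Sample clickstream events (made up for this example).
clicks = [
    {"user_id": "u1", "country_code": "US"},
    {"user_id": "u2", "country_code": "DE"},
    {"user_id": "u3", "country_code": "ZZ"},  # code missing from the reference data
]

enriched_clicks = list(enrich(clicks, COUNTRY_NAMES))
```

The reference table plays the role of a dimension table: the stream carries only the code, and the descriptive attribute is attached at processing time, so the source events never need to change.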

Deduplicate Events

What: Filter events in or out of a given output, based on a key that appears multiple times in a specific time window – without writing a single line of code.

Why: There are many cases in which the first or last instance of an event is more important than subsequent instances. For example, if an analyst wants to get a clear picture of a user’s interaction with a mobile app, she might want to focus on the first time the user opened the navigation menu rather than every interaction with said menu, and to build a query around that first instance.

How: Event deduplication has never been simpler. Instructions are here.
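For readers who want to see the underlying logic, the following Python sketch keeps only the first event per key within a time window. It is an assumption-laden illustration rather than a description of how Upsolver works internally: the function `first_per_window`, the field names, and the choice to start each window at the last kept event are all hypothetical, and the events are assumed to arrive in timestamp order.

```python
from typing import Any, Dict, Hashable, Iterable, Iterator


def first_per_window(
    events: Iterable[Dict[str, Any]],
    key: str,
    window_seconds: float,
    time_field: str = "timestamp",
) -> Iterator[Dict[str, Any]]:
    """Yield the first event for each key, dropping repeats inside the window.

    Assumes events are ordered by `time_field` (seconds). A window for a key
    starts at the most recently kept event for that key.
    """
    kept_at: Dict[Hashable, float] = {}  # key value -> timestamp of last kept event
    for event in events:
        k = event[key]
        ts = event[time_field]
        previous = kept_at.get(k)
        if previous is None or ts - previous >= window_seconds:
            kept_at[k] = ts
            yield event


# Sample app events (made up): user u1 opens the menu three times.
menu_opens = [
    {"user_id": "u1", "action": "open_menu", "timestamp": 0},
    {"user_id": "u1", "action": "open_menu", "timestamp": 30},
    {"user_id": "u2", "action": "open_menu", "timestamp": 40},
    {"user_id": "u1", "action": "open_menu", "timestamp": 700},
]

# Keep one event per user per 10-minute window.
deduplicated = list(first_per_window(menu_opens, key="user_id", window_seconds=600))
```

In this sample, the second open by `u1` at 30 seconds falls inside the 10-minute window and is dropped, while the open at 700 seconds starts a new window and is kept. Keeping the last instance instead would require buffering events until each window closes.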

Published in: Blog, Upsolver News

Eran Levy

As an SEO expert and content writer at Upsolver, Eran brings a wealth of knowledge from his ten-year career in the data industry. Throughout his professional journey, he has held pivotal positions at Sisense, Adaptavist, and Webz.io. Eran's written work has been showcased on well-respected platforms, including Dzone, Smart Data Collective, and Amazon Web Services' big data blog. Connect with Eran on LinkedIn
