We’re closing the year strong with some great new features that can help improve the breadth and versatility of your work with Upsolver. Highlights include output to Qubole, lookups against external reference data, and event deduplication.
Not using Upsolver yet? Never too late to start. Get your free trial now!
What: Qubole is a popular platform used to query and process large datasets in cloud and on-premises data lakes. This new feature lets you use Upsolver’s strong ETL and data management capabilities to ingest streaming data into your data lake and prepare it for further analysis in Qubole.
If you've already created an output to Amazon Athena in Upsolver, you can send the same stream to Qubole in just a few clicks, without having to duplicate it.
Why: Organizations that use Qubole to analyze very large datasets with SQL can continue to do so while enjoying all the benefits of a self-service, streaming-first architecture powered by Upsolver. Qubole can be used alongside other SQL engines such as Amazon Athena and databases such as Redshift and Elastic.
Using Upsolver with Qubole can provide significant performance improvements by applying best practices around compression, formats and small file compaction.
How: Sending data from Upsolver to Qubole is quick and painless. Simply follow the step-by-step process outlined in our documentation.
What: Upsolver now enables you to look up an external dataset as a reference source for querying or enriching a stream.
Why: When performing various enrichments and calculations against streaming data, you might need to reference sources that are either completely static or rarely change. For example, an advertising company might want to enrich clickstream data with information such as country codes or user attributes, which do not appear in the original stream. The new feature will enable you to:
- Use a static data source as a dimension table when analyzing streaming data
- Include data that rarely changes without having to alter source events
- Use static sources within your Upsolver queries without having to rely on workarounds
How: You can include reference data from the Enrichments tab by following the instructions in the documentation.
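To make the dimension-table idea concrete, here is a minimal Python sketch of enriching clickstream events from a static lookup. It only illustrates the concept and is not Upsolver's implementation (in Upsolver this is configured through the UI). The field names `country_code` and `country_name`, the `COUNTRY_NAMES` table, and the `enrich_clicks` helper are all hypothetical.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical static reference data: country code -> country name.
COUNTRY_NAMES: Dict[str, str] = {"US": "United States", "DE": "Germany"}


def enrich_clicks(events: Iterable[dict], reference: Dict[str, str]) -> Iterator[dict]:
    """Yield a copy of each event with a country_name looked up in the reference data."""
    for event in events:
        enriched = dict(event)  # copy, so the source event is left unaltered
        code = event.get("country_code")
        # Codes missing from the reference data fall back to "unknown".
        enriched["country_name"] = reference.get(code, "unknown")
        yield enriched


clicks = [
    {"user_id": 1, "country_code": "US"},
    {"user_id": 2, "country_code": "FR"},
]
enriched_clicks = list(enrich_clicks(clicks, COUNTRY_NAMES))
```

The source events are never modified, which mirrors the point above: rarely changing data is joined in at query time rather than written into each event.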
What: Filter events in or out of a given output based on a key that appears multiple times in a specific time window, without writing a single line of code.
Why: There are many cases in which the first or last instance of an event is more important than subsequent instances. For example, if an analyst wants to get a clear picture of a user’s interaction with a mobile app, she might want to focus on the first time the user opened the navigation menu rather than every interaction with said menu, and to build a query around that first instance.
How: Event deduplication has never been simpler. Instructions are here.
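For readers who want to see the logic behind keeping only the first instance of a key in a time window, here is a rough Python sketch. It is an illustration under stated assumptions, not how Upsolver implements deduplication: events are assumed to arrive ordered by a numeric `timestamp` field in seconds, and the `first_per_window` helper and the `user_id` key are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Iterator


def first_per_window(events: Iterable[dict], key: str, window_seconds: float) -> Iterator[dict]:
    """Yield an event only if no event with the same key was emitted in the last window_seconds.

    Assumes events arrive ordered by their "timestamp" field (seconds).
    """
    last_emitted: Dict[Hashable, float] = {}
    for event in events:
        k = event[key]
        ts = event["timestamp"]
        previous = last_emitted.get(k)
        # Emit the first occurrence, or a later one once the window has passed.
        if previous is None or ts - previous >= window_seconds:
            last_emitted[k] = ts
            yield event


events = [
    {"user_id": "a", "action": "open_menu", "timestamp": 0},
    {"user_id": "a", "action": "open_menu", "timestamp": 30},
    {"user_id": "b", "action": "open_menu", "timestamp": 40},
    {"user_id": "a", "action": "open_menu", "timestamp": 700},
]
first_opens = list(first_per_window(events, key="user_id", window_seconds=600))
```

With a 10-minute window, the second event from user `a` is dropped as a repeat, while the event at 700 seconds is kept because it falls outside the window opened by the first one.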