Data Lake Upserts: Updating, deleting, or inserting data into Amazon S3

Keeping an output table consistent with the state of the source data means updating and inserting records, which is challenging to implement on an append-only data lake file system. Upsolver lets you create tables that are both extremely fast and up to date by combining aggregations with the Upsolver “REPLACE ON DUPLICATE” function.

The example above shows how to use this feature to run a query 8x faster and reduce the amount of data accessed in the computation by nearly 80x. Replacing duplicate records instead of appending them dramatically reduces the data to scan, which in turn greatly reduces the compute cost.
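To make the difference between appending and replacing concrete, here is a minimal Python sketch of the idea. It is not Upsolver's API or implementation: the function names, the `(key, value)` record shape, and the convention that a `None` value marks a delete are all assumptions chosen for illustration.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical record shape: (key, value). A value of None is treated
# as a delete marker in this sketch (an assumption, not Upsolver syntax).
Event = Tuple[str, Optional[int]]


def append_only(events: Iterable[Event]) -> List[Event]:
    """Keep every event as its own row, as an append-only store would."""
    return list(events)


def replace_on_duplicate(events: Iterable[Event]) -> Dict[str, int]:
    """Keep one row per key: later events replace or delete earlier ones."""
    table: Dict[str, int] = {}
    for key, value in events:
        if value is None:
            table.pop(key, None)  # delete the row if it exists
        else:
            table[key] = value    # insert a new row or replace the old one
    return table


events: List[Event] = [
    ("user_1", 10),
    ("user_2", 5),
    ("user_1", 12),    # update for user_1
    ("user_3", 1),
    ("user_1", 15),    # another update for user_1
    ("user_2", 7),     # update for user_2
    ("user_3", None),  # delete user_3
]

appended = append_only(events)
upserted = replace_on_duplicate(events)

print(len(appended))  # 7 rows to scan
print(len(upserted))  # 2 rows to scan
print(upserted)       # {'user_1': 15, 'user_2': 7}
```

In the append-only version, a query has to scan all seven rows and resolve the latest state itself. In the replacing version, the table already holds only the current row per key, so the query scans less data.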
