
Upstream

The Big Data Blog

Roy Hegdish

Roy Hegdish is a Product Manager at Upsolver.

Recent Posts

Solving the Upserts Challenge in Data Lakes

Apr 1, 2020 5:55:49 PM / by Roy Hegdish posted in Data Lake, Data Architecture, ETL, SQL, AWS S3, Amazon S3

Updating or deleting data (upserts) is basic functionality in databases, but it is surprisingly difficult to do in data lake storage. In this article, we explain the challenge of data lake upserts and how we built a solution that enables efficient, fast update and delete operations on object storage using Upsolver’s SQL-based data transformation engine.

Read More

4 Guiding Principles for Modern Data Lake Architecture

Mar 18, 2020 5:44:06 PM / by Roy Hegdish posted in Data Lake, Data Architecture, ETL, SQL, AWS S3, Amazon S3, Event sourcing

Data lakes are the cornerstones of modern big data architecture, but getting them right can be tricky. How do you design a data lake that will serve the business, rather than weigh down your IT department with technical debt and constant data pipeline rejiggering? In this document we cover the four essential principles for effectively architecting your data lake.

Read More

How (and Why) to Analyze CloudWatch Logs In AWS Athena

Feb 27, 2020 2:15:06 PM / by Roy Hegdish posted in Data Lake, ETL, SQL, AWS S3, Amazon S3, CloudWatch

Amazon CloudWatch is a monitoring service for AWS cloud resources and the applications you run on AWS. While CloudWatch lets you view logs and understand some basic metrics, it’s often necessary to perform additional operations on the data, such as aggregation, cleansing, and SQL querying, which CloudWatch does not support out of the box.

Read More

Protecting PII & Sensitive Data on S3 with Tokenization

Feb 24, 2020 3:48:46 PM / by Roy Hegdish posted in Data Lake, Amazon S3, Data security, PII

 

Read More

ETL Your Kinesis Data to AWS Athena in Minutes with UpSQL

Nov 20, 2019 4:00:26 PM / by Roy Hegdish posted in Big Data, Data Lake, Amazon Athena, AWS S3, UpSQL

 

Read More