Digital Resource Library

Welcome to our content library - gather insights on big data, machine learning, ETL, streaming data & so much more!

Whitepaper

The Modern Data Lake Architecture, Powered by Upsolver

We outline the infrastructure challenges of working with big data streams, then present Upsolver's solution.

View Details >

Webinar

Frictionless Data Lake ETL for Streaming Data

Discover how Upsolver helped ironSource transform 500K events per second, saving thousands of engineering hours.

View Details >

Blog Post

7 Guidelines for Ingesting Big Data to Data Lakes

7 best practices for big data ingestion - from strategic principles down to the more tactical (and technical) issues.

View Details >

Blog Post

Joining Streams and Big Tables on S3: NoSQL/UpSQL/Spark

One of the biggest challenges when working in a data lake architecture is that you’re dealing with files sitting in a folder.

View Details >

Blog Post

Upsolver Announces SQL-based ETL for Cloud Data Lakes

The new functionality eliminates friction and complexity in big data initiatives, such as machine learning.

View Details >

DATAVERSITY

Data Streaming: 7 Unexpected Paths It’s Taking Today

With 2019 nearly over, do you know where data streaming is headed? Here are 7 paths it's taking.

View Details >

Podcast

How Upsolver Is Building A Data Lake Platform

Data Engineering Podcast - Upsolver CTO Yoni Iny discusses how to build a data lake platform in the cloud. 

View Details >

Blog Post

Joining Impression & Click Streams Easily Using UpSQL

We discuss how to use UpSQL to easily create a dataset for predicting whether a user will click on an ad.

View Details >

Blog Post

6 Tips for Querying Big Data in AWS Athena

6 things you need to keep in mind when building out ETL workflows for effectively consuming data in Athena.

View Details >

Blog Post

A Comparison: Apache Kafka vs Amazon Kinesis

Apache Kafka and Amazon Kinesis are two popular message queue systems. Let's compare.

View Details >

Blog Post

Alooma is Ending Support for AWS. Here's What's Next

It seems that Alooma is no longer catering to customers on AWS... so what happens next? 

View Details >

Blog Post

Athena or Redshift? Answer these 4 Questions to Decide

Here are 4 basic questions to ask when deciding whether to use Athena or Redshift for your streaming data.

View Details >

Blog Post

4 Key Components of a Streaming Data Architecture

Streaming data is becoming a core component of enterprise data architecture for a variety of reasons. 

View Details >

Blog Post

Data Lake ETL for IoT Data: From Streams to Analytics

For most enterprises, IoT projects have yet to move past the proof-of-concept stage or show a clear return on investment.

View Details >

Blog Post

Orchestrating Streaming ETL for Machine Learning

Real-time machine learning brings many challenges, with many projects getting stuck and failing to come to fruition.

View Details >

insideBIGDATA

Question: Do You Actually Need a Data Lake?

Here are 5 indicators to help you decide whether to join the data lake bandwagon or stick with traditional data warehousing.

View Details >

Webinar

Online Inference with User Personalization at Scale

Applying machine learning models that rely on both historical user data and real-time data is often challenging. 

View Details >

Whitepaper

AWS Athena Performance: Best Practices & Tips

In our whitepaper, we dive into the best practices and tips for maximizing value from Amazon Athena. 

View Details >

Whitepaper

5 Signs You've Outgrown Your Data Warehouse

As your data grows, it may be time to reevaluate where you store it. Here are the signs to look for.

View Details >

Webinar

ETL for Amazon Athena: 6 Things to Know

Listen to this webinar recording to learn the 6 core tenets of preparing data for Amazon Athena.

View Details >

Blog Post

Partitioning Data on S3 to Improve Athena Performance

In an AWS S3 data lake architecture, partitioning plays a crucial role when querying data in Amazon Athena.

View Details >

Blog Post

A Data Lake Approach to Event Stream Analytics

More recently, there's been increased demand for event log data analysis from data analytics teams & business units.

View Details >

Blog Post

IoT Analytics: Challenges & Innovations

IoT analytics requires a lot of tools, from data lakes to stream processing frameworks and analytics tools.

View Details >

Blog Post

Real-time Machine Learning: Hype vs Reality

Getting machine learning projects off the ground is often easier said than done. Here are some tips for your next project.

View Details >

Blog Post

Batch & Stream Processing: A Cheat Sheet

One of the key questions when planning your data architecture is whether to use batch or stream processing.

View Details >

Blog Post

ETL Pipelines for Kafka Data: Choosing the Right Approach

Learn the basics of ETL for Kafka streams, and get an overview of three approaches to building a successful pipeline.

View Details >

Blog Post

Big Data Infrastructure: When to Build or to Buy

When you need a new big data infrastructure, when should you build? When should you buy? 

View Details >

Blog Post

Cloud Data Lake vs On-Premises Data Lake

Is it time to move your data lake to the cloud? Here's what you should, and shouldn't, consider.

View Details >

Blog Post

Kafka vs. RabbitMQ: Architecture & Use Cases

Apache Kafka and RabbitMQ are two open-source, commercially supported pub/sub systems.

View Details >

Blog Post

7 Stream Processing Frameworks Compared

Stream processing is a critical part of the big data stack in data-intensive organizations.

View Details >